
Overview
We are seeking a Senior Data Engineer with expertise in Python, SQL, and PySpark to design and optimize scalable data pipelines. You will work closely with data scientists, analysts, and engineering teams to build high-performance data solutions that support business decision-making.
Key Responsibilities
- Develop and maintain ETL/ELT data pipelines using PySpark and SQL (see the batch ETL sketch after this list).
- Optimize big data processing workflows on cloud platforms such as AWS, Azure, or GCP.
- Design, implement, and maintain data lakes and data warehouses (e.g., Snowflake, Redshift, BigQuery).
- Ensure data quality, reliability, and performance through monitoring and validation.
- Work with Kafka or other streaming technologies for real-time data ingestion (see the streaming sketch after this list).
- Implement CI/CD pipelines for automated deployments of data infrastructure.
- Collaborate with cross-functional teams to define and improve data architecture.
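To give candidates a concrete feel for the batch side of the role, here is a minimal PySpark ETL sketch. The bucket paths, the "orders" dataset, and its columns are hypothetical placeholders for illustration, not our actual pipeline.

```python
# Minimal PySpark ETL sketch. All paths, column names, and the
# "orders" dataset are hypothetical placeholders, not production code.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-etl-example").getOrCreate()

# Extract: read raw order events from a (hypothetical) data-lake path.
orders = spark.read.parquet("s3://example-bucket/raw/orders/")

# Transform: deduplicate, drop cancelled orders, and aggregate
# daily revenue per region.
daily_revenue = (
    orders.dropDuplicates(["order_id"])
    .filter(F.col("status") != "cancelled")
    .withColumn("order_date", F.to_date("created_at"))
    .groupBy("order_date", "region")
    .agg(F.sum("amount").alias("revenue"))
)

# Load: write the aggregate back, partitioned by date.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/daily_revenue/"
)
```

Partitioning the output by date is the kind of performance-minded design choice the role calls for: downstream queries filtered to a date range avoid scanning the full history.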
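Real-time ingestion follows a similar shape. Below is a minimal Spark Structured Streaming sketch that reads JSON events from Kafka; it assumes the spark-sql-kafka connector is on the classpath, and the broker address, topic, schema, and paths are again hypothetical.

```python
# Minimal Structured Streaming sketch for Kafka ingestion. Broker
# address, topic name, schema, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-ingest-example").getOrCreate()

event_schema = (
    StructType()
    .add("order_id", StringType())
    .add("region", StringType())
    .add("amount", DoubleType())
)

# Read a stream of JSON events from a (hypothetical) Kafka topic.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "orders")
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Append parsed events to the lake; the checkpoint location lets the
# job resume from where it left off after a restart.
query = (
    events.writeStream.format("parquet")
    .option("path", "s3://example-bucket/raw/orders_stream/")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/orders/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```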
Required Skills & Qualifications
✅ 5+ years of experience in data engineering.
✅ Strong proficiency in Python (data manipulation, automation, and API integration).
✅ Advanced SQL skills for complex queries and performance tuning.
✅ Hands-on experience with PySpark for distributed data processing.
✅ Experience with cloud data solutions (AWS Glue, EMR, Databricks, or similar).
✅ Knowledge of Airflow or other workflow orchestration tools (see the DAG sketch after this list).
✅ Familiarity with containerization (Docker, Kubernetes) is a plus.
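As an illustration of the orchestration experience we look for, here is a minimal Airflow DAG sketch (using the Airflow 2.4+ `schedule` argument); the DAG id and task body are placeholders, not a real workflow.

```python
# Minimal Airflow DAG sketch: one daily task. The DAG id, schedule,
# and callable are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_daily_etl():
    # Placeholder for a real extract/transform/load routine,
    # e.g. submitting the PySpark job sketched earlier.
    print("running daily ETL")


with DAG(
    dag_id="daily_etl_example",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    etl = PythonOperator(task_id="run_daily_etl", python_callable=run_daily_etl)
```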
Preferred Qualifications
- Experience with data governance, security, and compliance best practices.
- Knowledge of machine learning pipelines and MLOps.
- Familiarity with IaC tools like Terraform for managing cloud resources.
Why Join Us?
- Opportunity to work on large-scale, cutting-edge data projects.
- Hybrid/Remote flexibility.
- Collaborative and growth-oriented work environment.
Job Type: Full-time
Pay: ₹508,197.12 - ₹2,857,206.91 per year
Benefits:
- Health insurance
Work Location: In person