Bangalore, Karnataka, India
Information Technology
Cleartax

Overview
ID: 1801 | 2-5 yrs | Bengaluru
We are seeking a talented Data/Software Engineer II with expertise in big data processing and ETL pipeline management, along with a solid software engineering background in building scalable & performant web systems with a clear focus on reusable modules.
Key Responsibilities:
- Design, develop, and maintain ETL pipelines to process large-scale datasets efficiently and reliably.
- Build and optimize Spark-based data pipelines to perform transformations and aggregate data for analytics and machine learning applications.
- Implement AWS Glue jobs to support data ingestion, transformation, and integration across various data sources.
- Leverage Apache Iceberg for efficient data storage, management, and querying, with a focus on performance and scalability.
- Utilize Airflow to orchestrate complex workflows and ensure the timely and efficient execution of data processing jobs.
- Implement Change Data Capture (CDC) processes to capture real-time changes from source systems and integrate them into downstream data systems.
- Build scalable and efficient ETL solutions that maintain high data quality and data governance standards.
- Develop, test, and deploy web services for data access APIs, integrating data pipelines with other applications.
- Translate fuzzy business problems into well-defined technical problems, and independently drive the design, estimation, planning, execution & delivery of the solution.
- Use MySQL, MongoDB, and other database technologies to store and retrieve data as needed for ETL processes and web services.
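As a flavor of the CDC work described above, here is a minimal, simplified sketch of replaying change events against a downstream table. It is illustrative only, not Cleartax's actual pipeline: the event shape and field names ("op", "id", "row") are hypothetical, and an in-memory dict stands in for the real data store.

```python
def apply_cdc(table, events):
    """Replay insert/update/delete CDC events onto a keyed, in-memory table.

    Hypothetical event shape (not from the posting):
    {"op": "insert" | "update" | "delete", "id": key, "row": row_image}
    """
    for event in events:
        op, key = event["op"], event["id"]
        if op in ("insert", "update"):
            table[key] = event["row"]   # upsert the latest row image
        elif op == "delete":
            table.pop(key, None)        # drop the row if present
    return table

# Example: an insert, an update to the same key, then an insert + delete pair.
state = apply_cdc({}, [
    {"op": "insert", "id": 1, "row": {"name": "a", "amount": 10}},
    {"op": "update", "id": 1, "row": {"name": "a", "amount": 25}},
    {"op": "insert", "id": 2, "row": {"name": "b", "amount": 5}},
    {"op": "delete", "id": 2, "row": None},
])
# state now holds only key 1 with its latest row image
```

In a production pipeline the same upsert/delete semantics would typically be applied by a tool such as Debezium feeding Kafka, with the merge performed in Spark or against Iceberg tables.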
Required Skills and Qualifications:
- 2-4 years of experience as a data/software engineer or in a related role, preferably in a fast-paced and data-intensive environment.
- Strong experience with Spark for batch and real-time data processing, including writing and optimizing Spark jobs.
- Strong problem-solving skills and a proactive approach to tackling complex data challenges.
- Knowledge of Apache Kafka or similar streaming technologies for real-time data processing.
- Understanding of distributed systems and cloud-based architectures.
- Strong expertise in coding, data structures, algorithms, low-level class & DB design, high-level system design, and architecting for high scale using distributed systems.
- Excellent communication and collaboration skills to work effectively within cross-functional teams.
- Experience with CDC techniques and tools to capture data changes in real-time.
- Experience with SQL and OLAP data stores.
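To illustrate the SQL/OLAP skill set above, here is a tiny rollup query using Python's built-in SQLite module as a stand-in for an analytics store. The table and column names ("sales", "region", "amount") are invented for the example.

```python
import sqlite3

# In-memory database standing in for an OLAP store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("south", 100.0), ("south", 50.0), ("north", 75.0)],
)

# Aggregate total sales per region -- the basic shape of an OLAP rollup.
totals = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
)
# totals maps each region to its summed amount
```

At scale, the same GROUP BY aggregation would run in Spark SQL or a dedicated OLAP engine rather than SQLite; the query shape is what carries over.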