Pune, Maharashtra, India
Information Technology
Full-Time
Cummins Inc.
Description
Although the role category specified in the GPP is Remote, this position requires a hybrid working arrangement.
Key Responsibilities
- Design & Development: Design and automate the deployment of distributed systems for ingesting and transforming data from various types of sources (relational, event-based, unstructured).
- Data Quality & Integrity: Design and implement frameworks for continuously monitoring and troubleshooting data quality and integrity issues.
- Data Governance: Implement data governance processes, ensuring effective management of metadata, access, and retention for both internal and external users.
- ETL Pipelines: Design and provide guidance on building reliable, efficient, and scalable data pipelines that integrate data from diverse sources using ETL/ELT tools or custom scripting (a minimal PySpark sketch follows this list).
- Database Optimization: Develop and implement physical data models and optimize database performance using efficient indexing and table relationships.
- Cloud Data Solutions: Create and manage large-scale data storage and processing solutions using cloud-based platforms such as Azure Databricks, Data Lakes, Hadoop, and NoSQL databases (e.g., Cassandra, MongoDB).
- Automation & Productivity: Leverage modern tools and techniques to automate repeatable data preparation and integration tasks, minimizing manual effort and error-prone processes.
- Agile Methodologies: Participate in agile development practices such as DevOps, Scrum, and Kanban to ensure timely delivery of critical analytics initiatives.
- Mentorship & Collaboration: Mentor junior developers, collaborate with cross-functional teams, and contribute to the overall success of the data platform.
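To ground the ETL Pipelines responsibility, here is a minimal PySpark sketch of a batch pipeline that extracts from a relational source over JDBC, applies basic transformations, and loads partitioned files into a data lake. The connection details, table names, and storage paths are illustrative placeholders, not details from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_batch_etl").getOrCreate()

# Extract: read a table from a relational source over JDBC.
# URL, table, and credentials are illustrative placeholders.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://example-host:1433;databaseName=sales")
    .option("dbtable", "dbo.orders")
    .option("user", "etl_user")
    .option("password", "********")
    .load()
)

# Transform: deduplicate, filter bad rows, derive a partition column.
clean = (
    orders.dropDuplicates(["order_id"])
    .filter(F.col("order_total") > 0)
    .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write partitioned files to a data lake path (placeholder).
(
    clean.write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("abfss://lake@exampleacct.dfs.core.windows.net/curated/orders")
)
```

On Azure Databricks the final write would typically target format("delta") rather than Parquet so the table participates in the Lakehouse.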
Qualifications:
Knowledge/Skills
- Proven track record in developing efficient data pipelines and mentoring junior developers.
- Hands-on experience with Spark (Scala/PySpark), SQL, and Spark Streaming (a streaming-to-Delta sketch follows this list).
- Proficient in troubleshooting and optimizing batch/streaming data pipeline issues.
- Expertise in Azure Cloud Services (Azure Databricks, ADLS, EventHub, EventGrid, etc.).
- Strong understanding of data models (SQL/NoSQL), including Delta Lake or Lakehouse.
- Experience with CI/CD tools for automating deployments.
- Knowledge of big data storage strategies, performance optimization, and database indexing.
- Familiarity with Agile software development methodologies.
- Understanding of the machine learning lifecycle and experience integrating ML models into data pipelines.
- Exposure to open-source big data technologies and IoT.
- Familiarity with building analytical solutions in cloud environments.
- Experience with large-file movement and data-extraction tools.
- A degree in Computer Science, Engineering, Information Technology, or a related field, or equivalent relevant experience is required.
- Additional certifications in Azure, Spark, or cloud-based data engineering solutions are a plus.
- System Requirements Engineering: Ability to translate stakeholder needs into verifiable system requirements, ensuring alignment with project goals.
- Collaboration: Strong ability to build partnerships and work effectively within cross-functional teams to achieve shared objectives.
- Effective Communication: Skilled in delivering clear communications to diverse audiences, both technical and non-technical.
- Customer Focus: Dedicated to building strong customer relationships and delivering solutions that meet their needs.
- Problem Solving: Proficient in using systematic analysis and industry-standard methodologies to solve complex technical challenges.
- Data Quality: Knowledgeable in identifying, understanding, and correcting data quality issues across operational processes.
- Solution Documentation & Testing: Thorough in documenting solutions and validating them through structured testing practices to ensure they meet business requirements.
- Decision Making: Able to make timely, data-driven decisions that maintain project momentum.
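As a sketch of the Spark Streaming and Delta Lake items above: the snippet below reads events from an Azure Event Hub via its Kafka-compatible endpoint and appends them to a Delta table. The namespace, topic, schema, and paths are assumptions for illustration, and the SASL authentication options Event Hubs requires are omitted for brevity.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("telemetry_stream").getOrCreate()

# Assumed payload schema for the JSON events (placeholder fields).
schema = StructType([
    StructField("device_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_ts", StringType()),
])

# Read from Event Hubs through its Kafka-compatible endpoint.
# SASL/SSL auth options are required in practice but omitted here.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "example-ns.servicebus.windows.net:9093")
    .option("subscribe", "telemetry")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers raw bytes; parse the JSON payload into typed columns.
events = (
    raw.select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
    .withColumn("event_ts", F.to_timestamp("event_ts"))
)

# Append to a Delta table; the checkpoint enables restart without data loss.
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/telemetry")
    .outputMode("append")
    .start("/mnt/lake/bronze/telemetry")
)
```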
Skills & Experience:
- Experience:
  - 6-8 years of hands-on experience in data engineering, with a focus on building data pipelines and working with cloud-based data solutions (preferably Azure Databricks).
  - Advanced knowledge of Spark (Scala/PySpark), SQL, and cloud platforms such as Azure.
  - Familiarity with the design, development, and maintenance of large-scale data storage solutions (Hadoop, NoSQL databases, Data Lakes).
  - Experience in mentoring junior developers and working in Agile development teams.
- Technical Skills:
  - Advanced proficiency in SQL and Spark.
  - Expertise in data pipeline design and automation.
  - Knowledge of cloud data services (Azure Databricks, ADLS, EventHub, EventGrid).
  - Experience with CI/CD tools for pipeline deployment automation.
  - Familiarity with big data tools such as Hive, Kafka, HBase, and the use of Delta Lake.
  - Experience in building and optimizing ETL/ELT pipelines (a Delta Lake upsert sketch follows this list).
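As a sketch of an optimized ELT pattern on Delta Lake (assuming a Databricks or delta-spark runtime): an incremental batch is merged into a curated table as a single atomic upsert instead of a full rewrite. The key column and paths are hypothetical.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("orders_upsert").getOrCreate()

# Incremental batch of new/changed rows (placeholder staging path).
updates = spark.read.parquet("/mnt/lake/staging/orders_changed")

# Existing curated Delta table (placeholder path).
target = DeltaTable.forPath(spark, "/mnt/lake/silver/orders")

# Upsert in one atomic commit: update matched keys, insert new ones.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```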
This role requires collaboration with stakeholders in the US, with an expected overlap of 2-3 hours during EST working hours on an as-needed basis.
Job Systems/Information Technology
Organization Cummins Inc.
Role Category Remote
Job Type Exempt - Experienced
ReqID 2412312
Relocation Package No