
Overview
PySpark Data Engineer
As a PySpark developer, you must have 2+ years of experience in PySpark.
Strong programming experience in Python and PySpark; Scala is preferred.
Experience in designing and implementing CI/CD, Build Management, and Development strategy.
Experience with SQL and SQL analytical functions, and experience participating in key business, architectural, and technical decisions.
Scope to get trained on AWS cloud technology.
Python Developer
- As a Python developer, you must have 2+ years of experience in Python/PySpark.
- Strong programming experience in Python and PySpark; Scala is preferred.
- Experience in designing and implementing CI/CD, Build Management, and Development strategy.
- Experience with SQL and SQL analytical functions, and experience participating in key business, architectural, and technical decisions.
- Scope to get trained on AWS cloud technology.
As a senior software engineer with Capgemini, you will have 3+ years of experience in Scala with a strong project track record.
Hands-on experience as a Scala/Spark developer.
Hands-on SQL writing skills on RDBMS (DB2) databases.
Experience in working with different file formats like JSON, Parquet, Avro, ORC, and XML.
Must have worked on an HDFS platform development project.
Proficiency in data analysis, data profiling, and data lineage
Strong oral and written communication skills
Experience working in Agile projects.
Data Modeler
Good knowledge of and expertise in data structures and algorithms, calculus, linear algebra, machine learning, and modeling.
Experience in data warehousing concepts, including star schema, snowflake schema, or Data Vault, for data marts or data warehouses.
Experience using data modeling software like Erwin, ER/Studio, or MySQL Workbench to produce logical and physical data models.
Knowledge of enterprise databases such as DB2, Oracle, PostgreSQL, MySQL, and SQL Server.
Hands-on knowledge of and experience with tools and techniques for analysis, data manipulation, and presentation (e.g. PL/SQL, PySpark, Hive, Impala, and other scripting tools).
Experience with Software Development Lifecycle using the Agile methodology.
Knowledge of agile methods (SAFe, Scrum, Kanban) and tools (Jira or Confluence)
Expertise in conceptual modelling; ability to see the big picture and envision possible solutions
Experience in working in a challenging, fast-paced environment
Excellent communication & stakeholder management skills.
Design, develop, and optimize PL/SQL procedures, functions, triggers, and packages.
Write efficient SQL queries, joins, and subqueries for data retrieval and manipulation.
Develop and maintain database objects such as tables, views, indexes, and sequences.
Optimize query performance and troubleshoot database issues to improve efficiency.
Work closely with application developers, business analysts, and system architects to understand database requirements.
Ensure data integrity, consistency, and security within Oracle databases.
Develop ETL processes and scripts for data migration and integration.
Document database structures, stored procedures, and coding best practices.
Stay up-to-date with Oracle database technologies, best practices, and industry trends.
Skills: PySpark, Python, Scala, data modeler, PL/SQL