Bangalore, Karnataka, India
Information Technology
Full-Time
Acme Services
Job Description
- Big Data Infrastructure & Pipelines:
  - Architect, develop, and optimize scalable data pipelines using Apache Spark, Hadoop, Apache Flink, and similar technologies to handle petabyte-scale datasets.
  - Build and maintain real-time data streaming pipelines using Apache Kafka, Apache Pulsar, or AWS Kinesis for processing low-latency, high-volume financial market data.
  - Implement batch data processing systems using Apache Spark and MapReduce for high-throughput, fault-tolerant data analytics.
  - Bring hands-on experience with streaming data pipelines, covering both real-time and near-real-time processing (see the sketch below).
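To make the streaming items above concrete, here is a minimal sketch of a Spark Structured Streaming job that consumes market ticks from Kafka. The broker address, topic name, and tick schema are illustrative assumptions, and the job presumes the Spark-Kafka connector package is available.

```python
# Minimal sketch: consume market ticks from Kafka with Spark Structured Streaming.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("market-data-stream").getOrCreate()

# Illustrative tick schema; a real feed would define this from the source contract.
tick_schema = StructType([
    StructField("symbol", StringType()),
    StructField("price", DoubleType()),
    StructField("ts", TimestampType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "market-ticks")               # placeholder topic
    .load()
)

# Kafka delivers raw bytes; parse the JSON payload into typed columns.
ticks = (
    raw.select(from_json(col("value").cast("string"), tick_schema).alias("t"))
    .select("t.*")
)

# Console sink keeps the sketch self-contained; production would use a durable sink.
query = ticks.writeStream.outputMode("append").format("console").start()
query.awaitTermination()
```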
- Cloud Architecture:
  - Design and deploy cloud-native data solutions on AWS (S3, EMR, Lambda, Glue, Redshift), Google Cloud Platform (BigQuery, Dataproc, Dataflow), or Microsoft Azure (Azure Data Lake, Synapse Analytics).
  - Use infrastructure-as-code tools such as Terraform or AWS CloudFormation to automate the provisioning and management of cloud resources (see the sketch below).
  - Ensure cloud cost optimization and scalability for large-scale data processing tasks.
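The listing names Terraform and CloudFormation; to keep all examples in one language, the same infrastructure-as-code idea is sketched below with the AWS CDK for Python (v2). The stack and bucket are illustrative assumptions, not this team's actual setup.

```python
# Minimal AWS CDK (v2) sketch: declare a versioned S3 data-lake bucket as code.
from aws_cdk import App, RemovalPolicy, Stack
from aws_cdk import aws_s3 as s3
from constructs import Construct

class DataLakeStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Versioned bucket for raw data; names and settings are placeholders.
        s3.Bucket(
            self,
            "RawMarketData",
            versioned=True,
            removal_policy=RemovalPolicy.RETAIN,
        )

app = App()
DataLakeStack(app, "DataLakeStack")
app.synth()  # emits a CloudFormation template, which `cdk deploy` provisions
```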
- ETL Development:
  - Develop robust ETL (Extract, Transform, Load) pipelines using tools like Airflow, AWS Glue, or Dataflow, ensuring efficient ingestion and transformation of structured and unstructured data from multiple sources (see the DAG sketch below).
  - Clean, transform, and enrich financial datasets (e.g., stock prices, options data, market feeds) for downstream analysis and machine learning model training.
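As one possible shape for such a pipeline, here is a minimal Airflow DAG sketch (assuming Airflow 2.4+); the DAG id, schedule, and task bodies are placeholders.

```python
# Minimal Airflow DAG sketch: a daily extract -> transform -> load chain.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw market data from the upstream feed (placeholder)")

def transform():
    print("clean and enrich the raw records (placeholder)")

def load():
    print("write the enriched data to the warehouse (placeholder)")

with DAG(
    dag_id="market_data_etl",        # illustrative name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Dependencies: extract must finish before transform, transform before load.
    t_extract >> t_transform >> t_load
```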
- Data Storage & Management:
  - Implement and manage data lakes and data warehouses (e.g., AWS Redshift, Google BigQuery, Snowflake) for storing large datasets with optimal querying performance (see the partitioning sketch below).
  - Design database schemas and optimize performance for both OLAP and OLTP workloads.
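A common technique behind good query performance on a data lake is partition pruning. The PySpark sketch below writes trades partitioned by trade date so that date-bounded queries skip unrelated files; the paths and column names are illustrative assumptions.

```python
# Minimal PySpark sketch: lay out a dataset so date-bounded queries prune partitions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lake-layout").getOrCreate()

# Placeholder input path; assume a trade_date column exists in the data.
trades = spark.read.parquet("s3://example-bucket/raw/trades/")

(
    trades.repartition("trade_date")   # co-locate each date's rows before writing
    .write.partitionBy("trade_date")   # one directory per trade_date value
    .mode("overwrite")
    .parquet("s3://example-bucket/curated/trades/")
)

# Queries filtering on trade_date now read only the matching directories.
spark.read.parquet("s3://example-bucket/curated/trades/") \
    .where("trade_date = '2024-01-02'") \
    .show()
```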
- Collaboration & Optimization:
  - Collaborate with data scientists, analysts, and other engineers to ensure the data platform supports advanced analytics, machine learning, and reporting requirements.
  - Continuously monitor data pipelines for performance bottlenecks and optimize for speed, scalability, and cost efficiency.
- Data Governance & Security:
  - Implement data governance policies to ensure data quality, accuracy, and consistency.
  - Ensure data security and compliance with relevant regulations, especially when dealing with sensitive financial data.
  - Monitor and audit data access, implementing role-based access control and encryption standards (see the sketch below).
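As a toy illustration of the role-based access control item above, the Python sketch below gates dataset reads by role and prints an audit line per access. The roles, permission strings, and helper functions are hypothetical; real deployments enforce this in the warehouse or IAM layer.

```python
# Hypothetical RBAC sketch: gate dataset reads by role and audit every access.
from dataclasses import dataclass

# Illustrative role-to-permission mapping (not a real policy).
ROLE_PERMISSIONS = {
    "analyst": {"read:curated"},
    "engineer": {"read:raw", "read:curated", "write:curated"},
}

@dataclass
class User:
    name: str
    role: str

def can_access(user: User, action: str) -> bool:
    """True only if the user's role grants the requested action."""
    return action in ROLE_PERMISSIONS.get(user.role, set())

def read_dataset(user: User, dataset: str) -> None:
    action = f"read:{dataset}"
    if not can_access(user, action):
        raise PermissionError(f"{user.name} ({user.role}) may not {action}")
    print(f"AUDIT: {user.name} read {dataset}")  # audit trail for every access

read_dataset(User("priya", "analyst"), "curated")  # allowed
try:
    read_dataset(User("priya", "analyst"), "raw")  # denied
except PermissionError as err:
    print(err)
```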