Verna, Goa, India
Information Technology
Full-Time
Sii Poland
Overview
We are looking for a highly skilled and dedicated Data Engineer to join our AI solutions development squad. This team will be responsible for building cutting-edge applications leveraging Large Language Models (LLMs), covering the entire development lifecycle from concept to deployment and operations. The Data Engineer will design, build, and maintain robust data infrastructure to support AI applications. The ideal candidate will have expertise in structured and unstructured data, vector databases, real-time data processing, and cloud-based AI solutions using AWS or Azure. Experience with Galileo is essential to ensure seamless data integration and optimization.
Benefits For You
Diverse portfolio of clients
Wide portfolio of technologies
Employment stability
Remote work opportunities
Contracts with the biggest brands
Great Place to Work Europe
Many experts you can learn from
Open and accessible management team
Your tasks
- Collaborating with AI engineers, data scientists, and product owners to integrate LLMs into scalable, robust, and ethical applications
- Designing and implementing scalable, high-performance data pipelines for AI/GenAI applications, ensuring efficient data ingestion, transformation, storage, and retrieval
- Managing vector databases (e.g. AWS OpenSearch, Azure AI Search) to store and retrieve high-dimensional data for Generative AI workloads
- Building and maintaining cloud-based data solutions using AWS (OpenSearch, S3) or Azure (Azure AI Search, Azure Blob Storage)
- Developing ETL/ELT pipelines to enable real-time and batch data processing
- Optimizing data storage, retrieval, and processing strategies for efficiency, scalability, and cost-effectiveness
Requirements
- At least 4 years of experience with programming languages focused on data pipelines, such as Python or R
- Minimum 4 years of experience working with SQL
- Over 3 years of experience in data pipeline maintenance, including handling various types of storage (filesystem, relational, MPP, NoSQL) and working with structured and unstructured data
- Minimum 3 years of experience in data architecture concepts, including data modeling, metadata management, workflow management, ETL/ELT, and real-time streaming
- More than 3 years of experience with cloud technologies emphasizing data pipelines (e.g., Airflow, Glue, Dataflow, Redshift, BigQuery, Lambda, S3) and familiarity with Galileo for seamless data integration
- Strong problem-solving skills and ability to develop innovative solutions for AI-driven data workflows
Talk to us
Feel free to call, email, or hit us up on our social media accounts.
Email
info@antaltechjobs.in