Overview
Onsite: Hybrid, 3 days per week from the office in Whitefield.
Job Responsibilities and Requirements
1. Python Programming Expertise
- Proficient in core Python and its advanced libraries such as NumPy, Pandas, Scikit-learn, TensorFlow, and PyTorch.
- Familiarity with Python-based web frameworks, particularly Flask, for building REST APIs and web applications.
- Ability to write clean, efficient, and reusable code following best practices.
2. Generative AI and LLMs
- Strong knowledge of training and fine-tuning generative AI models such as GPT, T5, or similar large language models.
- Familiarity with techniques like transfer learning, prompt engineering, and embeddings for LLM optimization.
- Experience with neural engines or accelerators (e.g., Tensor Processing Units (TPUs), NVIDIA GPUs, or other AI chips).
- Understanding of Advanced Generative Intelligence (AGI) concepts and their practical applications.
3. Machine Learning Operations (MLOps)
- Hands-on experience with MLOps pipelines for automating and managing ML workflows.
- Knowledge of model versioning, monitoring, and deployment tools like MLflow, Kubeflow, or DVC.
- Proficiency in CI/CD pipelines for machine learning and integrating ML models into production environments.
4. Cloud Deployment and Infrastructure
- Experience deploying machine learning models and applications on AWS (e.g., S3, EC2, Lambda, SageMaker, Elastic Beanstalk) and Azure (e.g., Azure ML, AKS, App Services).
- Proficiency in managing cloud services for large-scale deployment and ensuring high availability.
- Ability to configure and optimize infrastructure for cost efficiency and performance on cloud platforms.
- Strong knowledge of containerization using Docker and managing scalable microservices in production.
- Familiarity with serverless architectures and event-driven design for ML applications.
5. Kubernetes and Cloud Orchestration
- Strong understanding of Kubernetes for orchestrating containerized applications and models.
- Hands-on experience managing deployments in cloud-native environments (EKS on AWS, AKS on Azure).
- Experience with monitoring tools like Prometheus and Grafana for managing clusters.
6. Large-Scale Model Deployment
- Expertise in scaling generative models and optimizing for real-time use cases.
- Familiarity with distributed computing frameworks and parallelization techniques.
- Hands-on experience with tools like Hugging Face, LangChain, or DeepSpeed.
- Ability to optimize models for latency, throughput, and cost in production.
7. Additional Skills
- Solid grasp of neural network architectures (CNN, RNN, Transformer-based models).
- Excellent debugging and optimization skills for large model inference.
- Familiarity with low-level hardware acceleration or neural processing engines.
Soft Skills
- Strong problem-solving and analytical skills.
- Ability to work collaboratively in a cross-functional team.
- Excellent written and verbal communication skills for explaining technical concepts.
Job Type: Full-time
Schedule:
- Monday to Friday
Experience:
- Total work: 5 years (Preferred)
Work Location: In person