Overview
We are seeking a skilled and innovative Data Scientist to join our team, specializing in the development of AI agents, Retrieval-Augmented Generation (RAG) models, and cutting-edge technologies like LangChain and OpenAI. This is a hands-on role where you will leverage large language models (LLMs) to build intelligent systems that can handle complex tasks and provide actionable insights. You will work alongside engineers and product teams to design, deploy, and optimize AI models that integrate both structured and unstructured data.
Key Responsibilities:
AI Agent Design & Development:
- Design and develop intelligent AI agents using OpenAI GPT models, LangChain, and other advanced LLMs (e.g., GPT-4, Gemini, and Llama).
- Implement and optimize multi-agent systems that use Retrieval-Augmented Generation (RAG) techniques to access external data sources and improve agent performance.
- Collaborate with the engineering team to build AI agents capable of completing complex tasks autonomously.
Machine Learning & Model Development:
- Develop, train, and fine-tune machine learning models for various applications, including natural language processing (NLP), text generation, and reinforcement learning.
- Implement RAG frameworks that enable models to retrieve and incorporate real-time contextual information from external data stores (e.g., FAISS, Pinecone).
- Design and deploy LLM-based solutions for various business problems, optimizing them for efficiency and accuracy.
Data Preparation & Feature Engineering:
- Clean, preprocess, and transform large datasets from both structured (SQL/NoSQL) and unstructured (text, documents, etc.) sources.
- Engineer features that improve model performance, particularly in the context of AI agents and LLMs.
- Develop data pipelines that support continuous updates to the agents' knowledge base, ensuring that agents have access to up-to-date information.
Collaboration & Cross-Functional Support:
- Work closely with data engineers, product managers, and business teams to define project goals, build solutions, and integrate AI models into the company's product offerings.
- Communicate technical results and insights clearly to stakeholders, providing actionable recommendations based on data-driven insights.
Research & Innovation:
- Stay up to date with the latest advancements in AI, NLP, machine learning, and RAG techniques.
- Explore and experiment with emerging AI frameworks and APIs, such as LangChain and the OpenAI API, to improve model performance and capabilities.
- Contribute to R&D efforts to develop novel AI solutions and enhance the current system.
Experience: Minimum of 2 years of experience in data science or machine learning, with a focus on traditional AI, advanced AI, or similar fields.
Technical Skills:
- Proficiency in Python for data analysis, machine learning, and model development.
- Experience with OpenAI models (e.g., GPT-3, GPT-4) and understanding of their use cases and limitations.
- Hands-on experience with LangChain (including LangChain Tools), Hugging Face models, and LLMs, and with building multi-agent systems using these frameworks.
- Strong experience with machine learning libraries (e.g., scikit-learn, TensorFlow, PyTorch).
- Knowledge of RAG (Retrieval-Augmented Generation) techniques and tools like FAISS, Pinecone, or other vector databases for information retrieval.
- Familiarity with data visualization tools (e.g., Matplotlib, Seaborn, Tableau) to present findings effectively.
- Solid understanding of NLP techniques such as text generation, sentiment analysis, and language modeling.
- Good to have: Experience with cloud platforms like AWS, Google Cloud, or Azure for deploying models in production.
Data Engineering & Database Knowledge:
- Experience working with both SQL (e.g., PostgreSQL, MySQL) and NoSQL (e.g., MongoDB, Cassandra) databases.
- Understanding of data pipelines and ETL processes for handling large-scale datasets.
Communication & Problem-Solving:
- Strong written and verbal communication skills, with the ability to explain technical concepts to non-technical audiences.
- Excellent problem-solving and analytical abilities with a focus on applying machine learning techniques to real-world business challenges.
Desired Skills and Experience:
- Familiarity with deep learning frameworks such as Keras, TensorFlow, and PyTorch for advanced AI applications.
- Good to have: Experience with big data technologies like Hadoop or Spark for processing large datasets.
- Knowledge of reinforcement learning (RL) and decision-making models for autonomous AI systems.
- Familiarity with deploying models and applications in production environments using tools like Docker, Kubernetes, and CI/CD pipelines.
Education:
- Bachelor's degree in Computer Science, Data Science, Mathematics, Statistics, Engineering, or a related field (Master's degree preferred).
Location: Mumbai City (Work from Office Only)
Availability: Candidates should be available to join immediately or within 30 days.
Job Type: Full-time
Pay: ₹1,200,000.00 - ₹2,000,000.00 per year
Schedule:
- Day shift
Ability to commute/relocate:
- Mumbai, Maharashtra: Reliably commute or plan to relocate before starting work (Required)
Application Question(s):
- Are you available to join within 30 days of selection?
Experience:
- Data Science: 2 years (Required)
Work Location: In person