Overview
Suhora is a cutting-edge technology firm that leverages satellite imagery, big data, and AI to solve problems concerning the Earth. We specialize in integrated all-weather, day-and-night solutions that combine Synthetic Aperture Radar (SAR), Optical, and Thermal data. Our mission is to use technology and our expertise in geospatial intelligence to make this world a better and more sustainable place.
At Suhora, we are committed to delivering innovative solutions that help our clients make data-driven decisions, whether it’s for environmental monitoring, agriculture, disaster response, or infrastructure development. We believe that our expertise can make a tangible difference in addressing global challenges, and we’re looking for individuals who are passionate about technology and sustainability to join our team.
For more detailed information, visit our website: www.suhora.com
Job Summary:
We are seeking a Machine Learning Engineer with a focus on Large Language Models (LLMs), Natural Language Processing (NLP), and Retrieval-Augmented Generation (RAG). The ideal candidate will have hands-on experience with current LLMs, including Meta's Llama models, and with applying RAG to build AI systems. You will work on projects that combine NLP techniques with geospatial data, building systems that process, analyze, and generate insights for geospatial intelligence applications.
Responsibilities:
- Develop LLMs & NLP Models:
- Design, develop, and fine-tune Llama models, RAG architectures, and other LLMs (e.g., GPT, BERT) to process geospatial data, reports, and metadata and generate text-based insights from them.
- Build and integrate NLP models capable of performing information retrieval, extraction, and classification tasks from geospatial data reports and documents.
- Implement Retrieval-Augmented Generation (RAG):
- Design and implement RAG systems that enhance the performance of LLMs by integrating external knowledge sources for generating context-aware, accurate, and useful results.
- Fine-tune Llama or other LLMs within RAG architectures so that responses are grounded in relevant information retrieved from large, unstructured datasets.
- Text Analytics & Information Extraction:
- Implement advanced NLP techniques for extracting key insights from unstructured geospatial text, such as location-based data, satellite data descriptions, and environmental reports.
- Integrate LLMs with structured geospatial data, such as GeoTIFFs, shapefiles, or other GIS data formats, to provide actionable insights.
- Model Training and Optimization:
- Train large-scale Llama models and RAG systems to handle diverse text datasets and optimize them for performance in real-time applications.
- Ensure that models are efficient, scalable, and capable of processing massive volumes of geospatial-related text data.
- Cross-Functional Collaboration:
- Work closely with data scientists, software engineers, and domain experts to integrate NLP and LLM models into production pipelines.
- Continuously contribute to R&D efforts, exploring and implementing new advancements in Llama, RAG, and NLP technologies.
Experience:
- 3-5 years of experience in machine learning, NLP, and LLMs, with specific hands-on experience working with Llama models or similar LLM technologies (e.g., GPT, BERT).
- Experience in implementing Retrieval-Augmented Generation (RAG) techniques for improving model performance.
Technical Expertise:
- Strong proficiency in Python and experience with NLP libraries and frameworks such as Hugging Face Transformers, spaCy, PyTorch, or TensorFlow.
- Hands-on experience with Llama (Meta AI) or similar transformer-based models.
- Proficiency in text processing techniques, including tokenization, named entity recognition (NER), sentiment analysis, and semantic search.
Problem Solving & Collaboration:
- Ability to analyze complex NLP tasks and translate them into scalable and efficient model solutions.
- Strong collaboration skills to work with engineers and domain experts in delivering end-to-end NLP solutions.
- Experience with geospatial data or knowledge of geospatial intelligence.
- Familiarity with cloud platforms for model deployment (AWS, Google Cloud, Azure).
- Knowledge of data augmentation, model fine-tuning, and reinforcement learning techniques.
- Experience with Docker and Kubernetes for deploying machine learning models.
Bachelor’s or Master’s degree in Computer Science, Data Science, Artificial Intelligence, or related fields.