We are seeking a Generative AI Developer to design, build, and scale next-generation AI systems. You will go beyond simple API integration to architect RAG (Retrieval-Augmented Generation) pipelines, fine-tune LLMs (Large Language Models), and develop agentic workflows in which AI autonomously handles multi-step tasks. You will be responsible for the system around the model, ensuring reliability, cost-efficiency, and ethical safety.
Responsibilities:
- Agentic Orchestration: Design and implement AI agents that use tools (APIs, databases) to solve complex, multi-step business problems.
- RAG Architecture: Build and optimize high-performance RAG pipelines using vector databases (e.g., Pinecone, Weaviate, or Milvus) to provide AI with long-term memory and factual grounding.
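- Model Fine-Tuning: Customize pre-trained models for domain-specific accuracy, using parameter-efficient techniques such as LoRA and QLoRA for open-weight models (e.g., Llama 3) and provider-managed fine-tuning, where available, for hosted models (e.g., GPT-4 or Claude).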
- Prompt Engineering: Develop advanced prompt strategies (Chain-of-Thought, Few-Shot) and version-control them as first-class software artifacts.
- Evaluation & Observability: Build evaluation ("eval") frameworks that measure model performance, hallucination rates, and latency, so that deployed systems meet production-grade reliability standards.
- LLMOps & Deployment: Collaborate with DevOps to containerize (Docker/Kubernetes) and deploy models on cloud platforms (AWS Bedrock, Azure AI, or Google Vertex AI).
Required Technical Skills:
- Programming: Mastery of Python (FastAPI, PyTorch, TensorFlow).
- Frameworks: Proficiency in LangChain, LlamaIndex, or Haystack.
- Vector Databases: Experience with Pinecone, FAISS, or ChromaDB.
- Model Expertise: Hands-on experience with proprietary LLMs (OpenAI, Anthropic) and open-source models (Mistral, Llama).
- Data Engineering: Ability to build pipelines for data cleaning, chunking, and embedding.
- Cloud Platforms: Familiarity with AI services on AWS, GCP, or Azure.