Responsibilities:

On a day-to-day basis, you will:
  • Architect and lead the design of complex AI systems involving multi-agent orchestration, large-scale RAG pipelines, and production LLM infrastructure
  • Own end-to-end delivery of critical AI features from conception to production, including design docs, implementation, and rollout strategy
  • Drive technical direction for AI platform components: establish patterns, frameworks, and best practices across the engineering org
  • Build high-scale data pipelines that process millions of records, manage embeddings at scale, and optimize for cost and latency
  • Mentor SDE-II and junior engineers through code reviews, design discussions, and pairing sessions
  • Lead evaluation and safety initiatives: design robust eval frameworks, implement guardrails, and ensure AI quality at scale
  • Collaborate cross-functionally with Product, Data Science/Engineering, and Platform teams to shape the roadmap and technical strategy
  • Optimize production systems for performance, cost, and reliability; troubleshoot complex production issues

Skill Set:

  • 5+ years of software engineering experience, including 2+ years building production LLM/GenAI systems at scale
  • Expert-level Python skills including async programming, performance optimization, and production-grade testing
  • Deep hands-on experience with ML/LLM frameworks: PyTorch, Hugging Face, LangChain, LlamaIndex, or Ray
  • Proven experience building and scaling vector search systems (Elasticsearch, Pinecone, Weaviate, FAISS)
  • Strong system design skills: microservices, distributed systems, event-driven architectures (Kafka/SQS/Kinesis)
  • Production experience with Kubernetes, Docker, and cloud infrastructure (AWS preferred)
  • Expertise in LLM optimization: prompt engineering, fine-tuning, embeddings, RAG, token management, and inference optimization
  • Track record of technical leadership: driving architecture decisions, mentoring engineers, and shipping complex projects
  • Excellent communication: able to influence technical decisions and collaborate effectively across teams
  • Experience with high-throughput inference using Ray, Triton, vLLM, or TensorRT
  • Background in LLM evaluation (RAGAS, custom harnesses, human-in-the-loop workflows)
  • Deep knowledge of multi-agent systems and agentic workflows
  • Experience optimizing AI costs at scale (prompt caching, batch processing, model selection)
  • Contributions to open-source AI projects or published research
  • Prior experience in high-growth startups or building 0-to-1 AI products

Job Summary

Company: Clari
Location: Bengaluru, India
Type: Full-Time
Level: Mid-level
Domain: AI / Data Science