What’s in it for you?

  • Own the end-to-end qualification lifecycle for AI/LLM systems, from ideation and implementation through CI/CD integration.
  • Design and implement scalable automated test suites across unit, integration, regression, and system levels.
  • Build and enhance frameworks to test, evaluate, and continuously improve complex AI and LLM workflows.
  • Lead the design and automation of LLM-powered features, including prompt pipelines, RAG workflows, and AI-assisted developer tools.
  • Develop evaluation pipelines to measure factual accuracy, hallucination rates, bias, robustness, and overall model reliability.
  • Define and enforce metrics-driven quality gates and experiment tracking workflows to ensure consistent, data-informed releases.
  • Collaborate with agile engineering teams, participating in design discussions, code reviews, and architecture decisions to drive testability and prevent defects early (“shift left”).
  • Develop monitoring and alerting systems to track LLM production quality, safety, and performance in real time.
  • Conduct robustness, safety, and adversarial testing to validate AI behavior under edge cases and stress scenarios.
  • Continuously improve frameworks, tools, and processes for LLM reliability, safety, and reproducibility.
  • Mentor junior engineers in AI testing, automation, and quality best practices.
  • Measure and improve Developer Experience (DevEx) through tools, feedback loops, and automation.
  • Champion quality engineering practices across the organization, ensuring delivery meets business goals, user experience expectations, and operating cost targets.

We’d love to hear from you if you have experience with:

  • LLM testing & evaluation tools: MaximAI, OpenAI Evals, TruLens, Promptfoo, LangSmith
  • Building LLM-powered apps: prompt pipelines, embeddings, RAG, AI workflows
  • CI/CD design for application + LLM testing
  • API, performance, and system testing
  • Git, Docker, and cloud platforms (AWS / GCP / Azure)
  • Bias, fairness, hallucination detection & AI safety testing
  • Mentorship and cross-functional leadership

Job Summary

Company: Mindtickle
Location: Pune, Maharashtra
Type: Full-Time
Level: Mid-level
Domain: Other