Senior AI/ML Engineer (AI Platform)
Boston, MA | Full-Time | Senior | Software Engineering
RESPONSIBILITIES:
- Design, build, and operate production AI systems and scaffolding around language models that power conversational, predictive, and generative capabilities across WHOOP products.
- Lead end-to-end AI system initiatives spanning problem definition, data flows, dataset design, evaluation harnesses, deployment, and iteration in close partnership with data science and product.
- Build and maintain pipelines for collecting, curating, and reshaping messy, multi-source data into high-quality, well-structured training and evaluation datasets for language model–based systems.
- Operationalize fine-tuning and evaluation workflows for large language models behind member-facing features such as WHOOP Coach and AI Support, including defining datasets, labels, and taxonomies that reflect real member needs.
- Develop tooling and frameworks that make experimentation, offline/online evaluation, and model deployment faster, safer, and more repeatable, including robust observability for AI features in production.
- Build and maintain feedback loops that connect real member interactions, offline evaluations, and training data updates so that models improve continuously based on real-world behavior.
- Mentor other engineers and data scientists, share best practices in applied AI/ML, and help elevate the overall technical bar of the AI Platform team.
QUALIFICATIONS:
- 3+ years of experience in applied machine learning, AI engineering, or ML-focused software engineering roles, including significant work in production environments.
- Hands-on experience building with modern language models (open-weight or API-based), including prompt design, fine-tuning, and rigorous evaluation.
- Solid working understanding of ML fundamentals (dataset construction, feature engineering, training workflows, evaluation metrics, experiment design) sufficient to make good engineering tradeoffs and partner effectively with data scientists.
- Familiarity with modern LLM training and alignment techniques such as supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning (RL), and how they influence data requirements, evaluation strategies, and system design in production.
- Proven track record of building, shipping, and operating ML-powered systems end to end: from data pipelines (batch and/or streaming) that transform large datasets into usable training and evaluation sets, to production deployments with inference optimization, observability, and lifecycle management.
- Strong proficiency in data manipulation and analysis, including working with messy, multi-source, and semi-structured data and translating product questions into well-defined datasets, labels, and evaluation splits.
- Familiarity with best practices for building secure, privacy-aware AI systems and for working with sensitive data.
- Excellent communication and collaboration skills, with the ability to influence across teams and drive alignment on technical direction.
