Sr. Software Engineer, Machine Learning Infrastructure
Palo Alto, California | Full-Time | Mid-level | AI / Data Science
What You'll Do:
- Build and evolve robust, scalable ML infrastructure that supports ML engineers across all Tinder business domains
- Set and drive the long-term technical direction for Tinder’s ML infrastructure
- Design, build, and operate production-grade ML serving infrastructure for ML models using Ray Serve and Triton
- Develop and maintain robust serving infrastructure specialized for serving large language models (LLMs) in-house
- Develop an efficient ML serving platform using Ray Serve and Triton
- Build the foundation of Tinder’s feature store using Databricks and internal tooling
- Own infrastructure projects end to end, from design and implementation to adoption and measurable impact
- Partner closely with ML Engineers, ML Software Engineers, and CloudOps to ensure infrastructure directly enables better models and faster iteration
- Establish and propagate best practices in ML infrastructure, data engineering, and model serving
- Mentor and support junior engineers, raising the technical bar across the team
What You'll Need:
- Bachelor’s degree in Computer Science, Engineering, Technology, or a related field.
- 5+ years of experience building or operating ML platforms, including training, serving, feature management, or experimentation systems.
- Hands-on experience designing, building, or running feature stores at scale.
- Strong software engineering fundamentals, with proficiency in Python and at least one of Java, Scala, Go, or a similar language.
- Practical experience with ML serving platforms such as Triton, Ray Serve, or Seldon.
- Solid grasp of core machine learning concepts, including model training, evaluation, validation, and performance measurement.
- Proven ability to lead cross-functional initiatives and work effectively across ML, infrastructure, and product teams.
- Deep experience in distributed systems, cloud infrastructure, and MLOps, with hands-on exposure to transformers and modern deep learning architectures.
- Ability to bridge the gap between cutting-edge ML research and reliable, production-grade systems.
