Senior Machine Learning Platform Engineer
New York, New York | Full-Time | Senior | AI / Data Science
Responsibilities
- Own or contribute to technical designs for the feature platform, training platform, serving platform, AI observability platform, and GenAI platform, along with the underlying operational infrastructure, to enable product impact.
- Develop, maintain, and enhance frameworks for AI/ML model development and deployment while establishing and driving best practices in MLOps / GenAI engineering.
- Design, advocate for, and implement solutions that prioritize usability, reliability, scalability, operational excellence, and cost management while delivering incrementally.
- Collaborate closely with ML Engineers, Data Scientists, Data Engineers and Product Managers to understand their needs and identify opportunities to accelerate the AI/ML development and deployment process.
- Mentor and educate ML Engineers, Backend Engineers, and Product Managers on current and up-and-coming tools and technologies for ML operations and GenAI product development through presentations and documentation.
- Help design and architect an AI platform that adheres to the principles of responsible AI (Authenticity, Transparency, Equity) and builds privacy compliance into the platform.
- Lead build-vs-buy discussions on the technologies that underpin the platforms we support.
- Participate in on-call and Incident Management processes.
What We're Looking For
- 4+ years of experience, depending on education, as an ML, backend, data, or platform engineer developing and working with large-scale, complex systems.
- 2+ years of experience working in a cloud environment such as GCP, AWS, or Azure, and with DevOps tooling such as Kubernetes.
- 1+ year of experience leading projects with at least one other team member through to completion.
- 1+ year of experience designing and developing online, production-grade ML systems.
- A degree in computer science, engineering, or a related field.
- Strong programming skills: Proficiency in languages like Python, Go, or Java.
- System design & architecture: Ability to design scalable and efficient ML systems.
- Cloud platform proficiency: The ability to utilize cloud environments such as GCP, AWS, or Azure.
- ML knowledge: A basic understanding of ML algorithms, techniques, and best practices.
- Data engineering knowledge: Skills in handling and managing large datasets, including data cleaning, preprocessing, and storage.
- Collaboration and communication skills: The ability to work effectively in a team and communicate complex ideas clearly with individuals from diverse technical and non-technical backgrounds.
- Strong written communication: The ability to communicate complex ideas and technical knowledge through documentation.
- Software leadership skills: A track record of leading projects through to completion with measurable outcomes.
