DevOps Engineer
Noida | Full-Time | Mid-level | DevOps
Responsibilities:
- Design, build, and enhance the core components of state-of-the-art machine learning infrastructure (cloud and on-premises), and architect platforms to create, train, and deploy ML models.
- Build operational dashboards and charts to track system errors and performance and to enable root cause analysis.
- Identify gaps and evaluate relevant tools and technologies as needed to improve processes and systems, leveraging open-source and cloud computing technologies to build effective solutions.
- Collaborate with the AI team to drive ML projects from conception to completion and production monitoring.
Requirements:
- Bachelor's degree or higher, with a good academic background.
- 2-4 years of meaningful work experience in DevOps handling complex services.
- Strong troubleshooting skills to keep our services highly available.
- Strong expertise and experience with Google Cloud Platform (GCP), Docker, Kubernetes, CI/CD, and Jenkins.
- Extensive experience in designing, implementing, and maintaining infrastructure as code, preferably using Terraform.
- Experience creating and maintaining deployment manifest files for microservices using Helm.
- LLMOps or MLOps experience is a bonus.
- Strong expertise in deploying at scale on Kubernetes clusters using the Horizontal Pod Autoscaler (HPA).
- Broad technical background and experience with the architecture, design, and operation of cloud solutions, including how to meet security compliance requirements.
- Experience monitoring system health and ensuring security, scalability, and reliability.
- Experience designing, implementing, and maintaining observability, monitoring, logging, and alerting with tools such as Prometheus, Grafana, Promtail, Loki, and Datadog.
