Site Reliability Engineer
Israel - Kiryat Ono | Full-Time | Mid-level | DevOps
What You'll Do
- Protect, provide for, and progress the software and systems behind all of our Veeva Crossix services
- Ensure high uptime and reliability of Crossix’s production environments on AWS
- Take responsibility for managing the production environment, security, change management, deployment, architecture, and tools
- Perform root cause analysis for complex failures and offer modern solutions and tools
- Develop effective dashboards that provide key insights into system performance
- Analyze performance and stability issues and create in-house automation tooling
- Work closely with DevOps, R&D, product, and integration managers to enable automated CI/CD methods
- Analyze cloud infrastructure and application costs and raise ideas for cost saving
- Design, develop, and drive troubleshooting and mitigation tools as part of a self-healing agenda
- Continuously improve the technology stack to support data growth
- Work in a Big Data company with cutting-edge technologies
Requirements
- 3+ years of experience as SRE / DevOps in a production environment
- 2+ years of experience with scripting languages
- 2+ years of hands-on experience with cloud services
- 2+ years of experience with Infrastructure as Code (IaC) tools such as AWS CloudFormation, Terraform, or similar
- 2+ years of experience with CI/CD systems such as GitHub Actions and Jenkins
- 1+ years of experience with containerization and managing Kubernetes clusters
- 1+ years of experience with Linux
- 1+ years of experience with common networking, firewall, and load balancing protocols
- 1+ years of experience with SQL and relational database administration
- Team player
- Fast learner
- Strong communication skills
Nice to Have
- Experience managing NoSQL databases such as Elasticsearch and MongoDB
- Experience working with Big Data tools such as Spark
- Understanding of application security in cloud environments
