
RESPONSIBILITIES

  • Work with cross-functional teams across product, development, compliance, and security to understand the roadmap and translate it into scalable system designs
  • Define metrics for measuring our service availability and performance
  • Proactively build and implement tools and services that make developers and tech support better at their jobs
  • Ensure and promote security, high availability/zero downtime and scalability in all organizational and team implementations
  • Perform critical path analysis and single point of failure (SPOF) analysis
  • Develop runbooks for thorough service feature documentation and troubleshooting, e.g. to enable faster MTTR when a remittance failure happens
  • Organize training for Engineers to become part of On-call teams
  • Build and maintain AI-powered tools for enhancing the internal knowledge base and for applications like root cause analysis, using techniques like RAG
  • Build AI-powered tools that act as co-pilots for our Customer Support (CS) agents to offer best-in-class support
  • Ensure feature implementations are low maintenance by using continuous integration and continuous deployment methods, with a focus on automated test feedback and rollback
  • Identify existing open source or proprietary tools that can solve business problems, and evaluate them to inform the build-or-buy decision
  • Set the vision for engineering excellence in our production systems and inspire full-stack teams to implement the right design and monitoring standards
  • Use your strong knowledge of how infrastructure and application development interact to switch comfortably between architecture and hands-on feature development work as needed
  • Improve the production incident management system and ensure that actionable insights are derived from live site incident RCAs and postmortem sessions
  • Serve on the front line during live site incidents and stay calm under pressure to drive collaborative resolution

QUALIFICATIONS

  • B.S. or M.S. in Computer Science and 6+ years of software development experience
  • Strong software development fundamentals (data structures, algorithms, problem-solving, OO design, and systems architecture)
  • Strong understanding of object-oriented software development
  • Understanding of large and complex code bases, including API design techniques that help keep them clean and maintainable
  • Experience handling live production support on large-scale systems
  • Proficiency in one statically typed and one dynamically typed language, and good knowledge of frameworks such as Spring, Hibernate, and Node.js Express
  • Knowledge of multithreading and memory management specific to mobile devices and caching mechanisms
  • Passion for delivering excellence in software systems, with a can-do attitude
  • Experience implementing observability platforms using product suites such as Sumo Logic, Datadog, New Relic, ELK, or Prometheus
  • Good to have: experience with infrastructure automation and monitoring tools such as Terraform, Helm, Ansible, Puppet, and Chef

Job Summary

Company: Nium
Location: Mumbai
Type: Full-Time
Level: Staff
Domain: Software Engineering