Traceable
Site Reliability Engineering Manager
Job Location
Bangalore, India
Job Description
Responsibilities:
- Ensure the reliability of cloud-based distributed systems infrastructure and services built to scale seamlessly to tens of billions of events per day.
- Own the availability, performance, monitoring, emergency response, and capacity planning of the Traceable cloud services and infrastructure.
- Build and maintain ultra-modern infrastructure for CI/CD and DevOps.
- Debug and resolve production issues and escalations, working with the rest of the engineering team.
- Collaborate with product engineering teams across time zones on the design and operation of systems and services.

Requirements:
- Bachelor's or Master's degree in computer science.
- 8 years of work experience in SRE and DevOps with a modern cloud-native tech stack and distributed systems at massively large scale.
- Strong experience with cloud-native technologies (AWS/GCP, microservices, containers, Kubernetes, etc.) at scale.
- Strong experience in designing and operating massively large-scale data systems (Kafka, NoSQL databases, real-time streaming, etc.).
- Strong experience with streaming systems such as Kafka Streams or Flink.
- Hands-on experience in setting up, automating, and continuously improving deployment pipelines and CI/CD infrastructure.
- Strong experience with Linux systems.
- Strong experience in infrastructure deployment and provisioning as code using modern tools (Terraform, Helm, Ansible, etc.).
- Good expertise in Java and scripting.
- Strong troubleshooting and debugging skills for production issues and escalations.
- Experience working in a distributed team across different time zones.
- A self-starter with the ability to work effectively in teams and in a fast-paced startup setting.
- Excellent spoken and written communication.

Nice to have:
- Information security experience at modern SaaS companies in application security, cloud/infrastructure security, and shift-left security is a plus.
- You are passionate about running large-scale, multi-tenant distributed data systems for customers that expect a very high level of availability.
- You enjoy the challenge of leading a critical service and engineering it to work effectively in any circumstance. You feel frustrated when things go wrong and act proactively to prevent the next incident from happening.
- You are passionate about ultra-modern CI/CD and DevOps infrastructure that enables entire engineering and product teams to be highly productive and agile.
- You are passionate about big data systems, thinking about how to make them run as smoothly as possible, and want to have a big influence on the architecture and operational design areas of platform infrastructure. (ref:hirist.tech)
Location: Bangalore, IN
Posted Date: 11/28/2024
Contact Information
Contact: Human Resources, Traceable