U-SET

Senior Site Reliability Engineer - DevOps

Job Location

India

Job Description

- Deep understanding of SRE principles and experience in anomaly detection, root cause analysis, and predictive maintenance.
- Working knowledge of an automation-first approach and of defining SLIs, SLOs, and error budgets.
- Experience leading an operations team in an application production environment.
- Experience with scripting languages (Java, Python, PowerShell, VBScript).
- Working knowledge of Kubernetes and OpenTelemetry.
- Knowledge of Generative AI concepts, LLM fundamentals, and Responsible AI concepts.
- Knowledge of DevOps methodologies, tools, and automation: CI/CD pipelines and tools (GitHub, Terraform, ArgoCD, Helm, etc.) and infrastructure automation.
- Experience working with public and private clouds (AWS, Azure, GCP, Rancher, etc.).
- Proficiency in incident response, change and release processes, application monitoring, and platform optimisation.
- Define and implement effective observability solutions to proactively identify and resolve issues and drive optimisation.
- Define and manage the incident process, change and release management process, deployment process, and on-call and escalation process.
- Develop automation (IaC, alerts as code, dashboards as code, etc.) to increase efficiency and reduce toil.
- Conduct POCs to implement tools and solutions that support the Generative AI application platform.
- Analyse operational performance (incident, problem, and alert trends) and drive optimisation.
- Follow and implement SRE best practices and standards within the team.
- Document SOPs, processes, critical system information, KB articles, POCs, standards, and best practices for current and future reference.
- Provide technical guidance and mentorship to junior SRE team members.
- Stay updated with the latest advancements in the Generative AI space.

What you bring to the team

- Experience applying SRE principles and best practices to manage on-premises and cloud applications.
- Working knowledge of Generative AI applications.
- Ability to lead the team in continuous improvement, estimate work, and escalate issues on time.
- Strong analytical skills to identify and resolve complex technical issues, ensure system reliability, and minimize downtime.
- Strong communication and interpersonal skills to collaborate effectively with cross-functional teams.

(ref:hirist.tech)

Location: India

Posted Date: 11/27/2024

Contact Information

Contact Human Resources
U-SET

UID: 4887973071
