Arting Digital
Site Reliability Engineer
Job Location
Bangalore, India
Job Description
Posting title: Site Reliability Engineer
Experience: 7 Years
Location: Bangalore
Work mode: WFO
Primary skills: Cloud Monitoring & Operations (GCP & Azure), Python, ServiceNow
Qualification: Any Engineering/Computers degree

Roles & Responsibilities:

Daily Operations & Monitoring:
- Actively monitor systems, applications, and infrastructure across cloud environments (GCP & Azure).
- Ensure that service levels, such as uptime and performance, meet the expected standards.

Support Tickets & Issue Resolution:
- Work on support tickets raised by platform users, addressing technical problems and providing timely solutions to ensure smooth operations.

Incident Management:
- Lead the management and resolution of incidents, minimizing downtime and ensuring quick recovery.
- Manage the incident lifecycle from detection to resolution, coordinating across teams as necessary.

Root Cause Analysis & Problem Management:
- Perform root cause analysis for incidents and recurring problems to prevent future occurrences. Document findings and implement preventive measures to maintain service reliability.

Automation & Optimization:
- Write scripts and automation tools (primarily using Python) to reduce manual intervention and optimize operational tasks, driving efficiency and consistency.

Cloud Monitoring & Operations (GCP & Azure):
- Leverage your expertise in cloud technologies to monitor and manage resources in GCP and Azure environments.
- Ensure seamless integration, configuration, and scaling of cloud services.

ServiceNow Integration:
- Use ServiceNow for managing and tracking incidents, requests, and changes. Ensure proper documentation and ticket management following ITIL best practices.

Collaboration with Cross-functional Teams:
- Work closely with development, operations, and other engineering teams to maintain a unified approach to platform reliability and performance. Provide inputs for continuous improvement of the platform and processes.
Required Skills & Qualifications:

Cloud Monitoring & Operations:
- Proven experience in managing operations across Google Cloud Platform (GCP) and Microsoft Azure.
- Hands-on experience with cloud monitoring tools and techniques.

Incident Management:
- Experience in leading incident response efforts, coordinating across teams, and minimizing the impact of outages.

Scripting & Automation:
- Proficiency in Python for automation tasks. Knowledge of other scripting languages is a plus.

ServiceNow:
- Familiarity with ServiceNow for incident tracking and service management.
- Ability to integrate ServiceNow into existing operational workflows.

Problem-solving & Analytical Thinking:
- Strong skills in root cause analysis, problem management, and preventive maintenance.

Communication & Collaboration:
- Excellent communication skills with a focus on cross-team collaboration, customer service, and continuous improvement.

(ref:hirist.tech)
Location: Bangalore, IN
Posted Date: 11/22/2024
Contact Information
Contact: Human Resources, Arting Digital