Hirelo

Site Reliability Engineer - L2 & L3

Job Location

Bangalore, India

Job Description

Key Responsibilities:
- Incident Management: Provide L2 support for critical production incidents, perform root cause analysis, and implement effective solutions to minimize downtime.
- Automation and Infrastructure as Code (IaC): Develop and maintain automation scripts using Python, Bash, and Go to streamline operational tasks. Implement and manage IaC using Terraform and Ansible to automate infrastructure provisioning and configuration.
- UNIX Systems Administration: Manage and troubleshoot critical applications running in a UNIX environment, ensuring system stability and performance.
- Database Management: Administer and optimize production databases (Postgres, MySQL, Oracle) in both cloud and on-premise environments. Perform database backups, restores, and performance tuning.
- Cloud Infrastructure Management: Design, deploy, and manage infrastructure on AWS and/or Azure cloud platforms. Implement best practices for security, scalability, and cost optimization.
- Containerization and Orchestration: Deploy, manage, and troubleshoot Kubernetes clusters. Ensure high availability and scalability of containerized applications.
- Monitoring and Logging: Implement and maintain monitoring and logging solutions using the ELK stack (Elasticsearch, Logstash, Kibana) to proactively identify and resolve issues.
- Performance Tuning and Optimization: Analyze system performance metrics, identify bottlenecks, and implement solutions to optimize performance.
- Collaboration and Communication: Collaborate with cross-functional teams to resolve issues and implement improvements. Communicate effectively with stakeholders and provide clear and concise documentation.
- On-Call Support: Participate in an on-call rotation to provide 24/7 support for critical systems.
- Documentation: Create and maintain detailed documentation of systems, procedures, and troubleshooting steps.

Required Skills and Experience:
- Experience: 5-8 years of experience as an L2 Site Reliability Engineer, DevOps Engineer, or in a similar role.
- Scripting: Proficiency in scripting languages such as Python, Bash, and Go.
- Infrastructure as Code: Hands-on experience with Terraform and Ansible for infrastructure automation.
- UNIX Systems: Strong experience supporting critical applications in a UNIX environment.
- Database Management: Expertise in managing production databases (Postgres, MySQL, Oracle) in cloud and on-premise environments.
- Cloud Platforms: Extensive experience with AWS and/or Azure cloud environments.
- Containerization: Solid understanding of Kubernetes and containerization technologies.
- Monitoring and Logging: Experience with the ELK stack for monitoring and logging.
- Education: Bachelor's or Master's degree in Computer Science or a related field with 5 years of relevant experience.
- Problem-Solving: Excellent problem-solving and troubleshooting skills.
- Communication: Strong communication and collaboration skills.

Preferred Qualifications:
- Relevant certifications (e.g., AWS Certified DevOps Engineer, Kubernetes Administrator, Oracle Database Administrator).
- Experience with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI).
- Knowledge of networking concepts and protocols (TCP/IP, DNS, HTTP).
- Experience with configuration management tools (e.g., Chef, Puppet).
- Experience with other monitoring tools (Prometheus, Grafana).

(ref:hirist.tech)

Location: Bangalore, IN

Posted Date: 5/9/2025

Contact Information

Contact Human Resources
Hirelo

Posted

May 9, 2025
UID: 5120625471

AboutJobs.com does not guarantee the validity or accuracy of the job information posted in this database. It is the job seeker's responsibility to independently review all posting companies, contracts and job offers.