Site Reliability Engineer Manager
Want to join the team?
Information
We seek an experienced Site Reliability Engineer with a Bachelor's degree in Computer Science, Engineering, or a related field (Master’s degree preferred). Ideal candidates will have 5+ years of experience in site reliability engineering, cloud operations, or infrastructure management, with at least 2 years in a leadership role managing high-performing teams.
Skills and Expertise: Candidates should demonstrate advanced proficiency in AWS services (EC2, S3, RDS, Lambda, IAM, CloudWatch, CloudFormation) and in scripting with languages such as Python, Bash, or PowerShell. Experience with Docker and Kubernetes and a strong grasp of DevOps principles, CI/CD pipelines, and infrastructure-as-code (IaC) practices are essential. AWS certifications are preferred, alongside skills in Linux server management, problem-solving, and cross-functional collaboration. Familiarity with on-call support systems (e.g., OpsGenie) and fluency in English are also required.
Key Responsibilities: The role includes forming, leading, and mentoring an SRE team to maintain the AWS infrastructure’s reliability, scalability, and performance. This position will define SRE best practices, manage incidents, and drive automation for infrastructure and deployment processes using AWS and DevOps tools. Collaboration with development, operations, and security teams is essential to align with business and compliance needs. Responsibilities also cover setting service-level objectives and indicators, establishing monitoring solutions, leading incident response during outages, and initiating continuous improvements to enhance system resilience and performance. Staying informed about industry trends and AWS innovations is crucial to maintaining competitive advantage.
Company name
Stori
Location
Mexico
Work arrangement
Fully remote
Required office days per week
0
Years of experience required
5