GitLab - Intermediate Site Reliability Engineer, Environment Automation
Requirements
• Experience working as an SRE or in a similar role operating production infrastructure, with an interest in automating the lifecycle of many environments or tenants in parallel, even if you have not yet done so at large scale.
• Hands-on experience with Golang (required) and the ability to read, understand, and modify infrastructure tools written in Go.
• Hands-on experience running Kubernetes-based workloads in production, including basic understanding of deployments, rollouts, and debugging common issues like crash loops, failed health checks, and scheduling problems.
• Familiarity with infrastructure automation and configuration management tools such as Terraform and Ansible, including experience working with modules, variables, and managing state safely for multiple environments.
• Solid understanding of Git-based workflows and infrastructure-as-code practices, with the ability to contribute to reusable modules, templates, and pipelines that make automation safer and more consistent.
• Experience working in distributed systems or cloud-based production environments, ideally in SaaS or managed service settings, with comfort participating in incident response and on-call rotations under guidance from more senior team members.
• A proactive mindset focused on automation and documentation: you look for opportunities to remove manual steps, improve runbooks, and turn repetitive tasks into reliable, self-service tools.
• Comfort working asynchronously across distributed teams and a desire to contribute to GitLab's values of collaboration, transparency, and iteration.
We are responsible for building, running, and evolving the entire lifecycle of the GitLab environments that power the GitLab Dedicated platform. You'll be part of our team focused on owning the reliability, scalability, performance, and security of automated single-tenant GitLab instances and their supporting services.
GitLab Dedicated provides fully managed, isolated environments for customers around the world, which means your work directly impacts how organizations of all sizes run their mission-critical software delivery on GitLab. We operate in a fully distributed, asynchronous environment across multiple regions, collaborating on everything from infrastructure automation and environment lifecycle design to incident response and capacity planning. You'll be solving novel challenges at scale, from orchestrating infrastructure-as-code workflows across hundreds of tenants to designing the automation that keeps those environments consistent, secure, and up to date. We continuously seek to reduce complexity and improve efficiency by leveraging cloud vendor managed products and services where appropriate, ensuring GitLab Dedicated remains a best-in-class managed platform for our customers. For more on how we operate, see the relevant GitLab Dedicated and infrastructure handbook pages.
How GitLab will support you
• Benefits to support your health, finances, and well-being
• Flexible Paid Time Off
• Team Member Resource Groups
• Equity Compensation & Employee Stock Purchase Plan
• Growth and Development Fund
Please note that we welcome interest from candidates with varying levels of experience; many successful candidates do not meet every single requirement. Additionally, studies have shown that people from underrepresented groups are less likely to apply to a job unless they meet every single qualification. If you're excited about this role, please apply and allow our recruiters to assess your application.
Country Hiring Guidelines: GitLab hires new team members in countries around the world. All of our roles are remote; however, some roles may carry specific location-based eligibility requirements. Our Talent Acquisition team can help answer any questions about location after starting the recruiting process.
Responsibilities
• Contribute to automating operational tasks across many GitLab environments, from initial provisioning and configuration updates to upgrades and routine maintenance, helping reduce manual work and improve reliability at scale under the guidance of senior team members.
• Help build and refine the observability stack for multi-tenant GitLab environments so we monitor the right signals across Kubernetes, cloud services, and GitLab applications, supporting early issue detection and basic capacity tracking.
• Assist in responding to platform alerts and incidents, collaborating with Environment Automation SREs and engineering teams to troubleshoot production issues across multiple tenants and document findings.
• Support planning and implementation of infrastructure changes, capacity expansions, and new service rollouts for Dedicated and other managed GitLab environments, contributing to efforts that improve resource efficiency and environment isolation.
• Develop and maintain scripts, automation tools, and infrastructure-as-code workflows that manage parts of the GitLab environment lifecycle, enabling more repeatable, self-service operations over time.
• Participate in the on-call rotation for production GitLab environments with appropriate support, helping triage and mitigate incidents across clusters and cloud providers and contributing to post-incident reviews.
• Document operational tasks, runbooks, and lessons learned so they become clear, repeatable processes and can be candidates for future automation, improving shared knowledge and reducing manual toil across the team.