Okta, Inc.

Okta, Inc. - Staff Site Reliability Engineer

Bengaluru, India · 3w ago
In Office · Staff · APAC · Cloud Computing · Software · Site Reliability Engineer · Kubernetes · Timeline Management · AWS · GCP · Helm · Terraform · Python · Go · Google GKE · Linux · Ansible

Requirements

• 8+ years in SRE, DevOps, or Infrastructure Engineering roles.
• 3–5 years of experience with Kubernetes (EKS/GKE) and related ecosystem tools (Helm, Karpenter, etc.) in production.
• 3–5 years of experience with AWS and GCP.
• 3–5 years using Terraform to manage multi-cloud infrastructure.
• 5+ years of coding experience in Python, Go, or similar languages.
• Proven track record leading high-impact projects, specifically migration projects (ECS → EKS/GKE) and enabling microservice architectures.
• Experience implementing SLOs/SLIs, performing root cause analyses, and improving operational resilience.
• Prior work in SaaS or high-scale, cloud-native environments is a strong plus.
• Strong Linux and security fundamentals.
• Bachelor’s degree in Computer Science or equivalent hands-on experience.
• Supporting Your Well-Being
• Driving Social Impact
• Developing Talent and Fostering Connection + Community
• We are intentional about connection. Our global community, spanning over 20 offices worldwide, is united by a drive to innovate. Your journey begins with an immersive, in-person onboarding experience designed to accelerate your impact and connect you to our mission and team from day one.

Responsibilities

• Design, build, and operate highly scalable, reliable, and secure infrastructure powering our production systems across AWS and GCP.
• Lead major reliability and modernization initiatives, including container platform migrations (e.g., ECS to EKS/GKE) and microservice enablement across multi-cloud environments.
• Serve as a technical authority in Kubernetes (EKS and GKE), cloud infrastructure (AWS and GCP), and modern CI/CD practices (GitOps, automation pipelines).
• Partner with development teams to architect and enable microservice-based applications, ensuring production readiness, scalability, and observability.
• Implement and manage infrastructure as code (Terraform, Ansible) to automate provisioning, scaling, and configuration management across multiple cloud providers.
• Drive improvements in observability, performance, and cost efficiency through robust monitoring, logging, and alerting systems that span AWS and GCP.
• Champion SRE best practices: defining SLOs/SLIs, conducting blameless postmortems, and continuously improving incident response.
• Lead complex technical projects from conception to completion, managing timelines and technical dependencies across teams.
• Mentor engineers across teams, fostering a culture of reliability, automation, and continuous learning.
• Collaborate with security and compliance partners to ensure infrastructure adheres to best practices and standards (e.g., IAM Federation, Workload Identity).
• Participate in the on-call rotation, using incidents as learning opportunities to enhance systems and processes.

Benefits

• This work requires a relentless drive to solve complex challenges with real-world stakes.


Similar roles

k-ID - Lead Site Reliability Engineer · 2mo ago
Singapore · Equity
In Office · APAC · Staff · Cloud Computing · Site Reliability Engineer · Go · Python · TypeScript · AWS · Kubernetes

GitLab - Intermediate Site Reliability Engineer, Environment Automation · 3mo ago
Remote - India · Equity
Remote · APAC · Mid · Cloud Computing · Software · Site Reliability Engineer · Go · Kubernetes · Terraform · Git · Ansible

Megaport - Senior Site Reliability Engineer · 1w ago
Remote - Brisbane, Queensland
Remote · APAC · Senior · Cloud Computing · Materials · Site Reliability Engineer · Linux · Kubernetes · AWS · Bash · Go · Python · Terraform · Cassandra

Backblaze External Website - Site Reliability Engineer II · 1mo ago
Remote - Bangalore
Remote · APAC · Mid · Cloud Computing · Software · Site Reliability Engineer · Bash · Go · Python · Linux · Docker

k-ID - Senior Site Reliability Engineer · 2mo ago
Remote - Singapore
Remote · APAC · Senior · Cloud Computing · Site Reliability Engineer · Go · Python · TypeScript · AWS · Kubernetes

Plaud - Senior Site Reliability Engineer · 5mo ago
Singapore · Equity
In Office · APAC · Senior · Cloud Computing · Artificial Intelligence · Site Reliability Engineer · Java · Go · Python · AWS · GCP

ditto - Senior Site Reliability Engineer, APAC · 2mo ago
Remote - APAC · $108k - $169k/year + Equity
Remote · APAC · Senior · Cloud Computing · Software · Site Reliability Engineer · Go · Rust · C++ · Java · Python

New Era Technology - Site Reliability Engineer (SRE) · 3mo ago
India
In Office · APAC · Mid · Payments · Cloud Computing · Site Reliability Engineer · Go · Bash · Java · Python · Linux

Reddit - Staff Site Reliability Engineer - Site Experience · 1mo ago
Remote - UK
Remote · EMEA · Staff · Cloud Computing · Site Reliability Engineer · Go · Python · Performance Management · Linux · Kubernetes
