GRAIL - Staff Site Reliability Engineer (SRE)

Menlo Park, CA - Hybrid · $169k - $224k · 2mo ago
In Office · Staff · NA · Biotechnology · Cloud Computing · Site Reliability Engineer · DevOps Engineer · Go · Bash · Kubernetes · Python · Azure

Requirements

• BS in Computer Science, Engineering, or related field, or equivalent experience
• 8+ years of experience in Site Reliability Engineering, DevOps, or platform engineering
• Strong hands-on experience with at least one major cloud platform (AWS, GCP, or Azure)
• Experience implementing infrastructure-as-code solutions (Terraform, CloudFormation, or similar)
• Experience designing and operating CI/CD pipelines (e.g., GitLab CI, GitHub Actions, Jenkins)
• Hands-on experience with Kubernetes and containerized systems in production environments
• Proficiency in scripting or programming for automation (e.g., Python, Go, Bash, or PowerShell)
• Experience with observability and monitoring tools (e.g., Prometheus, Grafana, OpenTelemetry, Datadog)
• Strong understanding of networking, security, and distributed systems fundamentals
• Experience working in regulated environments and familiarity with frameworks such as ISO 27001, NIST, SOC 2, or HIPAA
• 10+ years of experience in SRE, DevOps, or infrastructure engineering
• Experience operating multi-cluster Kubernetes environments (e.g., EKS, GKE) at scale
• Familiarity with GitOps practices (e.g., ArgoCD, Flux)
• Experience with data platforms and pipelines (e.g., Kafka, Airflow, Spark, Snowflake, BigQuery)
• Experience implementing SLO/SLI frameworks and reliability practices across multiple teams
• Strong background in cloud security, including IAM, zero-trust architecture, and secrets management
• Experience with compliance-as-code and security tooling (e.g., OPA, Snyk, Checkov)
• Exposure to AI/ML or large-scale data infrastructure workloads
• Experience in healthcare, biotech, or other regulated industries
• Relevant cloud or Kubernetes certifications (e.g., AWS DevOps, CKA/CKS, GCP DevOps)

Physical Demands and Working Environment

• Standard office environment with hybrid flexibility
• Participation in on-call rotation and after-hours support for critical systems may be required
• Frequent collaboration with cross-functional and senior stakeholders
• Fast-paced, dynamic environment with emphasis on reliability, scalability, and innovation

Adaptability and Growth Expectation

Responsibilities

What Success Looks Like in Your First Year

• Conduct a comprehensive assessment of the current infrastructure, drive infrastructure-as-code adoption to 95%+ across critical systems, and establish clear health and reliability baselines for the Kubernetes platform
• Standardize observability using modern tooling and implement an SLO/SLI framework adopted across multiple product teams, including defined SLAs for critical data systems
• Strengthen security and compliance posture across cloud environments by implementing consistent baselines, launching a compliance-as-code framework, and reducing mean time to resolution (MTTR) for production incidents
• Define, document, and drive adoption of engineering standards, best practices, and operational guidelines across platform and product teams
• Develop and align stakeholders on a forward-looking platform reliability and infrastructure roadmap
• Demonstrate measurable mentorship and technical leadership impact across the engineering organization
• Evaluate and provide recommendations on emerging infrastructure needs, including support for AI/ML and advanced data workloads

• Participating in cross-functional initiatives and strategic projects
• Adapting to new technologies, tools, and methodologies
• Supporting other teams during periods of high demand

The expected, full-time, annual base pay scale for this position is $169K - $224K. Actual base pay will consider skills, experience, and location.

This role may be eligible for other forms of compensation, including an annual bonus and/or incentives, subject to the terms of the applicable plans and Company discretion. This range reflects a good-faith estimate of the range that the Company reasonably expects to pay for the position upon hire; the actual compensation offered may vary depending on factors such as the candidate’s qualifications. Employees in this role are also eligible for GRAIL’s comprehensive and competitive benefits package, offered in accordance with our applicable plans and policies. This package currently includes flexible time-off or vacation; a 401(k) retirement plan with employer match; medical, dental, and vision coverage; and carefully selected mindfulness programs.


