Cloud Infrastructure Engineer (Kineto)
Responsibilities
• Cloud and platform engineering (DevOps):
  • Design, implement, and manage the core infrastructure powering Kineto's platform on Google Cloud Platform (GCP), including networking, security, and identity management.
  • Build and operate resilient, highly available distributed systems using Kubernetes (GKE), Knative, Istio, and related cloud-native technologies.
  • Automate the entire infrastructure life cycle (IaC) using Terraform and Terragrunt, ensuring secure, reproducible, and auditable environments.
  • Implement and maintain CI/CD pipelines (e.g. GitHub Actions and TeamCity) and deployment tools like Flux and Helm for GitOps-driven application delivery.
  • Optimize and manage the multi-tenant data layer on Postgres and Neon, focusing on robust tenant isolation, performance, backups, and safe schema management.
• Operational excellence and reliability:
  • Drive site reliability engineering (SRE) practices, including monitoring, alerting (Prometheus, Grafana), logging (Loki), and incident response.
  • Solve complex operational challenges, such as optimizing scale to zero for cost efficiency, minimizing cold starts, enhancing autoscaling behavior, and managing queue backpressure.
  • Implement platform-wide performance tuning (e.g. container resource limits, distributed locks, caching strategies, and GC configurations).
  • Ensure platform security and compliance by implementing best practices for secrets management, network segmentation, and vulnerability scanning.
• Technical leadership:
  • Own major infrastructure roadmap items, including multi-region deployments, disaster recovery planning, advanced tenancy separation, and ephemeral preview environments.
  • Champion DevOps and SRE principles across the engineering team, mentoring engineers on cloud-native best practices, operational readiness, and debugging complex distributed systems.
  • Collaborate with product and engineering teams to define the long-term vision for the platform's architecture and operational model.
We’d be glad to have you on our team if you:
• Have five or more years of experience building and operating large-scale, commercial cloud-native infrastructure, with a strong focus on DevOps/SRE practices.
• Possess deep, hands-on expertise with GCP (or AWS/Azure) and Kubernetes administration and operations (GKE experience is a strong plus).
• Are proficient with infrastructure-as-code (IaC) tools, particularly Terraform, for managing complex environments.
• Have a solid understanding of Linux internals, networking (CNI and service mesh), security, and distributed system design.
• Are familiar with CI/CD tools, GitOps (e.g. Flux), monitoring stacks (Prometheus/Grafana), and logging systems.
• Thrive in cross-functional teams and excel at communicating complex infrastructure ideas clearly.
We process the data provided in your job application in accordance with the Recruitment Privacy Policy.