Site Reliability Engineer
Requirements
• Strong programming or scripting skills (Go, Python, Bash, or similar) for automation, tooling, and operational tasks.
• Hands-on experience with cloud infrastructure, ideally Google Cloud Platform (GCP).
• Familiarity with containerization and orchestration (Docker, Kubernetes, or equivalent).
• Experience with infrastructure-as-code tools (Terraform, Cloud Deployment Manager, or similar).
• Experience with either FluxCD or ArgoCD for GitOps-based delivery.
• Solid understanding of distributed systems, microservices architecture, and reliability patterns.
• Experience setting up monitoring, logging, alerting, and observability (e.g., Prometheus, Grafana, ELK, distributed tracing).
• Strong troubleshooting skills and ability to respond to incidents under pressure.
• Knowledge of backup and disaster recovery strategies, database management, and secure operations.
• Ownership mindset: proactive, responsible, and committed to system reliability.
• Strong communication skills, with the ability to coordinate across technical and non-technical stakeholders.
• Comfortable working in a fast-paced, early-stage startup environment.
• High integrity, attention to detail, and passion for fintech and programmable banking systems.
• Prior experience in fintech, banking, or other highly regulated industries.
• Familiarity with compliance, security, and data protection best practices.
• Experience with high-availability, high-throughput systems, or financial infrastructure.
• Exposure to blockchain or crypto systems integrated with banking.
• Experience optimizing cloud infrastructure for cost and performance under rapid growth.
Responsibilities
• Monitor, maintain, and improve the reliability, availability, and performance of production systems and services.
• Build and maintain infrastructure as code (IaC), deployment pipelines, and automation to support continuous delivery, scalability, and disaster recovery.
• Respond to incidents, perform root-cause analysis, and drive postmortems to ensure lessons learned are applied.
• Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
• Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
• Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration.
• Document operational runbooks, on-call procedures, and system architecture to support maintenance, knowledge sharing, and compliance.
Benefits
• Work alongside a founding team from Monzo and BigPay, bringing top-tier fintech expertise.
• Tackle real-world reliability challenges in a regulated, fast-growing fintech environment.
• Learn from and collaborate with experienced engineers while developing your SRE career.
• Competitive salary and meaningful equity with room for growth.
• Be part of a well-funded startup shaping the future of programmable banking.