Alpaca - Site Reliability Engineer
Requirements
• 4+ years in SRE, DevOps, Platform/Infrastructure, or backend engineering with significant production operations ownership.
• Hands-on experience operating production services on Kubernetes, and shipping infrastructure as code in a GitOps workflow.
• Solid working knowledge of PostgreSQL in production: query plans, pg_stat_*, indexing and schema trade-offs, and what a safe online migration looks like on a non-trivial table.
• Cloud networking fundamentals (VPCs, routing, L4/L7 load balancing, DNS, TLS) and comfort debugging cross-service connectivity.
• Comfortable with a modern observability stack and proficient with Linux at the operator level.
• Practiced in incident response - calm under pressure, structured debugging, postmortems that drive change.
• At least working proficiency in Go or Python, plus strong written and verbal communication.
• Genuine interest in databases and in growing your PostgreSQL/DBA expertise.

Who You Might Be (Nice-to-Haves):
• Deeper PostgreSQL experience: large clusters at OLTP load, online migrations on big tables, HA/DR ownership, connection pooling at scale, or change-data-capture pipelines.
• Experience with typed SQL access layers in Go (e.g. pgx, gorm, sqlc).
• Production experience with messaging systems at scale (e.g. RabbitMQ, Kafka, Redpanda).
• Security & compliance experience in a regulated environment (SOC 2, secrets management, audit logging).
• Familiarity with trading, brokerage, or other regulated fintech domains.

How We Take Care of You:
• Competitive Salary & Stock Options
• Health Benefits
• New Hire Home-Office Setup: One-time USD $500
• Monthly Stipend: USD $150 per month via a Brex Card

Alpaca is proud to be an equal opportunity workplace dedicated to pursuing and hiring a diverse workforce.
Responsibilities
As a Site Reliability Engineer at Alpaca, you'll help keep our brokerage platform reliable, observable, and operable as we grow - working across our cloud infrastructure, Kubernetes platform, observability stack, messaging layer, and data layer. We're especially interested in candidates with strong PostgreSQL fundamentals who'd like to grow into deeper ownership of our database reliability posture: PostgreSQL sits on the trading-critical path, and we want this person to spend a meaningful share of their time leveling it up while still being a well-rounded SRE the rest of the week.

Things You Get To Do
• Operate production day-to-day - oncall, incident response, postmortems, and the follow-ups that actually close the loop.
• Own reliability practice - define and refine SLIs/SLOs and error budgets, and help product teams live within them.
• Strengthen our observability across metrics, logs, traces, and alerting.
• Ship infrastructure through code in a GitOps workflow - cloud resources and Kubernetes workloads alike.
• Look after PostgreSQL: performance tuning, schema and migration review, online migrations on large tables, HA/DR, and CDC pipelines.
• Mentor engineers on reliability and database fundamentals through code review, design review, and pairing.