Software Engineer, Infrastructure
Responsibilities
• Designing and building distributed systems that support reliability, resiliency, and safe operation at scale
• Designing and operating traffic control mechanisms: circuit breakers, rate limiting, admission control, backpressure, and graceful degradation
• Building and evolving load testing frameworks that validate system behavior under sustained, burst, and peak event traffic patterns
• Building chaos and resilience testing infrastructure to proactively surface failure modes and validate recovery behavior
• Building systems that enable teams to define and implement SLOs, SLIs, and error budgets to guide reliability tradeoffs
• Developing tooling that improves incident detection, response, and automated mitigation
• Reviewing service architectures with a focus on failure modes, scalability limits, and operational safety
• Participating in incident response and driving systemic fixes that reduce repeated failure patterns

This is a highly visible role. The Reliability team provides foundational systems and frameworks that allow Whatnot to scale rapidly while remaining stable and trustworthy for buyers and sellers.