Railway - Senior Infra Engineer: Baremetal Orchestration
Responsibilities
• Build and maintain our host provisioning stack: PXE boot, Ansible, and burn-in agents that bring new bare metal online quickly and confidently
• Continue to evolve our homegrown orchestration engine to manage clusters, containers, and VMs through a single lens
• Optimize the efficiency of our bin packing algorithm to maximize utilization and performance and minimize costs
• Own the internal tooling that Railway engineers use to interact with our fleet every day
• Build out internal observability and alerting so we catch fleet problems before customers feel them
• Design and maintain the CI pipelines that ship our infrastructure code safely
• Define infrastructure that can be torn down, failed over, and reconstituted from scratch, following the principle of immutable infrastructure, using Terraform and Ansible
• Build Golang/Rust gRPC services from scratch capable of supporting millions of users
• Write Engineering Requirement Documents to take something from idea, to defined tasks, to implementation, to monitoring its success

The arc of this role is more internal-facing than user-facing. You're building the platform that Railway engineers run on. This is a high-impact, high-agency role with a direct effect on company culture, trajectory, and outcome.

• A strong understanding of distributed systems and what it takes to operate them. You enjoy building fault-tolerant, resilient, and scalable services, and you care about what happens when they break at 3am
• Hands-on experience with bare metal provisioning, configuration management, and the unglamorous-but-critical work of getting hardware production-ready
• Comfort building and operating internal tools. You understand that developer experience inside the company matters as much as the product outside it
• A solid intuition about how long your solutions will last. All systems age. In startups, we can hope a solution holds up for 2-3 orders of magnitude of growth, or 12-18 months
• The tact to implement your solution, create monitors for its error boundaries, and document any requirements for when you're not around
• A great sense of direction and prioritization when it comes to dealing with the ambiguity of an early-stage startup
• A sense of grit to dive into a problem, implement a solution, scale that solution, and replace it when needed
• A great set of communication skills for getting your point across, getting your solution implemented, and beyond

We value and love to work with diverse people from all backgrounds.

Things to know
• For better or worse, we're a startup; our team dynamics are different from companies of different sizes and stages.
• We're distributed ALL across the globe, and we're only going to become more distributed. As a result, stuff is ALWAYS happening.
• We do NOT expect you to work all the time, but you'll have to be diligent about your boundaries because the end of your day may overlap with the start of someone else's.
• We're a small team, with high ownership, who are not only passionate about what we do, but seek to be exceptional as well. At the time of writing we're a team of 21, serving hundreds of thousands of users. There's a lot of stuff going on, and a lot of ambiguity.
• We want you to own it. We believe that ownership is a key to growth, and part of that growth is not only being able to make the choices, but owning the success, or failure, that comes with those choices.
Benefits
At Railway, we provide best-in-class benefits: great salary, full health benefits including dependents, strong equity grants, equipment stipend, and much more. For more details, check back on the main careers page.

Beyond compensation, there are a few things that we believe make working at Railway truly unique:
• Autonomy: We have very few meetings. Just one on Monday and one on Friday to go over the Company Board. We think your time is sacred, whether it's at work or outside of work.
• Ownership: We're a company with a high-ownership, high-autonomy culture. We hope that you'll come in, help us, and over the course of many years do the best work of your life. When we bring you onboard, we expect you to change the company.
• Novel problems/solutions: We're a well-funded startup with cool problems, which lets us implement novel solutions! We abhor “busywork” and think that, whether it's community, engineering, operations, etc., there's always opportunity for creative and high-leverage solutions.
• Growth: We want you to grow with us, but we know that talent is loaned, so when you figure out what area you want to grow in next, whether it's at Railway or outside, we'll make sure you land there.

How we hire
No tricks. No surprises. Here's the entire process.