Staff Software Engineer, ML Platform

Cake · Remote - ET (Eastern) · + Equity · 1mo ago
Remote · Staff · NA · Cloud Computing · Artificial Intelligence · Software Engineer · Staff Engineer · Kubernetes · AWS · GCP · Azure · MLOps

Requirements

Core Experience
• 10+ years of engineering experience, with significant time spent on infrastructure, platform, or distributed systems.
• Deep hands-on experience with Kubernetes in production environments.
• Strong cloud experience across AWS, GCP, and/or Azure.
• Proven track record of building and operating secure, scalable MLOps platforms.

Technical Strength
• Deep understanding of infrastructure-as-code (e.g., Terraform, Pulumi, CDK).
• Strong programming skills in at least one backend language (Go preferred; TypeScript also welcome).
• Experience diagnosing and debugging complex production issues.
• Familiarity with modern CI/CD, test-driven development, and DevSecOps practices.
• Bonus: experience building Kubernetes operators and/or working with service meshes (e.g., Istio).

Ownership & Communication
• Comfortable owning large, ambiguous problems from inception to production.
• Excellent communicator, able to clearly explain complex systems to both technical and non-technical audiences.
• Experience working directly with customers and incorporating feedback into technical decisions.
• Ability to operate autonomously while keeping stakeholders informed and aligned.

Mindset
• Customer-first and product-oriented.
• Curious, adaptable, and eager to learn new systems and domains.
• Collaborative, respectful, and willing to lean into hard conversations.
• Energized by fast-paced environments and meaningful responsibility.

Responsibilities

As a Staff Software Engineer, you will play a critical leadership role in building and operating the infrastructure that powers Cake’s AI platform. This is a high-ownership role for an engineer who thrives at the intersection of distributed systems, cloud infrastructure, and developer experience.

You’ll design and operate the ML platform foundations that both internal teams and customers rely on, owning systems end-to-end from architecture to production. You’ll work closely with customers to translate real-world ML use cases into reliable, scalable platform capabilities.

This role is ideal for someone who wants to be a technical owner, not just an implementer: someone who cares deeply about system quality, operational excellence, and clear communication.

Build Enterprise-Scale Infrastructure
• Leverage infrastructure-as-code to manage complex cloud environments supporting critical ML and AI initiatives.
• Design Kubernetes-native systems, including controllers/operators where appropriate.
• Improve platform networking, security, and observability.

Sustain Platform Health and Performance
• Own critical systems in production, including reliability, scalability, security, and cost efficiency.
• Identify and proactively address technical debt, operational risk, and platform bottlenecks.
• “Learn by doing”: quickly ramp up on a complex tech stack (Terraform, Kubernetes, Istio, Crossplane, Go, TypeScript).

Enable Teams and Customers to Move Faster
• Create abstractions and tooling that make it easier for teams and customers to deploy, run, and scale AI/ML workloads.
• Collaborate directly with customers to understand their ML infrastructure challenges and translate them into platform improvements.
• Balance speed and rigor, shipping quickly while maintaining a high bar for quality and safety.

Lead Through Influence
• Act as a technical leader and mentor across the engineering organization.
• Write clear documentation and design proposals that align stakeholders and drive decisions.
• Partner closely with product and leadership to shape platform direction and priorities.

Benefits

• High impact, high ownership: You’ll own foundational systems that directly power customer success.
• Small, senior team: Your work won’t get lost; you’ll shape the platform and engineering culture.
• Real customers, real problems: You’ll build systems used in production by growing companies.
• Autonomy and trust: We hire experienced engineers and give them room to operate.
• Competitive cash compensation alongside above-market equity upside
• Top-tier fully covered medical, dental, and vision insurance
• Monthly half day
• Citi Bike membership
• Monthly wellness stipend
• Office equipment stipend, including reimbursement for approved disability-related accommodations
• Investment in employee learning and growth opportunities

Cake is committed to providing equal employment opportunities to all employees and job seekers regardless of race, color, religion, national origin, sexual orientation, gender, gender identity, marital status, disability, veteran status, or any other legally protected category. As an equal opportunities employer, we value diversity and its positive impact on our culture.

Cake also complies with the Americans with Disabilities Act (ADA). We are dedicated to working with and providing reasonable accommodation to job applicants with physical or mental disabilities. If you require accommodation, please email us at [email protected] and we will promptly address your request.
