tem - Senior Staff MLOps Engineer
Requirements
• Scaled an ML platform from early-stage: Demonstrable experience taking an ML platform from early stages to best-in-class infrastructure at a fast-moving company. You've been there, done it, and you're comfortable with the messiness and ambiguity that comes with scale-up life.
• ML pipeline expertise: Deep experience across the whole MLOps lifecycle with ML pipeline orchestration (Metaflow, Prefect, Airflow, or equivalent) and ML infrastructure (SageMaker, Vertex AI, Chalk, or equivalent).
• Model lifecycle tooling: Hands-on experience building or operating experiment tracking systems (MLflow, W&B, or similar), model registries, and governance tooling for model fleets at scale. Knows what good looks like and what to avoid.
• Broad MLOps tooling knowledge: Knowledge across the ecosystem, including monitoring, drift detection, CI/CD for ML, containerisation, and IaC (Terraform, AWS CDK). Able to evaluate trade-offs and make principled choices for a specific context, not just default to what they know.
• Technical leadership track record: Evidence of setting platform direction, influencing cross-functional teams, and defining standards at Staff+ level. Raises the quality bar through design reviews, code reviews, and mentoring. Knows when to drive strategy and when to get into the weeds.
• Heterogeneous workload experience: Experience designing and operating platforms that serve heterogeneous workloads (e.g. forecasting, classification, operations research) rather than a single model type, across both batch and real-time applications.
• Python, AWS + IaC: Strong Python; hands-on experience with AWS and infrastructure-as-code (Terraform, AWS CDK).
• Worked in a role where ML is at the core of the product
• Familiarity with Metaflow specifically
• Experience with operations research, large-scale optimisation in a production context
• Experience working with business-critical time series forecasting models
• Exposure to reinforcement learning in a production setting
• Exposure to production LLM workloads, e.g. fine-tuning

🗣️ Interview Process:
Our processes normally take around 2-3 weeks from first call to offer. Please let us know about any adjustments to timelines that may be required.
1. First call with our Talent Team (30 mins). This is to understand your experience and motivations, and to discuss the role in more detail.
2. Behaviour Interview with Tim, Head of Data (60 mins). This is your chance to really understand the role and its expectations, and to ensure alignment on ways of working.
3. Technical Interview with the Team (90 mins). You'll meet potential peers in this session and work through a live technical exercise.
4. Culture-Add Interview with Stakeholders (45 mins). The final session is with two cross-functional stakeholders and explores how your values align with ours. It is designed to be a genuine two-way conversation and your chance to understand what it's really like to work at tem.
Responsibilities
• Own the ML platform strategy: Define the roadmap from Level 1 to Level 2 MLOps maturity (see https://docs.cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning#characteristics), making architectural decisions before they would otherwise become blockers. Keep the platform aligned to Rosso's commercial trajectory.
• Build the foundations: Lead the design and build of experiment tracking, model registry, automated pipeline infrastructure, and production monitoring across all model types.
• Deliver backtesting and shadow deployments: Build the infrastructure the forecasting and pricing teams need to validate models reliably, both against historical data and in production, before they go live.
• Set technical direction: Provide the architectural vision and standards the Senior MLOps Engineer executes against. This is a force-multiplier relationship, not a management one.
• Partner across the team: Work closely with ML engineers and software engineers to understand what the platform needs to unlock the next wave of Rosso capabilities. Translate those needs into principled platform decisions.
• Choose the right tools: Evaluate the MLOps tooling ecosystem with clear eyes. Make choices that fit tem's scale and workload mix, not what's fashionable.
• Drive deployment reliability: Push toward more frequent, reliable model deployment cycles as Rosso moves from batch-heavy workflows toward live, near-real-time processes.
• Define best practices: Establish standards for how models are trained, versioned, deployed, and monitored across the team. Create a platform ML engineers trust.

What success looks like:
• MLOps is no longer a bottleneck; ML engineers are unblocked to focus on model quality
• The time to deploy new machine learning models drops from days to minutes
• The core features required from the machine learning platform, e.g. backtesting and experiment tracking, are delivered before they block progress
Benefits
• £130K
• Offers Equity