Spotify

Spotify - Machine Learning Engineering Manager - LLM Serving & Infrastructure

New York, NY / Boston, MA / United States of America (Home Mix) · $176k - $252k · 1mo ago
Remote · Staff · NA · Artificial Intelligence · Data Analytics · Engineering Manager · Machine Learning Engineer · MLOps · MLflow

Requirements

• 5+ years of experience in software or machine learning engineering, including 2+ years managing an engineering team.
• Hands-on ML engineering: deep expertise in building, scaling, and governing high-quality ML systems and datasets, including defining data schemas, handling data lineage, and implementing data validation pipelines (e.g., the Hugging Face datasets library or similar internal systems).
• Deep technical background in building and operating large-scale, high-velocity machine learning/MLOps infrastructure, ideally for personalization, recommendation, or Large Language Models (LLMs).
• Proven track record of driving complex projects involving multiple partners and federated contribution models ("one source of truth, many contributors").
• Expertise in designing robust, loosely coupled systems with clean APIs and clear separation of concerns (e.g., distinguishing between fast dev-time tools and rigorous production-like systems).
• Experience integrating evaluation and testing into continuous integration/continuous deployment (CI/CD) pipelines to enable rapid "fork-evaluate-merge" developer workflows.
• Solid understanding of experiment tracking and results-visualization platforms (e.g., MLflow, custom UIs).
• A pragmatic leader who can balance the need for speed with progressive rigor and production fidelity.
• This role is based in New York or Boston. We offer the flexibility to work where you work best: there will be some in-person meetings, but the role still allows working from home.
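The "fork-evaluate-merge" workflow mentioned in the requirements typically gates a candidate model on offline evaluation before merge. A minimal illustrative sketch of such a CI gate follows; the function name, metric names, and tolerance are hypothetical, not taken from the listing or from Spotify's systems.

```python
# Hypothetical CI evaluation gate: a candidate model's offline eval
# metrics must stay within a small regression tolerance of the current
# baseline before its branch can be merged.

def passes_eval_gate(candidate: dict, baseline: dict,
                     max_regression: float = 0.01) -> bool:
    """Return True if every baseline metric is matched by the candidate
    within max_regression (absolute); higher values are assumed better."""
    return all(candidate[m] >= baseline[m] - max_regression
               for m in baseline)

baseline = {"recall@10": 0.72, "ndcg@10": 0.550}
candidate = {"recall@10": 0.73, "ndcg@10": 0.548}
print(passes_eval_gate(candidate, baseline))  # True: within tolerance
```

In practice a gate like this would run as a CI step against a fixed evaluation set, failing the pipeline (and blocking the merge) when any metric regresses beyond tolerance.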

Responsibilities

• Lead a high-performing engineering team to design, build, and deploy a high-scale, low-latency LLM serving infrastructure.
• Drive the implementation of a unified serving layer supporting multiple LLM models and inference types (batch, offline evaluation flows, and real-time/streaming).
• Lead all aspects of developing the Model Registry for deploying, versioning, and running LLMs across production environments.
• Ensure successful integration with the core Personalization and Recommendation systems to deliver LLM-powered features.
• Define and champion standardized technical interfaces and protocols for efficient model deployment and scaling.
• Establish and monitor the serving infrastructure's performance, cost, and reliability, including load balancing, autoscaling, and failure recovery.
• Collaborate closely with data science, machine learning research, and feature teams (Autoplay, Home, Search, etc.) to drive active adoption of the serving infrastructure.
• Scale the serving architecture to handle hundreds of millions of users and high-volume inference requests for internal domain-specific LLMs.
• Drive latency and cost optimization: partner with SRE and ML teams to apply techniques such as quantization, pruning, and efficient batching to minimize serving latency and cloud compute costs.
• Develop observability and monitoring: build dashboards and alerting for service health, tracing, A/B test traffic, and latency trends to ensure adherence to defined SLAs.
• Contribute to core LPM serving: own the technical strategy for deploying and maintaining the core Large Personalization Model (LPM).
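The responsibilities above mention efficient batching as a latency/cost lever. One common pattern is request micro-batching: collect incoming inference requests until either the batch is full or a short deadline expires, then run them together. This is an illustrative stdlib-only sketch, not the serving layer described in the listing; the queue shape, batch size, and wait window are assumptions.

```python
# Hypothetical micro-batching collector for an inference server: drain
# up to max_batch requests from a shared queue, waiting at most
# max_wait_s so latency stays bounded even under light traffic.
import queue
import time

def collect_batch(q: "queue.Queue", max_batch: int = 8,
                  max_wait_s: float = 0.01) -> list:
    """Return up to max_batch items, waiting at most max_wait_s total."""
    deadline = time.monotonic() + max_wait_s
    batch = []
    while len(batch) < max_batch:
        timeout = deadline - time.monotonic()
        if timeout <= 0:
            break  # deadline passed; ship a partial batch
        try:
            batch.append(q.get(timeout=timeout))
        except queue.Empty:
            break  # no more requests arrived in time
    return batch

q = queue.Queue()
for i in range(3):
    q.put(f"req-{i}")
print(collect_batch(q))  # ['req-0', 'req-1', 'req-2']
```

The deadline caps added queuing latency at `max_wait_s`, while fuller batches amortize per-call GPU overhead; production systems (e.g., continuous batching in LLM servers) refine this idea further.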
