Deepgram - ML Ops Infrastructure Engineer

Remote - California, United States · $160k - $220k · 2mo ago

Remote, Intern, NA, Artificial Intelligence, Infrastructure Engineer, ML Engineer, Python, MLOps, Docker, Kubernetes, Triton

Requirements

• 4+ years of experience in MLOps, DevOps, or infrastructure engineering with a focus on ML systems
• Strong proficiency in Python and experience building automation and tooling for ML workflows
• Deep experience with CI/CD systems and building pipelines for software and model delivery
• Hands-on experience with Docker and Kubernetes for containerized workload management
• Practical experience deploying and serving ML models in production environments
• Familiarity with model evaluation, validation, and quality assurance processes
• Understanding of monitoring and observability principles as applied to ML systems
• Strong problem-solving skills and a bias toward automation over manual processes

It would be great if you had

• Experience with model serving frameworks such as NVIDIA Triton Inference Server, TensorRT, or ONNX Runtime
• Background in speech, audio, or real-time media ML systems
• Experience with Infrastructure as Code tools such as Terraform or Pulumi
• Hands-on experience with monitoring and observability stacks (Prometheus, Grafana, Datadog, or similar)
• Familiarity with GPU-accelerated inference optimization and profiling
• Experience with feature stores, data versioning, or ML metadata management
• Knowledge of canary deployment strategies and progressive delivery for ML models

Responsibilities

• Design and build CI/CD pipelines specifically tailored for ML model development, validation, and deployment
• Architect and maintain model deployment pipelines that move models from research environments through staging to production with confidence
• Build A/B testing infrastructure that enables controlled rollouts of new models and measures real-world performance impact
• Implement comprehensive monitoring for model performance in production: accuracy metrics, latency, drift detection, and regression alerts
• Develop automated retraining pipelines that trigger on data changes, performance degradation, or scheduled cadences
• Create and maintain build and test environments that mirror production, giving researchers high-fidelity feedback before deployment
• Establish model versioning, artifact management, and rollback capabilities to ensure safe and reproducible deployments
• Collaborate with research engineers to define and enforce model quality gates before production promotion
• Build observability dashboards that give the team real-time insight into model health across all environments
• Optimize model serving infrastructure for latency, throughput, and cost efficiency

You'll love this role if you

• Are excited by the challenge of operationalizing cutting-edge AI models at production scale
• Believe that great infrastructure is what turns research breakthroughs into customer value
• Enjoy designing systems that are automated, reliable, and self-healing
• Want to work on problems where minutes of latency reduction or percentage points of accuracy matter enormously
• Like collaborating across research and engineering teams to make the whole organization faster
• Are motivated by building the deployment and testing systems that back a platform serving over 200,000 developers

Benefits

Holistic health

• Annual wellness stipend
• Mental health support
• Life, STD, LTD Income Insurance Plans

Work/life blend

• Unlimited PTO
• Generous paid parental leave
• Flexible schedule
• 12 Paid US company holidays
• Quarterly personal productivity stipend
• One-time stipend for home office upgrades
• 401(k) plan with company match
• Tax Savings Programs

Continuous learning

• Learning / Education stipend
• Participation in talks and conferences
• Employee Resource Groups
• AI enablement workshops / sessions

For candidates outside of the US, we use an Employer of Record model in many countries, which means benefits are administered locally and governed by country-specific regulations. Because of this, benefits will differ by region; in some cases international employees receive benefits US employees do not, and vice versa. As we scale, we will continue to evaluate where we can create more alignment, but a 1:1 global benefits structure is not always legally or operationally possible.

Backed by prominent investors including Y Combinator, Madrona, Tiger Global, Wing VC and NVIDIA, Deepgram has raised over $215M in total funding. If you're looking to work on cutting-edge technology and make a significant impact in the AI industry, we'd love to hear from you!
