Abridge - Machine Learning Infrastructure Engineer - Model Inference
Requirements
• Strong experience building and deploying machine learning models in production environments
• Deep understanding of container orchestration and distributed systems architecture
• Expertise in Kubernetes administration, including custom resource definitions, operators, and cluster management
• Experience developing APIs and managing distributed systems for both batch and real-time workloads
• Excellent communication skills, with the ability to interface between research and product engineering
• Expertise with model serving frameworks such as NVIDIA Triton Inference Server, vLLM, and TensorRT-LLM
• Expertise with ML toolchains such as PyTorch, TensorFlow, or distributed training and inference libraries
• Familiarity with GPU cluster management and CUDA optimization
• Knowledge of infrastructure as code (Terraform, Ansible) and GitOps practices
• Experience with container registries, image optimization, and multi-stage builds for ML workloads
• Experience orchestrating ASR or LLM models to build GenAI applications
Responsibilities
• Design, deploy, and maintain scalable Kubernetes clusters for AI model inference and training.
• Develop, optimize, and maintain ML model serving infrastructure, ensuring high performance and low latency.
• Collaborate with ML and product teams to scale backend infrastructure for AI-driven products, focusing on model deployment, throughput optimization, and compute efficiency.
• Optimize compute-heavy workflows and improve GPU utilization for ML workloads.
• Build a robust model API orchestration system.
• Collaborate with leadership to define and implement strategies for scaling infrastructure as the company grows, ensuring long-term efficiency and performance.
Benefits
• Compensation is market-based and reflects the cost of labor across different U.S. geographic locations. We've structured the base pay ranges into tiers for our geographic markets. The specific base pay is based on several factors, including market location, and may vary depending on job-related knowledge, skills, and experience.
• AI Use Policy: Abridge explicitly forbids candidates from using AI tools during live interviews, including teleprompter-style AI tools, chatbots, or coding assistants. While AI is prohibited during the interview process, candidates are encouraged to discuss their experience using AI applications to achieve efficiencies in projects or their work.