Wizard - Senior Machine Learning Engineer (Inference Platform)
Requirements
• Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field, or equivalent experience.
• 5-8+ years of experience in Software Engineering, ML Engineering, Platform Engineering, or Infrastructure Engineering, with direct ownership of production ML serving systems.
• Hands-on experience deploying and maintaining LLMs and deep learning models in production environments.
• Strong Python skills and software engineering fundamentals with infrastructure depth. Familiarity with ML frameworks (PyTorch, TensorFlow, or similar) is preferred.
• Experience with cloud platforms such as AWS, GCP, or Azure, and familiarity with ML lifecycle tooling, including model registries and experimentation platforms.
• Familiarity with inference optimization at the hardware and systems level: batching strategies, memory management, quantization tradeoffs, and CPU/GPU interaction patterns.
• Demonstrated ability to reason about tradeoffs between latency, cost, throughput, and reliability at both the systems and operational levels.
• Experience in high-growth startup environments and an ability to thrive in a fast-paced, evolving technical landscape.
What Success Looks Like
• Reliable, Scalable ML Systems: Production models run with clear SLAs, minimal downtime, and full observability, with latency, availability, and GPU utilization tracked and enforced. Deployment pipelines handle growth and evolving AI requirements.
• End-to-End Ownership: You own the full ML lifecycle, from packaging and deployment through monitoring and optimization, enabling ML engineers to iterate quickly while maintaining reproducibility, reliability, and security.
• Influence and Impact: You shape the technical roadmap for ML operations, collaborating with ML, Data, and DevOps teams to improve system performance, reduce operational costs, and drive the overall AI strategy forward.
Responsibilities
• Build and improve production ML pipelines, making it easy to move models from experimentation to reliable production use
• Help own and evolve our multi-engine inference platform (LLMs, embeddings, and extraction), improving how different workloads are served and scaled
• Put strong foundations in place for model versioning, rollouts, and rollbacks so systems stay reproducible and safe to iterate on
• Define and monitor key system metrics like latency, availability, and GPU utilization, and set clear expectations around performance
• Improve overall system performance, whether that’s reducing latency, increasing throughput, or making better use of GPU resources
• Design systems that are resilient and cost-aware, with thoughtful approaches to autoscaling, failure isolation, and graceful degradation
• Bring solid engineering practices (testing, CI/CD, observability) into ML workflows to help the team move faster without sacrificing reliability
• Partner closely with ML, Data, Product, and DevOps to turn ideas into production-ready systems and help guide technical decisions
Benefits
The expected base salary range for this role is $200,000 – $250,000 USD, and will vary based on skills, experience, role level, and geographic location. Final compensation will be determined by considering these factors alongside overall role scope and responsibilities.
In addition to base salary, Wizard offers:
• Equity in the form of stock options
• Medical, dental, and vision coverage
• Flexible PTO and company holidays
• Fully remote work within the United States
• Periodic company offsites and team gatherings
Wizard is committed to fair, transparent, and competitive compensation practices.