Foundation EGI - ML Ops Engineer (Boston, MA)
Requirements
• Architect, build, and operate end-to-end ML pipelines for training, validation, and deployment on Google Cloud and AWS.
• Define, instrument, and maintain logging, monitoring, and alerting for model performance and data drift.
• Automate CI/CD for ML artifacts and infrastructure using GitHub Actions or equivalent.
• Collaborate with cross-functional teams, including frontend engineers, backend engineers, research engineers, and infrastructure engineers.
• Write clean, well-documented, fast, and maintainable code.
• Help ensure our systems have high availability and performance.
• Experience in computer graphics or physics-based simulation.
• Background in setting up Prometheus/Grafana, ELK, or similar monitoring stacks.
• Experience working with custom domain-specific languages (DSLs).
• BS in Computer Science or a related field.
• 5+ years of experience as an AI/ML Ops, DevOps, or Infrastructure Engineer, or in an equivalent role.
• Expert-level Python and TypeScript skills.
• Experience with Docker, Kubernetes, Terraform, Google Cloud, and AWS.
• Deep understanding of machine learning models, including LLMs.
• Experience designing and maintaining CI/CD pipelines to fine-tune or train ML models.
• Excellent written and verbal communication skills.
• Google Cloud, AWS
• Python, TypeScript
• Next.js, React.js
• Docker, Kubernetes, Spinnaker

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are made by humans. If you would like more information about how your data is processed, please contact us.