Artera - Machine Learning Engineer (AI Platform Lead)
Requirements
• 8+ years of industry software engineering experience
• 4+ years of industry experience using ML orchestration frameworks such as Flyte, Ray, Kubeflow, Metaflow, MLflow, Dagster, Argo Workflows, or Prefect
• 4+ years of industry experience using PyTorch, TensorFlow, or JAX in Python
• 3+ years of industry experience building with AWS, Docker, and Kubernetes
• 1+ years of industry experience optimizing large-scale, high data-throughput, distributed machine learning training pipelines
• Experience with multi-node and multi-GPU training
• Experience deploying and maintaining infrastructure for machine learning training and production inference
• Familiarity with TorchScript, ONNX Runtime, DeepSpeed, AWS Neuron, or similar approaches to inference optimization
• Work Authorization Requirement:
• This is a remote role open to candidates who are currently authorized to work in either the United States or Canada without the need for current or future employment-based visa sponsorship. Artera does not sponsor visas or support visa transfers for this position.
• Eligible candidates may include:
• Individuals authorized to work in the United States on a permanent basis (e.g., U.S. citizens, U.S. permanent residents), or
• Individuals authorized to work in Canada (e.g., Canadian citizens or Canadian permanent residents).
• $180,000 - $220,000 a year
• In addition to base salary, equity is a core component of our compensation. We also offer 401k matching, unlimited paid time off (PTO), and more.
• The base salary is competitive and commensurate with experience, qualifications, and other factors to be discussed during the interview process.
Responsibilities
• Develop the long-term vision and roadmap for Artera’s AI platform so the company can continue to scale both inference volume and development workloads.
• Be accountable for Artera’s ML compute infrastructure, including scaling up Artera’s Foundation Model development by building distributed training infrastructure and developer libraries.
• Build and evolve the core libraries used by AI scientists to develop, launch, and monitor AI products.
• Work with model developers to optimize GPU and CPU efficiency and data throughput of large-scale foundation model and downstream model training runs.
• Optimize Artera’s ability to store and serve terabytes of digital pathology data efficiently for use in large-scale training.
• Ensure that Artera’s observability infrastructure provides a clear picture of how to continue to optimize performance across our model landscape.
Benefits
• Equity options available to eligible employees.
• Paid time off (PTO).
• Comprehensive insurance coverage for the employee and their dependents.