5+ years of relevant experience, with a degree in Computer Science, Engineering, Mathematics, or a related technical field.
Strong software engineering fundamentals, with proficiency in Python and SQL and solid working knowledge of Git and modern CI/CD workflows.
Hands-on experience with ML experimentation and model tracking tools.
Strong proficiency with model monitoring and observability tooling.
Experience with ML infrastructure and orchestration technologies, such as Docker, Kubernetes, and workflow orchestration frameworks.
Familiarity with model serving and deployment frameworks.
Proven experience deploying and operating machine learning models as production services, with an emphasis on reliability and performance.
Demonstrated ability to build 0-to-1 prototypes and proofs of concept, rapidly standing up ML services and experimentation environments.
Experience designing, building, and optimizing ML pipelines for training, evaluation, and deployment.
Highly adaptable and able to learn quickly in fast-moving environments with evolving technical requirements.
Candidates must be legally authorized to work in the United States and must reside in the United States.
Responsibilities
Own SentiLink’s real-time ML model monitoring domain, leading the design, implementation, and ongoing improvement of monitoring systems and workflows.
Own our ML experimentation, model tracking, and versioning infrastructure, ensuring strong reproducibility and visibility across the model lifecycle.
Drive improvements to the model development process, reducing inefficiencies, improving code quality, closing Data Science tooling gaps, and enabling faster iteration.
Serve as the primary technical owner of key touchpoints and interfaces between Data Science and Engineering/Infrastructure, defining standards and workflows.
Support efforts to optimize how models run in production, including latency, reliability, maintainability, and adherence to operational best practices.
Investigate and diagnose model performance issues on an ad hoc basis, including partner escalations and analysis of model behavior in real-world scenarios.
Evaluate, prototype, and recommend new ML infrastructure, tools, and data capabilities, partnering with Data Science to validate impact and support adoption.