Iambic Therapeutics, Inc - Machine Learning Scientist — Agentic data pipelines
Requirements
• Master's or PhD in a computational STEM field, or equivalent industry experience
• Strong Python engineering skills, including experience building and maintaining production-quality software
• Hands-on experience with LLM APIs (e.g., Claude, GPT) and agentic patterns such as tool use, orchestration, and multi-step reasoning
• Familiarity with biomedical or chemical data sources and formats (e.g., PDB, UniProt, ChEMBL, SDF/MOL, FASTA, or similar)
• Comfort with data engineering fundamentals: ETL design, data validation, and working with structured and unstructured data at scale
Desired:
• Experience with agent orchestration frameworks
• Familiarity with cloud infrastructure and workflow orchestration (e.g., AWS, Docker, Kubernetes)
• Knowledge of multimodal biomedical data, spanning small molecules, proteins, assays, images, omics, and/or clinical records
• Experience with large-scale dataset construction or curation for ML model training
Location
• Remote (US or UK). On-site available in Bristol, UK and Boston, US.
• Iambic is a clinical-stage life-science and technology company developing novel medicines using its AI-driven discovery and development platform. Based in San Diego and founded in 2020, Iambic has assembled a world-class team that unites pioneering AI experts and experienced drug hunters. The Iambic platform has demonstrated delivery of new drug candidates to human clinical trials with unprecedented speed and across multiple target classes and mechanisms of action. Iambic is advancing a pipeline of potential best-in-class and first-in-class clinical assets, both internally and in partnership, to address urgent unmet patient need. Learn more about the Iambic team, platform, pipeline, and partnerships at iambic.ai.
MISSION & CORE VALUES
• Our mission is to deliver better medicines through innovations in AI-based discovery technologies.
The culture and work at Iambic Therapeutics are profoundly strengthened by the diversity of our people and our differences in background, culture, national origin, religion, sexual orientation, and life experiences. We are committed to building an inclusive environment where a diverse group of talented humans work together to discover therapeutics and create technologies.
Responsibilities
• Design, build, and maintain agentic systems for automated data acquisition from public and proprietary biomedical data sources
• Develop LLM-based pipelines for data cleaning, normalization, and formatting across diverse data modalities (e.g., molecular, genomic, clinical, literature)
• Implement automated quality-control workflows that detect anomalies, flag inconsistencies, and enforce data standards
• Evaluate and iterate on agent architectures, prompting strategies, and tool-use patterns to improve reliability and throughput
• Collaborate with ML scientists on the Enchant team to understand data requirements and translate them into scalable acquisition and processing systems
• Monitor and maintain data pipelines in production, diagnosing failures and improving robustness over time
• Document data provenance, processing decisions, and quality metrics to support reproducibility and auditing
Benefits
• We offer our team industry-leading competitive pay, company-paid healthcare, flexible spending accounts, voluntary life insurance, 401(k) matching, and uncapped vacation. We are in a brand-new, state-of-the-art facility in beautiful San Diego with an onsite gym, dining, and easy access to great places to live and play.