Iambic Therapeutics, Inc - Platform Engineer, CloudOps Infrastructure
Requirements
• Infrastructure-as-code (Terraform/OpenTofu & Terragrunt)
• AWS (ECS and related services)
• Automated CI/CD with containerized services
• Python with modern tooling (e.g. uv, pixi)
• Workflow orchestration (e.g. Prefect); GPU-backed compute for ML workloads
• FastAPI with a clean service-layer architecture (business logic isolated from transport)
• Agentic coding tools (e.g. Claude Code) as part of day-to-day development

Iambic is a clinical-stage life-science and technology company developing novel medicines using its AI-driven discovery and development platform. Based in San Diego and founded in 2020, Iambic has assembled a world-class team that unites pioneering AI experts and experienced drug hunters. The Iambic platform has demonstrated delivery of new drug candidates to human clinical trials with unprecedented speed and across multiple target classes and mechanisms of action. Iambic is advancing a pipeline of potential best-in-class and first-in-class clinical assets, both internally and in partnership, to address urgent unmet patient need. Learn more about the Iambic team, platform, pipeline, and partnerships at iambic.ai.

MISSION & CORE VALUES
Our mission is to deliver better medicines through innovations in AI-based discovery technologies. The culture and work at Iambic Therapeutics are profoundly strengthened by the diversity of our people and our differences in background, culture, national origin, religion, sexual orientation, and life experiences. We are committed to building an inclusive environment where a diverse group of talented humans work together to discover therapeutics and create technologies.
Responsibilities
• Build and maintain standardized, reproducible, secured deployment templates.
• Develop and operate the orchestration layer (orchestration workflows, e.g. Prefect) and the GPU-backed compute paths that run training and inference, on a schedule and on demand.
• Own infrastructure-as-code, CI/CD pipelines, and the tooling that makes standing up, updating, and tearing down an environment routine and auditable.
• Build observability - metrics, logging, alerting - that gives a clear picture of system health across environments.
• Run and improve the system day to day (DevOps/CloudOps): drive operational practices that emphasize stability, predictability, and low overhead, and partner with the infosec function on infrastructure security posture.
• Adapt and extend the platform as ML researchers introduce new workflows and models - turning new requirements into supported, repeatable capabilities.
• Evaluate and integrate orchestration and deployment technologies; prototype, then harden the patterns that work.
• Participate in design discussions, code reviews, and documentation.

Qualifications
• Strong Terraform/OpenTofu engineering skills and hands-on AWS (or comparable cloud) experience.
• Production experience with containerization, container orchestration (e.g. ECS), and CI/CD pipelines.
• Infrastructure-as-code and reproducible environments.
• Solid understanding of distributed systems fundamentals.
• Strong operational instincts: observability, debuggability, and maintainability.
• Experience operating multi-tenant or large-fleet platforms preferred.
• Experience with workflow orchestration (Prefect, Airflow, Dagster, or similar) and/or GPU compute platforms preferred.
• Familiarity with GPU-backed environments and ML training/inference pipelines preferred.
• Awareness of infrastructure security posture and compliance frameworks (e.g. ISO/IEC 27001 or similar) preferred.
• Strong written communication and the ability to work effectively in a distributed team.