You don’t just build features—you build products. You can hold a cohesive vision for developer experience, make sharp tradeoffs, and ship workflows that feel obvious in hindsight.
Full stack, Python-first, systems-capable
You’re deeply fluent in Python and comfortable across the stack: backend APIs, storage, distributed systems, and front-end surfaces when needed. You understand containers and runtime environments at a real level—Docker isn’t magic to you.
ML dev tools instinct + high technical aptitude
You have unusually strong technical intuition. You spot leaky abstractions, design APIs that age well, and build tooling that makes power users faster. You care about correctness, performance, and sharp interfaces.
Comfortable with ML fundamentals (RL a big plus)
You don’t need to be an academic, but you should be able to reason about core ML concepts and math. Ideally you’ve trained models, worked with post-training workflows, or understand reinforcement learning well enough to build tools around it.
You care about evaluation as a first-class product surface. Experience with Inspect or other evaluation harnesses/frameworks is a major plus, as is building eval and training pipelines for LLMs, agents, or multimodal systems.
AI-native builder
You’re proficient with modern AI coding tools and agentic workflows (Cursor, Copilot, Claude/ChatGPT-style assistants, eval-driven iteration, etc.). You move fast without sacrificing rigor.
Customer-minded engineer
You can jump into a customer’s environment, understand what’s broken, propose a solution, and deliver it—while also feeding insights back into the roadmap. You can communicate clearly with both engineers and researchers.
Build and iterate on our Python SDK: clean APIs, excellent docs, great errors, sharp defaults, and extensibility.
Create “golden path” workflows for common user goals: creating post-training data, launching reinforcement fine-tuning (RFT) runs, evaluating results, and iterating quickly.
Ship eval-native workflows
Help build eval pipelines for LLMs/agents that connect naturally to post-training loops (capability measurement → data creation → training → re-eval).
Go deep on systems and reliability
Build with Docker, Linux, and cloud infrastructure in mind; ensure consistent environments across local, CI, and production.
Improve performance, observability, and debuggability of job execution and data pipelines.
Contribute to Kubernetes deployment patterns and scaling.
Work directly with customers
Partner with engineers and researchers at startups, enterprises, and foundation labs.
Active participation in the ML community (open-source contributions, writing, research engagement, etc.).
What Success Looks Like
You own product features end to end, working autonomously while collaborating with the team to improve core customer experiences.
The SDK feels intuitive and inevitable: sharp, consistent, well-documented, hard to misuse.
The platform is dependable under real-world load: reproducible runs, fast performance, clear logs, great debugging, smooth onboarding.
Customers feel like HUD gives them leverage—your work directly drives adoption and retention.
Why You’ll Love It Here
High talent density, low ego. You’ll work with unusually strong peers who care about craft.
Real ownership. You’ll shape the core product and its direction at a pivotal stage.
Hard problems with real impact. Post-training, evals, and RL workflows are still early—your work defines best practices.
Move fast, build right. We care about speed and quality, and we invest in doing things correctly.
Locations: San Francisco / Singapore
Type: Full-time
Visa/Relocation: Available for strong candidates (US/Singapore)
Compensation: $150,000-$240,000 salary, meaningful equity, full healthcare, daily team meals.
Responsibilities
Design the Python SDK for ML engineers.
Build APIs and workflows used by ML engineers daily to create data, run post-training, and evaluate systems.
Operate across the stack from backend systems and infrastructure design to frontend surfaces when needed.
Work directly with customers (engineers and researchers at startups, enterprises, and labs) to understand their workflows, unblock them, and ship what they need.
Ensure tooling is reliable, reproducible, and scalable for reinforcement fine-tuning tasks.
Maintain a high bar of quality in product development with an emphasis on ergonomics and intuitive user experience.