joist-ai - Agentic Systems Engineer
Requirements
• 45-minute Python and agentic coding proficiency test. Two problems: one coded by hand, the other using Gen AI.
• 60-minute project deep dive into work you have done: a short presentation followed by a Q&A. The presentation should run 20-25 minutes.
• 45-minute interview on Gen AI / LLM fundamentals.
• 30-minute culture fit interview.
Responsibilities
• Build agents as modular, plug-and-play components that slot cleanly into the wider stack.
• Add memory layers (short-term, long-term, summarization, retrieval-backed) into running systems.
• Wire up tool integrations, MCP servers, and skills.
• Own quality of the features you put out: tests, evals, observability, the works.
• Dig into production traces to understand what the system is actually doing, and close the loop with fixes.
BACKGROUND WE'RE LOOKING FOR
• 2–4 years of writing production software.
• Strong Python skills. You write good Python and can tell good Python from bad, especially now that a lot of code comes out of an LLM. Separation of concerns, clean OOP, idiomatic syntax, well-structured modules, tests that actually test something.
• Solid grounding in core agentic and LLM concepts: RAG, prompting patterns, tool use, structured outputs, streaming, context management, basic generative AI fundamentals.
• You've built something non-trivial with the modern agent toolkit, whether that's a side project, a prototype at work, or a hackathon thing that got out of hand.
• Able to drop into an unfamiliar codebase and find your way around fast.
• A keen eye for detail. You sit with a problem before reaching for a solution. No jumping to the shiny fix because it sounds clever. You understand what's actually broken before you touch anything.
• Data-driven by default. Decisions come from production traces, eval numbers, and logs, not vibes. Comfortable slicing through trace data to find the real signal.
• Hands-on experience with Langfuse or LangSmith (or equivalent tracing/observability for LLM systems).
• Genuine curiosity about the frontier. You read the blog posts, try the frameworks, and have opinions about where agent design is headed.
EXPERIENCE WE'D BE PARTICULARLY EXCITED ABOUT
• Search and retrieval: embeddings, vector databases, hybrid retrieval, rerankers, and the gap between a retrieval system that demos well and one that survives real data.
• LLM evaluations end-to-end: designing evals, choosing what to measure, building the harness, keeping scores honest as models and prompts shift.
• LangGraph depth: building custom graphs, understanding checkpointers, working with context-management nodes (summarizers, windowing, state pruning) inside larger agent graphs.
We conduct a rigorous interview process based on integrity, talent, and drive. We trust our teammates from day one and move quickly to evaluate your fit for the role. The entire interview process typically takes two weeks. The steps are listed under Requirements above.