sully-ai

sully-ai - Applied Research Scientist

Remote - US · 2mo ago
Remote · Junior · NA · Artificial Intelligence · Research Scientist · Applied Scientist · Python · Hugging Face · Technical Writing

Requirements

• Proven experience designing agentic processes and LLM evaluation/benchmarking frameworks.
• Strong Python and ML background (PyTorch/TensorFlow, Hugging Face, LangChain/LlamaIndex).
• Demonstrated ability to design rigorous experiments and translate findings into production.
• Track record of published research or deep applied work in LLMs and agent evaluation.
• Strong communication and technical writing skills to articulate complex findings clearly.

First-Month Focus

• Audit existing evaluation approaches for clinical and agentic tasks.
• Define initial benchmarks and build early automated pipelines.
• Partner with engineering to land the first set of CI gates for accuracy, factuality, and safety.

Key Results (First 90 Days)

• Deliver a repeatable evaluation framework with automated pipelines in production.
• Demonstrate measurable improvements in robustness, hallucination reduction, or safety.
• Publish or present internal research findings that directly shape product reliability.

If you’ve ever said, “I want to do work that actually matters”, this is it. Let’s build something life-changing, together.

• Entrepreneurial to your core: You think in outcomes, thrive in chaos, and take ownership without limits.
• Mission-obsessed: You’re here to save lives, not just ship features; patients and doctors are your why.
• Impact-driven & fast-moving: You sprint toward hard problems and ship with sharp judgment.
• Elite teammate: You raise the bar through high standards, direct feedback, and craft excellence.

Responsibilities

• Build and scale automated evaluation pipelines (LLM-as-judge + human review) with clinical-grade benchmarks.

Benefits

• 🔥 Revolutionizing the antiquated $800B+ Healthcare market
• 🧠 50%+ of us are ex-founders. We hire A-players, not passengers
• ⚡️ Speed matters - we operate with urgency, autonomy, and ownership
• 🧪 You’ll work on real, first-of-their-kind problems at the edge of AI and medicine
• ❤️ Your work helps doctors reclaim their time - and patients get better, faster care
