Turing

Turing - Research Engineer

Remote - Brazil · 2mo ago
Remote · Mid · LATAM · Software · Research Engineer · C++ · Java · Go · Rust · Python

Requirements

• 4–5 years of experience building or improving deep learning systems where data quality mattered materially (training, post-training, evals, or agentic systems).
• Strong intuition for the “data ingredients” that drive model improvements: what to collect, what to filter, what to synthesize, and how to measure.
• Ability to communicate clearly with researchers and engineers: turning research objectives into concrete specs, and turning messy outputs into actionable insights.
• Demonstrated ability to be extremely detail-oriented in diagnosing subtle data quality issues and failure modes.
• Solid programming ability with a bias for shipping:
  • Python proficiency required
  • Comfort with SQL/structured data workflows strongly preferred
  • For coding-focused work: proficiency in one or more major languages (e.g., C++, Java, Go, Rust, JS/TS) is a plus
• Comfort designing quality systems:
  • Rubrics, validation scripts, gold sets, sampling strategies
  • Statistical checks and slice-based evaluation
  • Human-in-the-loop review loops grounded in measurable criteria
• Strong pluses:
  • RL or post-training experience (any of: RLHF/RLAIF, verifier training, reward modeling, RL fine-tuning, environment design).
  • Experience with agentic evaluation (tool use, multi-step workflows, long-horizon tasks, trajectory analysis).
  • Multimodal expertise (document understanding, charts, diagrams, OCR, UI/vision grounding; audio/video optional).
  • STEM depth (math/physics/engineering) with an eye for verifiability and rigorous correctness.
  • Modern embodied AI / VLM-driven agent experience (vision-language(-action) models, interaction datasets, embodied evals, long-horizon grounding, tool/sensor/action interfaces).
  • Systems thinking: ability to “simulate” an application’s API/data schema and design tasks that realistically reflect real-world constraints and workflows.

Responsibilities

1) Own data and environment quality from an AI researcher perspective
  • Translate ambiguous research goals into clear data requirements: target skills, failure modes, difficulty calibration, coverage, and success metrics.
  • Define what “good” looks like by creating detailed rubrics, counterexamples, and boundary cases (what to include vs. exclude).
  • Perform deep, detail-oriented audits of produced data: spot subtle errors, reward hacking opportunities, leakage, ambiguity, inconsistent assumptions, and distribution shifts.
  • Drive iterative improvements using evidence: error taxonomies, slice-based quality metrics, and model-behavior-informed refinements.

2) Design and build datasets and RL environments for your capability area(s)
  • Contribute to or lead the design of:
    • Task suites (single-step and long-horizon workflows)
    • Ground-truth signals (verifiers, unit tests, structured checks, reward functions, automatic validators)
    • Environment interfaces (APIs, tool schemas, state abstractions, database schemas, simulator-like dynamics)
  • Depending on your mapped capability area(s), you may focus on:
    • Coding / SWE agents: data reflecting real development work (codebase navigation, bug localization, patching, tests, code reviews, CI-like constraints, refactors, security fixes).
    • Multimodality: tasks that test true multimodal reasoning (chart reading, document QA, UI understanding, diagram-based STEM reasoning, OCR-aware tasks).
    • STEM: tasks with verifiable solutions (symbolic checks, reference solvers, numerical validation, step consistency, unit sanity).
    • Modern embodied AI / VLM-driven agents: interaction data and environments for vision-language(-action) models (long-horizon tasks, instruction following grounded in visual context, robust action selection, safety/constraint adherence, adversarial state coverage).

3) Build robust validation, denoising, and synthetic data systems
  • Implement automated validation and filtering to achieve frontier-grade signal-to-noise:
    • Deduplication, decontamination, leakage checks
    • Consistency checks (format, schema, invariants)
    • Difficulty and diversity controls (coverage, novelty, long-tail)
  • Develop synthetic data generation and augmentation pipelines where appropriate:
    • Programmatic task generators
    • Controlled perturbations to create hard negatives
    • Scenario templating with diversity constraints
    • Simulator-/tool-driven rollouts for trajectory data
  • Create documentation and data cards: dataset intent, known limitations, recommended use, and evaluation linkage.

4) Use evaluations and training runs to prove impact
  • Design and run evals that reflect the customer’s intended usage.
  • Produce analysis that connects data to outcomes:
    • Pre/post comparisons on targeted capability slices
    • Error breakdowns and “why the model failed” narratives
    • Ablations to identify which data attributes drive lift
  • When needed, run in-house fine-tuning or RL-style experiments (or partner with research) to demonstrate that the data/environment improves model behavior in measurable ways.

5) Collaborate effectively with large production teams without being ops-heavy
  • Work with cross-functional teams (engineers, researchers, QAs, domain SMEs, and large-scale data production groups) by providing:
    • Clear specs, examples, and edge cases
    • Fast feedback loops based on audits and quantitative signals
    • Structured review processes focused on quality, not throughput alone
  • You are expected to be highly engaged in reviewing and improving outputs from large annotation/creation efforts, but not primarily responsible for hiring, staffing, or people operations.

Benefits

• Work directly with the world’s leading AI labs and enterprises at the cutting edge of post-training and RL environment design.
• Real impact (path to AGI): your datasets and environments will directly influence the trajectory toward Artificial General Intelligence and, ultimately, Superintelligence.
• Real impact (GDP): the systems you help build and evaluate target high-value workflows across industries, where even incremental improvements translate to significant productivity gains.
• Talent-dense team, where you'll find high autonomy, rapid iteration, and an exceptional learning curve.
• Values:
  • We are client first: We put our clients at the center of everything we do, because their success is the ultimate measure of our value.
  • We work at Start-Up Speed: We move fast, stay agile, and favor action, because momentum is the foundation of perfection.
  • We are AI forward: We help our clients build the future of AI and implement it in our own roles and workflows to amplify productivity.
• Advantages of joining Turing:
  • Amazing work culture (super collaborative and supportive work environment; 5 days a week)
  • Awesome colleagues (surround yourself with top talent from Meta, Google, LinkedIn, etc., as well as people with deep startup experience)
  • Competitive compensation
  • Flexible working hours
