© 2026 Dominic Morris. All rights reserved.

RL Environments Engineer

Preference Model · Remote - PT (Pacific) · + Equity · 1mo ago
Remote · NA · Artificial Intelligence · Solutions Engineer · Python · Docker · Training Development · CUDA

Requirements

  • Strong Python skills with a focus on engineering quality.
  • Docker experience is preferred but not mandatory.
  • Understanding of LLMs and their limitations is required.
  • Must meet throughput expectations and respond quickly to feedback.
  • Ability in CUDA/Pallas kernel development, or expert knowledge in a specific DL research area, such as architectures (SSMs, KANs), generative modeling (diffusion, flow matching), reasoning methods (neuro-symbolic methods), mechanistic interpretability techniques (circuit analysis, causal discovery), or the foundations of learning theory and control optimization.
  • Experience with ML for science applications, such as physics-informed neural networks or computational neuroscience, is a plus.
  • Familiarity with numerical and simulation methods, such as stochastic time series modeling, fluid dynamics simulation, Bayesian inference, and Monte Carlo methods, is beneficial but not required.
  • Broad research interests and the ability to translate complex papers into RLVR problems are preferred.

Responsibilities

  • Design MLE environments in which LLMs learn better reasoning and advanced concepts from modern ML.
  • Where applicable, optimize non-trivial neural modules through CUDA or Pallas kernel development.
  • Develop and maintain RLVR problems based on current research areas in AI, with a focus on math-heavy topics that do not require massive compute resources. Examples include architectures such as SSMs, KANs, tensor networks, and hypernetworks; generative modeling techniques such as diffusion, flow matching, and probabilistic programming; and reasoning methods such as neuro-symbolic approaches and algorithmic reasoning.
  • Contribute to RL environments in which models work through research and engineering problems and learn iteratively from realistic feedback loops.
  • Maintain a clear understanding of LLMs' current limitations, meet throughput expectations, and respond quickly to feedback. Assigned tasks may involve circuit analysis or causal discovery.

Benefits

  • Remote work: "This is a remote contractor role."
