
cosine - ML Systems Engineer - Model Training and Infrastructure (SWE-focused LLMs)

London, England, United Kingdom · £80k - £110k/year + Equity · 3mo ago
In Office · EMEA · Cloud Computing · Artificial Intelligence · ML Engineer · Systems Engineer · Customer Training · Docker · Python · Go · Kubernetes

Requirements

• Strong software engineering or computer science background:
  • You can read, debug, and write non-trivial production code (you’ll mainly be working across Python and Go).
  • Experience with tools like Docker and container management/orchestration platforms, like Kubernetes
  • Experience with at least one major cloud-computing platform like GCP, AWS or Azure
  • You care about code quality, correctness, and maintainability as much as model metrics.
• Knowledge of PyTorch/TensorFlow/JAX:
  • Comfortable implementing custom training loops, losses, and dataloaders.
• Data engineering instincts:
  • Comfortable working with large-scale datasets, object storage, dataset sharding, and filtering.
  • Know that data quality and sampling strategies matter as much as architecture.
• Clear communication and ownership:
  • Can take a vague modelling goal (“make Lumen better at X”) and turn it into a concrete plan of experiments.
  • Comfortable documenting decisions and walking others through tradeoffs.
• You don’t need all of these, but the more you have, the more you’ll hit the ground running:
  • Experience with synthetic data generation pipelines
  • Experience with data tooling like SQL, Apache Iceberg and DuckDB
  • Experience training LLMs in distributed environments
  • Safety, robustness, and reward shaping:
    • Experience with LLM-as-a-judge, reward hacking detection, or robustness evaluation.
  • Open-source contributions or research:
    • Contributions to open-source LLM tooling, RL libraries, etc.

Responsibilities

• Develop and manage synthetic data generation pipelines to curate datasets that will underpin future RL fine-tunes.
• Design, build and deploy containerized services using Docker and platforms like Kubernetes to enable our RL infrastructure.
• Build and iterate on large-scale RL loops where models write code, run tests or tools, and get rewarded (or penalized) accordingly.
• Work hands-on across the stack: custom PyTorch dataloaders, RL objectives, and evaluation on real-world repos and tasks.
• You’ll collaborate closely with infra, product, and research to decide what to train next, how to train it, and how to measure whether it’s actually better for engineers.
• Participate in end-to-end training of models:
  • Supervised fine-tuning on curated code and conversation datasets.
  • RL on top of those models to align them with software-engineering objectives.
• Architect synthetic data generation pipelines for RL and deploy using containerization technologies.
• Ideate on novel and opinionated reward functions for the training of SWE agents.
• Improve evaluation for SWE models:
  • Help maintain/extend an evaluation suite for code models (unit tests, benchmark suites, repo-level tasks).
  • Analyze failure modes and feed them back into data and training plans.

Benefits

• £80K – £110K + Equity
• We are open to providing visa sponsorship where appropriate.
• Cosine may use Artificial Intelligence with this application.


Similar roles

Voodoo - Senior ML Engineer · 2mo ago
Remote - Paris, Île-de-France, France
Remote · EMEA · Senior · Cloud Computing · Artificial Intelligence · ML Engineer · Python · Prometheus · Grafana · Kubernetes · Docker

neko-health - ML Ops Engineer · 1w ago
London - Hybrid
In Office · EMEA · Artificial Intelligence · ML Engineer · Python · Kubernetes · Terraform

SewerAI Corporation - ML Ops Engineer (AI) · 1mo ago
Remote - USA * · $130k - $16k/year + Equity
Remote · NA · Mid · Cloud Computing · Artificial Intelligence · ML Engineer · Python · AWS · Docker · Kubernetes · Terraform

swap - Lead ML Engineer (recommendation systems) · 2mo ago
London, United Kingdom, Hybrid · Equity
In Office · EMEA · Staff · Artificial Intelligence · ML Engineer · Python

hostinger - System Engineer | Web Hosting | Based in Europe · 1mo ago
Remote - Warsaw, Poland
Remote · EMEA · Mid · Cloud Computing · Nonprofit · Systems Engineer · Go · Python · PHP · Linux · Ansible

pragmatike - ML Ops Engineer (EMEA Remote) · 2mo ago
Remote - Albania, Armenia, Bosnia & Herzegovina...
Remote · EMEA · Mid · Cloud Computing · Artificial Intelligence · ML Engineer · Helm · Python · Terraform · Triton · MLOps

improbable - AI/ML Engineer · 1mo ago
London, United Kingdom
In Office · EMEA · Senior · Cloud Computing · Artificial Intelligence · AI Engineer · ML Engineer · Python · FastAPI · Full Stack · Docker · Neo4j

growthprotocol - Forward Deployed Engineer · 1mo ago
London, United Kingdom, Hybrid · Equity
In Office · EMEA · Senior · Cloud Computing · Artificial Intelligence · Solutions Architect · Systems Engineer · Python · GCP · Docker · FastAPI · Elasticsearch

Yuma AI - Senior DevOps / Infrastructure & AI LLM Systems Engineer (Hybrid) · 5mo ago
Barcelona, Catalonia, Spain - Hybrid · Equity
In Office · EMEA · Senior · Cloud Computing · E-commerce · Software · Artificial Intelligence · Systems Engineer · Senior DevOps Engineer · Ruby · Kubernetes · Python · AWS · GCP
