ML Data Engineer

Recraft · London, Greater London, United Kingdom · 3d ago
In Office · EMEA · Oil & Gas · Data Engineer · Python · Kubernetes · Docker

Requirements

Must-have
• Strong Python fundamentals; you write clean, maintainable, production-ready code.
• Solid hands-on Kubernetes experience (containers, jobs, batch/distributed processing).
• Proven track record with unstructured data, especially images (loading, filtering, transforming at scale).
• Experience developing data-ingestion or parsing tools for publicly accessible sources, including handling real-world reliability and failure cases gracefully.
• Comfort with S3/object storage and moving lots of data efficiently and safely.
• Pragmatic, detail-oriented, ownership mindset; you enjoy making systems reliable and fast.

Nice-to-have
• Familiarity with ML workflows (PyTorch) and downstream training considerations.
• Experience with image quality scoring, captioning, or image-to-text pipelines.
• DAG/workflow visualizations or pipeline UX tooling.
• DevOps fluency: Docker, CI/CD, infra automation.

Responsibilities

• Develop and maintain data-ingestion pipelines to source and prepare large-scale image (and occasional text/HTML) datasets from open, publicly accessible, and permitted sources.
• Own the end-to-end flow: raw data → quality/beauty/relevance filtering → dedup/validation → ready-to-train artifacts.
• Operate and improve our Kubernetes-based data-pipeline framework (distributed jobs, retries, monitoring, automation).
• Work with S3-style object storage: efficient layouts, lifecycle, throughput, and cost awareness.
• Add tooling around pipelines (progress/health visualization, metrics, alerts) for observability and faster iteration.
• Collaborate closely with ML engineers to align datasets with training needs and accelerate experimentation.

Benefits

• We’re able to offer Skilled Worker visa sponsorship in the UK for qualified candidates.
• Real impact on model quality: your pipelines directly power training runs and product improvements.
• Ownership with support: autonomy to design and improve systems, alongside experienced ML peers.
• Modern stack: Python, Kubernetes, S3, internal pipeline framework built for scale.
• Growth: a fast-moving environment where shipping well-engineered systems is the norm.

© 2026 Dominic Morris. All rights reserved.