DeepL

DeepL - Senior Research Data Engineer

Berlin, Germany · Hybrid · 2mo ago
In Office · Senior · EMEA · Cloud Computing · Artificial Intelligence · Data Engineer · Senior Data Engineer · Kubernetes · AWS · Python · Go · Data Analysis

Responsibilities

• Work on ambitious frontier research projects as part of a team of research scientists and research data engineers.
• Architect, design and build data pipelines that can handle petabytes of multi-modal unstructured data.
• Build a modern data engineering stack grounded in state-of-the-art technology for orchestration and parallel computation, and make extensive use of actively developed open-source solutions.
• From the lowest-level components to the bird's-eye view of a system: find performance bottlenecks, debug issues, and create pipelines with a focus on stability.
• Leverage our large on-prem data centers and AWS cloud infrastructure for blazing-fast data processing.
• Go beyond “Big Data” and ETL, and engineer and operate complex Python data solutions for real-world unstructured data, including text, code, image and audio modalities.
• Collaborate with stakeholders, research scientists, other research data engineers and data tooling and platform teams.
• Raise the standard for excellence and act as owner and champion for the quality and availability of our foundation model training data.
• Ensure mission-critical reliability of data pipeline jobs, and maintain high-quality code.
• Play to your strengths and contribute with creativity, thoroughness, pragmatism, foresight, ingenuity, persistence, and every part of you that elevates the team.

Qualities we look for

• Professional experience in data, platform or software engineering, ideally with a focus on large-scale unstructured data.
• Python: Extensive professional experience in Python software engineering. Ideally, experience in maintaining proprietary or open-source software products.
• Data: Experience with exploratory data analysis, cleaning, validation and quality control beyond business intelligence and analytics scale.
• Pipelines: Experience with building reproducible pipelines for storing and processing petabytes of data.
• Operations: Proficiency in containerization and automated deployment. Ideally, experience with container orchestration with Kubernetes and cloud infrastructure.
• Scaling: Experience with highly scalable, parallel compute workloads (e.g., Dask, Ray, Celery).
• Performance: Experience with writing and optimizing highly performant code.
• Cross-functional Affinity: Ability to work directly with our researchers and engineering stakeholders to translate their needs into data products with the desired user experience and performance.
• Soft Skills: Excellent problem-solving abilities, strong communication skills, and a collaborative mindset.

Ideally, you have domain-specific experience:

• LLM or VLM training data preparation.
• NLP, text classification, reinforcement learning, model-based/GPU workflows.
• Dynamic workflow orchestration frameworks like Argo Workflows, Airflow, Dagster or Flyte.
• Linguistics expertise or speaking multiple languages.
• Experience in a high-performance programming language like C++, Go or Rust.

Tell us what you bring to the table and let us experience what you’re passionate about.

Benefits

• Diverse and internationally distributed team: joining our team means becoming part of a large, global community with people of more than 90 nationalities. We're more than just colleagues; we're a group of professionals with a shared mission to connect diverse cultures. Our global presence is growing–we've doubled in size nearly every year, with our employees based in the UK, Germany, the Netherlands, Poland, the US, and Japan, and we continue to expand our network.
• Open communication, regular feedback: as a language-focused company, we value the importance of clear, honest communication. We value smooth collaboration, direct and actionable feedback, and believe that leading with empathy and growth mindset makes us better together.
• Hybrid work, flexible hours: we offer a hybrid work schedule, with team members coming into the office twice a week. This allows you to engage directly with your team and experience the unique energy of our workspace, while still enjoying the flexibility and comfort of working from home. With flexible working hours and trust in your productivity, we are in sync with your team’s general locations and time zones to foster effective and seamless collaboration.
• Regular in-person team events: we bond over vibrant events that are as unique as our team, from local team and business unit gatherings, to new-joiner onboardings, to company-wide events that bring us all together–literally.
