i3D.net - Senior Data Infrastructure Engineer – Build a (self-hosted) Greenfield Data Lake
Requirements
• The data platform is greenfield - you’ll have significant input into the final choices. As a starting point, we expect the stack to be self-hosted and open-source, in line with how i3D.net operates. Tools we’d expect you to evaluate and build on include:
  • Storage: MinIO (S3-compatible object storage), Apache Iceberg or Delta Lake
  • Processing: Apache Spark, Apache Flink
  • Orchestration: Apache Airflow
  • Query: Trino, ClickHouse
  • Data sources you’ll integrate with: MariaDB, Prometheus, OpenSearch, RabbitMQ, internal & external REST APIs
  • Infrastructure: Linux (Debian), Docker, Kubernetes, Ansible
• What success looks like in the first year:
  • You’ve designed and deployed the core Data Lake architecture on our own infrastructure
  • Data from at least the major source systems is flowing into the platform reliably
  • There’s a working transformation layer that makes raw data queryable and usable
  • Other teams can access and explore data without needing your help for every question
  • The platform is documented, monitored, and ready to support analytics use cases as they emerge
• Data engineering depth: 6-8 years of experience building and operating data pipelines, storage layers, and transformation frameworks in production environments.
• Open-source data stack experience: Hands-on with tools like Spark, Airflow, Trino, or similar - ideally self-hosted rather than managed cloud equivalents.
• Greenfield builder: You have experience building data infrastructure from scratch, not just maintaining existing platforms.
• Strong SQL and data modeling: You can design schemas that balance analytical flexibility with performance.
• Infrastructure-aware: Comfortable with Linux, containers, Kubernetes, and operating your own services. You don’t need a cloud console to get things done.
• Pragmatic architect: You make sound technical decisions, document your reasoning, and know when “good enough” beats “perfect”.
• Independent operator: You thrive with autonomy. You’ll be the first data engineer, so you need to drive your own roadmap with input from your manager and stakeholders.
• Collaborative mindset: You work well across teams to understand data sources and make them accessible.
• Remote vs Onsite: This is a hybrid role, so you’ll spend some time working onsite at our Rotterdam office. If you’re already in the Netherlands, great. If not, we’re happy to support your move with our relocation services (a valid EU work permit is required).
Responsibilities
• i3D.net runs infrastructure across more than 60 locations, serving millions of users, but today the data produced by our systems lives in silos. Infrastructure metrics, business transactions, application logs and financial data all exist, but none of them are aggregated, transformed or made available for decision making.
• As our first Senior Data Systems Engineer, you’ll change that. You’ll design and build i3D.net’s Data Lake from scratch - a self-hosted, open-source-first data platform that brings all of this together into a single foundation. This is a greenfield build with real ownership: you’ll make the architectural decisions, lay the groundwork, and create the data infrastructure the company will rely on for years to come.
• You report to the Senior Engineering Manager.
• Design the Data Lake architecture: Define the storage, ingestion, and transformation layers of a self-hosted data platform, selecting the right open-source-first tools for each.
• Build data pipelines: Create reliable pipelines that ingest data from across the company - infrastructure metrics, business systems, application logs, and financial data.
• Model and transform data: Design schemas and transformation layers that make raw data usable, consistent, and queryable.
• Integrate with existing systems: Connect to the company’s current data sources (MariaDB, Prometheus, OpenSearch, internal APIs, and others) without disrupting production workloads.
• Operate what you build: Own the reliability and performance of the data platform, including monitoring, alerting and capacity planning. Work closely with our Live Operations and Engineering teams to ensure it remains sustainable.
• Collaborate across teams: Work with Platform, Infrastructure, Network, and Product teams to understand their data and make it accessible.
• Document and share: Maintain clear documentation of the platform architecture, data catalog, and pipeline designs, so the foundation you build is understandable and extensible.
Benefits
• Build from zero: This is a rare greenfield opportunity to design an entire data platform from the ground up, on real hardware, at global scale.
• Full ownership: You’ll make the architectural calls and see them through to production - no layers of approval or committee decisions.
• Infrastructure, not abstractions: Work with bare metal, your own data centers, and open-source tooling - not cloud dashboards.
• Competitive Perks: Annual bonus, 25 vacation days (excluding national holidays), travel allowance, and a solid pension plan.
• Career Growth: Access education reimbursement, career guidance, and opportunities to upskill.
• Stay Active: Free access to our in-house gym in Rotterdam.
• Free Games: Enjoy lifetime access to Ubisoft’s game library.