
luminovo - Data Quality Intern (d/f/m)

Remote - 🇪🇺 Europe (remote) · 2d ago

Remote Intern EMEA Manufacturing Intern Data Science Intern Rust SQL Coaching Reporting Python PostgreSQL TypeScript Claude hypothesis Data Quality

Requirements

• Discovery instinct and intellectual honesty are what set this role apart, sitting on top of an analytics baseline. Deep technical and domain depth is coachable and AI-assisted. Calibration and judgment are not.
• You can pull a trustworthy number out of messy data with SQL
• You own the question, not just the query, and reframe it toward what actually matters
• You sanity-check your own work and say "not proven yet" when that's the truth
• You communicate findings clearly enough for a non-analyst to act on
• You use AI with real verification, and write small Python scripts to fix data safely
• You bring high agency, learn fast, and don't drop threads
• Bonus: you read code with AI help, and you're curious about the electronics domain

We always try to use the best tool for the job. Don't worry, we don't need you to be familiar with all of these:

• ClickHouse as our data warehouse and PostgreSQL for our production data, the two main places you'll query.
• Python for scripting, data transformation, and safe bulk fixes.
• Rust and TypeScript in our main product codebase, which you'll read (with AI help) to verify behavior.
• External data sources such as SiliconExpert and DigiKey, reached through their APIs.
• AI tooling (such as Claude Code) for code navigation, querying, and scripting.

🤓 WHOM YOU’LL BE WORKING WITH

• You'll report to and take direction from Mike (https://www.linkedin.com/in/mfoetsch/), who owns the data quality mission and sets the questions you'll investigate.
• You'll be working with other Luminerds like Shamir (https://www.linkedin.com/in/shamir-khodzha-a1940514a/), Igor (https://www.linkedin.com/in/rzhin/?locale=en), and Tiko (https://www.linkedin.com/in/tinatini-surmava/).

Responsibilities

• Our data quality mission is product discovery applied to our part and component data. You take a fuzzy quality problem, figure out what it actually means for customers, measure it honestly, and hand a well-scoped, evidence-backed finding to the team that delivers the larger fix.
• The hard part isn't running a query (our AI tooling helps with that). It's reframing "x% of parts have no pin count" into "y% of a customer's costings can't complete because of it," then giving other teams a result they can act on without having to re-check it.
• You'll be a junior version of this discovery loop: sharp, honest, and data-fluent. You'll follow threads the team doesn't have time to chase, turn them into decision-ready findings, and grow into more autonomy across your internship. You work within a clear direction, and you can take a fix all the way into production when it's a data-level change you can script, like manufacturer merges or backfills. You won't need to be a Rust engineer or own large refactors. AI tooling does the heavy lifting on unfamiliar code and scripting. Your judgment and rigor are what matter most.
• This role is an internship with a duration of three to six months.

🎯 YOUR PERFORMANCE OBJECTIVES

• Turn ambiguous data-quality questions into customer-relevant findings by reframing part-level observations into business/customer impact (e.g. tenant-aware "what actually blocks costing"), defining a sensible metric or proxy, and producing a measured, caveated answer to the question set by the product manager.
• Independently size problems and test hypotheses against our data by writing read-only queries over the data warehouse (ClickHouse) and production Postgres, and producing numbers you can defend (knowing when a result is double-counted, misleading, or too good to be true).
• Make the effect of fixes and experiments visible by extending our dashboards and building ad-hoc visualizations that show trends, baselines, and whether an intervention actually moved coverage/correctness.
• Run small experiments to gather evidence by writing scripts (with AI assistance) against external sources such as SiliconExpert and DigiKey, e.g. to check whether a missing-data gap is fetchable, calibrate a finding, or do spot checks on interesting cases.
• Verify assumptions in the product itself by navigating the epibator (Rust/TS) codebase with AI tooling to confirm how data is actually resolved/used, and occasionally adding light instrumentation we find we need, without owning large refactors.
• Apply the fixes you've scoped, safely, by writing AI-assisted scripts that correct production customer data at scale, e.g. automating the research to decide whether two manufacturers are the same record and then executing thousands of merges. Make every change safe by construction: dry-run and validate against samples first, work in reversible/checkpointed batches, and put guardrails in place so we never introduce regressions or corrupt manufacturing/costing data.
• Leave behind durable, trustworthy knowledge by following the mission's loop (brief, investigate, report, distill), citing evidence, dating facts, and writing findings other teams and stakeholders can act on without re-deriving them.
• Be your own harshest critic by reconciling and sanity-checking your own results, clearly separating "what's proven" from "what's still a hypothesis," and flagging loudly when a finding overturns a prior assumption (including your own).
