luminovo

luminovo - Data Quality Intern (d/f/m)

Remote - 🇪🇺 Europe (remote) · 2d ago
Remote · Intern · EMEA · Manufacturing · Data Science Intern · Rust · SQL · Coaching · Reporting · Python · PostgreSQL · TypeScript · Claude · hypothesis · Data Quality

Requirements

• Discovery instinct and intellectual honesty are what set this role apart, sitting on top of an analytics baseline. Deep technical and domain depth is coachable and AI-assisted. Calibration and judgment are not.
• You can pull a trustworthy number out of messy data with SQL
• You own the question, not just the query, and reframe it toward what actually matters
• You sanity-check your own work and say "not proven yet" when that's the truth
• You communicate findings clearly enough for a non-analyst to act on
• You use AI with real verification, and write small Python scripts to fix data safely
• You bring high agency, learn fast, and don't drop threads
• Bonus: you read code with AI help, and you're curious about the electronics domain

We always try to use the best tool for the job. Don't worry, we don't need you to be familiar with all of these:

• ClickHouse as our data warehouse and PostgreSQL for our production data, the two main places you'll query.
• Python for scripting, data transformation, and safe bulk fixes.
• Rust and TypeScript in our main product codebase, which you'll read (with AI help) to verify behavior.
• External data sources such as SiliconExpert and DigiKey, reached through their APIs.
• AI tooling (such as Claude Code) for code navigation, querying, and scripting.

🤓 WHOM YOU'LL BE WORKING WITH

• You'll report to and take direction from Mike (https://www.linkedin.com/in/mfoetsch/), who owns the data quality mission and sets the questions you'll investigate.
• You'll be working with other Luminerds like Shamir (https://www.linkedin.com/in/shamir-khodzha-a1940514a/), Igor (https://www.linkedin.com/in/rzhin/?locale=en), and Tiko (https://www.linkedin.com/in/tinatini-surmava/).

Responsibilities

• Our data quality mission is product discovery applied to our part and component data. You take a fuzzy quality problem, figure out what it actually means for customers, measure it honestly, and hand a well-scoped, evidence-backed finding to the team that delivers the larger fix.
• The hard part isn't running a query (our AI tooling helps with that). It's reframing "x% of parts have no pin count" into "y% of a customer's costings can't complete because of it," then giving other teams a result they can act on without having to re-check it.
• You'll be a junior version of this discovery loop: sharp, honest, and data-fluent. You'll follow threads the team doesn't have time to chase, turn them into decision-ready findings, and grow into more autonomy across your internship. You work within a clear direction, and you can take a fix all the way into production when it's a data-level change you can script, like manufacturer merges or backfills. You won't need to be a Rust engineer or own large refactors. AI tooling does the heavy lifting on unfamiliar code and scripting. Your judgment and rigor are what matter most.
• This role is an internship with a duration of three to six months.

🎯 YOUR PERFORMANCE OBJECTIVES

• Turn ambiguous data-quality questions into customer-relevant findings by reframing part-level observations into business/customer impact (e.g. tenant-aware "what actually blocks costing"), defining a sensible metric or proxy, and producing a measured, caveated answer to the question set by the product manager.
• Independently size problems and test hypotheses against our data by writing read-only queries over the data warehouse (ClickHouse) and production Postgres, and producing numbers you can defend (knowing when a result is double-counted, misleading, or too good to be true).
• Make the effect of fixes and experiments visible by extending our dashboards and building ad-hoc visualizations that show trends, baselines, and whether an intervention actually moved coverage/correctness.
• Run small experiments to gather evidence by writing scripts (with AI assistance) against external sources such as SiliconExpert and DigiKey, e.g. to check whether a missing-data gap is fetchable, calibrate a finding, or do spot checks on interesting cases.
• Verify assumptions in the product itself by navigating the epibator (Rust/TS) codebase with AI tooling to confirm how data is actually resolved/used, and occasionally adding light instrumentation we find we need, without owning large refactors.
• Apply the fixes you've scoped, safely, by writing AI-assisted scripts that correct production customer data at scale, e.g. automating the research to decide whether two manufacturers are the same record and then executing thousands of merges. Make every change safe by construction: dry-run and validate against samples first, work in reversible/checkpointed batches, and put guardrails in place so we never introduce regressions or corrupt manufacturing/costing data.
• Leave behind durable, trustworthy knowledge by following the mission's loop (brief, investigate, report, distill), citing evidence, dating facts, and writing findings other teams and stakeholders can act on without re-deriving them.
• Be your own harshest critic by reconciling and sanity-checking your own results, clearly separating "what's proven" from "what's still a hypothesis," and flagging loudly when a finding overturns a prior assumption (including your own).
