wagey.gg
Neurons Lab

Neurons Lab - Data Engineer

Remote - Latvia, Lithuania, Spain... · 6d ago
Remote · Mid · EMEA · Cloud Computing · Data Analytics · Data Engineer · Data Scientist · AWS · SQL · Python · dbt · Documentation · Reporting · Redshift · Great Expectations · Airflow

Requirements

• Strong SQL and Python for large-scale data processing
• AWS data stack: S3, Glue, Lake Formation, Athena / Redshift, EMR / Spark, Step Functions / Airflow
• Data modeling & semantic layer (dbt or equivalent); dimensional modeling
• Entity resolution / record linkage across heterogeneous sources
• Data-quality & testing frameworks (Great Expectations, dbt tests) and data lineage
• Anonymization / pseudonymization techniques and their analytical trade-offs
• Big-data processing (Spark) with performance and cost optimization at scale
• Clear written / verbal English; documents for handover and works well with a distributed team

Knowledge
• GDPR fundamentals as applied to anonymized / pseudonymized financial data and UK / EU data residency
• AWS Well-Architected (Analytics, Security) for BFSI
• Awareness of credit / risk data structures and what downstream modeling consumers need (a plus)

• 4+ years in data engineering, with strong AWS + Spark / SQL at scale
• Demonstrated experience harmonizing / integrating data across multiple source systems
• Experience building validated, reproducible pipelines in a regulated environment (BFSI, healthcare, government) is a strong plus
• Comfortable stepping into a messy, partly-built data estate and bringing it up to standard
• Comfortable as the sole or lead data engineer on a small (3–4 person) delivery pod
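
To illustrate the analytical trade-off mentioned under anonymization / pseudonymization, here is a minimal Python sketch. It is not taken from the posting: the key, the column names (`customer_id`, `age_band`, `region`) and the sample rows are all hypothetical. It pseudonymizes an identifier with a keyed hash (HMAC-SHA256), so the same customer still maps to the same token and joins keep working, and then measures the smallest group of rows that share the same quasi-identifiers, which is the quantity a k-anonymity check looks at.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key for illustration only; a real key would live in a secrets manager.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed hash (HMAC-SHA256, hex) of an identifier."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def min_group_size(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing one quasi-identifier combination.

    The data is k-anonymous over these columns when this value is >= k.
    """
    counts = Counter(tuple(row[c] for c in quasi_identifiers) for row in rows)
    return min(counts.values()) if counts else 0


# Hypothetical sample rows.
rows = [
    {"customer_id": "C-001", "age_band": "30-39", "region": "North"},
    {"customer_id": "C-002", "age_band": "30-39", "region": "North"},
    {"customer_id": "C-003", "age_band": "40-49", "region": "South"},
]

# Replace the direct identifier with its token; other columns are untouched.
pseudonymized = [{**r, "customer_id": pseudonymize(r["customer_id"])} for r in rows]

# The third row is alone in its (age_band, region) group, so k is 1 here.
k = min_group_size(pseudonymized, ["age_band", "region"])
print(k)
```

Hashing the identifier alone leaves the quasi-identifiers intact, which is why the smallest group can still be a single person. Coarsening those columns raises k but removes detail that downstream analysis might need.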

Responsibilities

• Reproduce a descriptive-statistics report end-to-end so any figure traces back to raw source, closing the gap the client admitted (numbers they can't currently defend).
• Profile and reconcile differing source schemas across acquired entities: map differing field names, types, encodings and business definitions for the same concept into one conformed model.
• Build dbt staging → intermediate → mart models with tests; codify the harmonized definitions the Data Science Lead specifies.
• Write Great Expectations suites (null / range / uniqueness / referential checks) and wire them into the pipeline so bad data fails loudly rather than silently corrupting analysis.
• Implement entity / identity resolution (deterministic + fuzzy matching) where there is no clean shared key for the same customer or account across sources.
• Implement and verify anonymization / pseudonymization (hashing / tokenization / k-anonymity) and evidence that re-identification risk is controlled for the client's IT / compliance team.
• Optimize Spark / Glue jobs over tens of millions of rows: partitioning, file formats (Parquet), incremental loads, cost control.
• Orchestrate with Airflow / Step Functions; build repeatable, scheduled pipelines rather than one-off scripts.
• Prepare clean, documented, feature-ready datasets for the PD / delinquency models.
• Document runbooks so the offshore team can operate the pipelines and handover takes days, not weeks; help scope onboarding of the remaining (Ireland + additional) sources.
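
As a rough illustration of the "deterministic + fuzzy matching" item above, the following Python sketch tries an exact key match first and falls back to a name-similarity score only when no shared key exists. Everything specific here is an assumption made for the example and is not stated in the posting: the field names (`tax_id`, `name`, `postcode`), the 0.85 threshold, and the use of the standard library's `difflib.SequenceMatcher` as the similarity measure.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, drop simple punctuation and collapse whitespace."""
    cleaned = name.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def match(a: dict, b: dict, threshold: float = 0.85):
    """Return 'deterministic', 'fuzzy', or None for a pair of records.

    Field names and the threshold are hypothetical choices for this sketch.
    """
    # Deterministic rule: both records carry the same non-empty key.
    if a.get("tax_id") and a.get("tax_id") == b.get("tax_id"):
        return "deterministic"
    # Fuzzy rule: similar names, guarded by an exact postcode match.
    score = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    if score >= threshold and a["postcode"] == b["postcode"]:
        return "fuzzy"
    return None


# Hypothetical records from two source systems.
left = {"tax_id": None, "name": "Jon A. Smith", "postcode": "D02 X285"}
right = {"tax_id": None, "name": "John A Smith", "postcode": "D02 X285"}
other = {"tax_id": None, "name": "Mary Jones", "postcode": "T12 AB34"}

print(match(left, right))
print(match(left, other))
```

Requiring a second exact attribute alongside the fuzzy score is one way to keep false merges down. A real pipeline would also need blocking to avoid comparing every pair across tens of millions of rows.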
