Socure - Senior Data Scientist - International eKYC, Identity Graph
Requirements
• Master’s or Ph.D. in Computer Science, Data Science, Machine Learning, Statistics, Mathematics, or a related field, or equivalent practical experience.
• 6+ years of hands-on applied ML / data science experience (4+ with Ph.D.), including owning production models and pipelines in high‑stakes domains (fraud, risk, identity, payments, credit, or similar).
• Significant prior work on international or multi‑region products is strongly preferred (e.g., cross‑country KYC, credit risk, payments, or compliance systems).
• Expert‑level proficiency in Python and SQL, with extensive experience in distributed data processing (Spark/PySpark, Databricks or similar) on very large datasets.
• Deep experience designing, training, and deploying models for classification, ranking, anomaly detection, and/or graph learning, including:
  • Feature engineering for noisy/heterogeneous identity data.
  • Robust evaluation under label sparsity and feedback delays.
  • Calibration and thresholding tailored to regional risk and regulatory constraints.
• Proven expertise with graph technologies (e.g., Neo4j, AWS Neptune, GraphFrames, DGL, PyTorch Geometric) and graph algorithms (entity resolution, link prediction, community detection, label propagation) at scale.
• Please note that sponsorship is not available at this time and that you must be located within 45 miles of a talent hub to be considered.
Responsibilities
• International eKYC Modeling & Entity Resolution
  • Lead the design, development, and deployment of ML and graph-based algorithms for international entity resolution, identity trust scoring, and anomaly detection across heterogeneous, country‑specific datasets.
  • Architect reusable matching and linking frameworks that work across multiple ID schemes (e.g., national ID numbers, passports, voter IDs, mobile accounts, bank accounts) and local name/address conventions.
  • Develop probabilistic and rule‑augmented models that handle noisy, sparse, or partially labeled international data while maintaining explainability and regulatory defensibility.
• Global Identity Graph & Data Quality
  • Define and evolve the international extension of Socure’s identity graph: schema design, linkage strategies, quality tiers, and confidence scoring that can be leveraged by multiple products (Verify, KYC, watchlists, fraud).
  • Design and implement robust data quality and monitoring frameworks for international identity data (coverage, stability, drift, regional bias, label quality) and integrate them into modeling and production monitoring workflows.
  • Build scalable approaches for handling linguistic and cultural variation (e.g., transliteration, multi‑script names, address normalization, local naming patterns) in the identity graph and matching pipelines.
• Evaluation, Experimentation, and Model Governance
  • Own experimentation strategy for major international eKYC initiatives:
    • Design offline evaluations and online A/B tests that reflect local ground truth constraints and data sparsity.
    • Define success metrics that balance approval rates, fraud capture, and regulatory/operational constraints per market.
    • Analyze lift, stability, and fairness trade‑offs and drive go/no‑go decisions with Product and Engineering.
  • Define and maintain evaluation frameworks specific to international eKYC (e.g., regional coverage maps, cross‑border identity leakage, local demographic impact, regulatory thresholds).
  • Contribute to model governance documentation and support responses to regulators and large enterprise customers regarding model logic, data provenance, fairness, and monitoring for international markets.
• Data Source Strategy & Vendor Evaluation (International)
  • Lead the evaluation and integration of international data vendors (e.g., bureaus, telcos, public records, alternative data):
    • Design benchmarking methodologies for signal quality, incremental value, stability, and fairness by country/segment.
    • Quantify ROI and trade‑offs across multiple vendors and data types; provide clear recommendations that influence product and commercial decisions.
  • Partner with Data Acquisition, Legal, and Compliance to ensure that data usage and modeling approaches meet regional regulatory requirements (e.g., GDPR and local privacy/AML/KYC rules).
• Technical Leadership & Cross‑Functional Partnership
  • Collaborate with engineering leaders to design scalable, reliable international data and model pipelines using Spark/PySpark, AWS (EMR, S3, SageMaker, Neptune), and modern MLOps workflows.
  • Act as a subject‑matter expert on international identity, eKYC regulations, and cross‑border data limitations for internal stakeholders, supporting complex customer questions and strategic roadmap discussions.
  • Mentor Data Scientists and Senior Data Scientists on best practices for international modeling: handling low‑label regimes, domain adaptation, localization of thresholds/logic, and building reusable abstractions instead of one‑off country fixes.
  • Communicate strategy, progress, and results to senior leadership and cross‑functional partners through clear documents and presentations, framing complex technical work in terms of business impact, regional risk, and regulatory trade‑offs.
Benefits
• DS2: $140K – $170K
• Offers Equity
• Offers Bonus
• This is a base salary range for this job based on the job requirements.
• Base pay is only one component of Socure's compensation and our total rewards package includes equity, benefits, and an annual bonus or a commission plan.
• Please note: we have set up limits for applications for this role.
  • Candidates may not apply more than 3 times in any 30 day span for any job at Socure.
  • Candidates may not re-apply to the same role within 30 days.
• You may have to go into office 2-3 times a week.
• San Francisco, CA
• New York City, NY