MeridianLink - Data Scientist, AI Data Foundations
Requirements
• 4–7 years of experience in a data science, ML engineering, or applied data role, with a meaningful portion of that time spent building data assets that other people's models or applications consumed.
• Hands-on experience designing and operating vector stores for RAG or semantic search, including embedding generation, chunking, indexing, and retrieval evaluation.
• Experience building or operating a feature store (e.g., Databricks Feature Store, Feast, or a custom internal platform), including offline training and online serving patterns and point-in-time correctness.
• Experience modeling and building graph data structures using Neo4j, TigerGraph, Azure Cosmos DB Gremlin, or similar graph databases, and writing graph queries to answer real questions.
• Strong proficiency in Python (pandas, NumPy, scikit-learn, PySpark) and SQL; comfortable working day-to-day in Databricks notebooks and jobs.
• Practical experience with embedding models and LLM tooling (e.g., Hugging Face transformers, OpenAI / Azure OpenAI APIs, LangChain or similar) in a production or near-production context.
• Demonstrated data discovery skills: profiling messy real-world datasets, surfacing non-obvious patterns, validating findings statistically, and explaining them clearly.
• Solid grounding in classical ML concepts (supervised vs. unsupervised learning, train/test discipline, leakage, evaluation metrics), even though you will not own model training day-to-day.
• Strong written and verbal communication skills; able to write up findings for both technical and business audiences.
• Experience working in a SaaS or FinTech environment, particularly with lending, deposit, credit, fraud, or KYC/AML data.
• Experience with Databricks-native AI/ML tooling: Databricks Vector Search, Databricks Feature Store, MLflow, and Unity Catalog.
• Familiarity with open-source vector databases such as pgvector, Pinecone, Weaviate, Chroma, or FAISS, and a clear point of view on when to use which.
• Experience with Microsoft Azure data and AI services (Azure OpenAI, Azure AI Search, ADLS Gen2).
• Experience evaluating RAG systems end-to-end (recall@k, faithfulness, answer quality, hallucination measurement).
• Exposure to graph algorithms (community detection, link prediction, centrality) applied to real business problems.
• Bachelor's or Master's degree in Computer Science, Statistics, Mathematics, Engineering, or a related quantitative field, or equivalent professional experience.
Our Data & AI Stack
• Lakehouse: Azure Databricks, Delta Lake, Unity Catalog, PySpark, SQL
• AI Data Foundations: Databricks Vector Search, Databricks Feature Store, MLflow
• Vector & Graph (current and exploratory): pgvector, Pinecone, Weaviate, FAISS; Neo4j, TigerGraph, Azure Cosmos DB (Gremlin)
• Cloud: Microsoft Azure (ADLS Gen2, Azure OpenAI, Azure AI Search, Event Hubs)
• AI Models and Agents: Databricks, AWS Bedrock, Azure ML
• Integration & Governance: Informatica Data Management Cloud (IDMC), Unity Catalog
Responsibilities
• Build and maintain vector stores for RAG: Design embedding pipelines, chunking strategies, indexing approaches, and refresh patterns for the vector stores powering retrieval-augmented generation across MeridianLink products.
• Own the feature store: Design, build, and operate feature store assets used for model training and online/offline inference, including feature definitions, freshness SLAs, lineage, point-in-time correctness, and reuse across teams.
• Design graph data structures: Build graph databases that model relationships between applicants, applications, products, lenders, decisions, and outcomes, and make them queryable for both AI use cases and analytical investigations.
• Lead data discovery: Profile our lending, deposit, and behavioral datasets to identify hidden trends, segments, anomalies, and potential model drivers; turn findings into actionable hypotheses for product, risk, and growth teams.
• Engineer for AI consumption: Build the curated, AI-ready datasets that downstream model builders, application engineers, and analysts rely on, with appropriate quality, documentation, and governance baked in.
• Evaluate retrieval and feature quality: Define and run evaluation frameworks for RAG retrieval quality, feature drift, embedding quality, and graph completeness; iterate based on what the metrics tell you.
• Partner with model builders: Work closely with ML engineers and applied scientists to make sure the data structures you build accelerate their work rather than slow it down.
• Champion responsible data use: Partner with governance, security, and compliance to ensure that AI-facing data assets respect data classification, customer consent, and regulatory boundaries from day one.
• Communicate findings: Translate discovery work into clear narratives (write-ups, notebooks, dashboards, and short presentations) that help non-technical stakeholders act on what the data is showing.
Benefits
• Salary range: $114,593 – $195,400
• MeridianLink runs a comprehensive background check, credit check, and drug test as part of our offer process.
• It is not typical for offers to be made at or near the top of the salary range. The actual salary will be determined based on experience and other job-related factors permitted by law, including geographical location.
• MeridianLink offers:
  • Insurance coverage (medical, dental, vision, life, and disability)
  • Flexible paid time off
  • 401(k) plan with company match
• All compensation and benefits are subject to the terms and conditions of the underlying plans or programs, as applicable and as may be amended, terminated, or superseded from time to time.