Comply - Senior AI Data Engineer

York, England (posted 1mo ago)

In Office, Senior, EMEA, Fintech, Cloud Computing, Data Engineer, Senior Data Engineer, Vector, Neo4j, pgvector, Weaviate, Qdrant

Requirements

• Strong hands-on experience in data engineering, with a focus on semantic or AI data infrastructure
• Experience building and operating knowledge graphs or graph databases (e.g. Jena Fuseki, Neo4j, Amazon Neptune, or equivalent)
• Experience with vector databases and embedding pipelines (e.g. Pinecone, Weaviate, Qdrant, pgvector)
• Practical experience implementing RAG architectures or LLM-integrated data pipelines
• Familiarity with semantic web standards: JSON-LD, RDF, OWL, or SKOS
• Strong Python skills and experience with data pipeline frameworks
• Experience with cloud-native data platforms (AWS, Azure, or GCP)
• Exposure to domain-driven design (DDD) and bounded contexts is desirable
• Experience working directly with ontologists or knowledge engineers is a plus
• Familiarity with data contracts and data product frameworks is a plus
• Experience with DataOps tooling, data reliability, or data observability platforms is desirable
• Background in financial services, RegTech, or compliance data is a plus

To learn more about our values, mission, and the wide range of perks offered to employees at Comply, visit https://www.comply.com/careers/.

Responsibilities

• Semantic Layer Implementation
  • Implement JSON-LD-based semantic models designed by the ontologist into production data systems
  • Build and maintain knowledge graph structures that reflect canonical domain models
  • Develop and manage graph database schemas, queries, and data ingestion pipelines
  • Ensure semantic consistency between ontology definitions and downstream data products
• AI & Vector Infrastructure
  • Design and implement embedding pipelines that represent Comply’s financial and regulatory data in
  • Build and operate vector database infrastructure for semantic search and similarity retrieval
  • Implement RAG (Retrieval-Augmented Generation) architectures that ground LLM outputs in Comply’s proprietary data
  • Evaluate and integrate LLM tooling and frameworks appropriate to Comply’s use cases
• Data Pipeline & Platform Engineering
  • Build reliable, observable data pipelines that feed the semantic layer from upstream broker and regulatory data sources
  • Work with Data Engineers and Backend Engineers to embed semantic models into APIs and data contracts
  • Ensure the semantic layer scales with data volume and platform growth
• Collaboration & Enablement
  • Partner closely with the Ontologist to ensure implemented models faithfully reflect domain intent
  • Support consuming application teams in understanding and adopting AI-ready data products
  • Contribute to resolving cross-domain data integration challenges