TRG Screen - Data Architect
Requirements
• 8–10+ years in data / information architecture at a multi-product SaaS company.
• Deep expertise in conceptual / logical / physical data modelling, master data management, and domain-oriented governance.
• Proven track record reconciling semantically divergent source systems into canonical models across multiple products.
• Strong grounding in Lakehouse architecture trade-offs: table formats, query engines, ingestion patterns (Iceberg vs Delta; Dremio vs Trino vs Snowflake; streaming vs batch).
• Dremio, Snowflake, or Databricks.
• AWS data services (S3, Glue, Athena, Lake Formation, or equivalent).
• NATS / Apache Kafka or similar, and CDC patterns (Debezium or equivalent).
• Stream processing (Apache Flink, Spark Structured Streaming, or equivalent).
• SQL and query optimisation at scale.
• Data and table formats (Parquet, Avro, Apache Iceberg or Delta Lake).
• Experience designing SaaS data models at scale: tenant isolation, shared schemas, and access patterns.
• Ability to communicate architecture decisions to engineers, product leadership, and executives.
Strongly preferred
• Market data, financial services, or enterprise software-licensing domain experience.
• Familiarity with modern data catalog and lineage tools (DataHub, OpenMetadata, Atlan, Glue Catalog, or equivalent) and the operational reality of running them at scale.
• Experience designing data products that feed AI and agentic use cases, such as vector stores, RAG, embeddings, and tool surfaces.
• Python for data pipeline scripting and orchestration.
• Exposure to real-time analytics use cases: dashboards, alerting, and anomaly detection.
Responsibilities
Data architecture
• Author the canonical data model across the ten-product estate: logical and physical modelling, entity relationships, SCDs, and tenant isolation patterns.
• Define data contracts with product teams; own the data dictionary, naming standards, and reference-data discipline.
• Lead master data management: entity resolution, golden records, and cross-product joins at scale.
• Define data domains, ownership boundaries, and federated governance suited to a product-aligned organization.
• Design the semantic/query layer (Dremio) for analysts, products, customers, and AI agents; set governance standards covering cataloguing, lineage, quality, access control, and PII.
Lakehouse architecture
• Define the Lakehouse target architecture: storage, table format, query engine, ingestion, metadata, and the standards that govern it.
• Set the patterns for multi-tenant isolation, row- and column-level security, and PII handling.
• Define the streaming and batch ingestion architecture, building on our investments in NATS JetStream, Apache Flink, and Debezium CDC patterns, and direct the engineering team on execution.
• Specify the catalog, lineage, and data-quality foundation, and own the operational practice that keeps them honest.
• Make the build-vs-buy and tool selection calls, including revisiting current technology bets where the data warrants it.
Strategy and partnership
• Partner with the Director of Data Solutions to ensure the architecture delivers against the product roadmap.
• Represent data architecture in design reviews and customer-facing conversations.
• Establish the architecture practice (patterns, review forums, hiring standards) as the team continues to expand.