Bounteous - Databricks Solution Architect
Requirements
• 8+ years of data engineering experience, with 4+ years building production workloads on Databricks.
• Deep expertise in Apache Spark (PySpark and Spark SQL), including performance tuning, partitioning strategy, and the Catalyst/Photon execution model.
• Strong hands-on experience with Delta Lake, Unity Catalog, Databricks Workflows, and Delta Live Tables.
• Production experience on at least one major cloud (AWS, Azure, or GCP), including networking, IAM, storage (S3/ADLS/GCS), and compute primitives.
• Proficiency in Python and SQL; comfort with Scala is a plus.
• Experience designing medallion (bronze/silver/gold) architectures and dimensional models for analytics.
• Strong CI/CD and DevOps practice: Git, Terraform, Databricks Asset Bundles or dbx, automated testing of data pipelines.
• Track record of leading technical projects end-to-end and mentoring engineers.
• Excellent written and verbal communication; able to drive alignment with both engineering and business stakeholders.
• $102,000 - $133,000 a year
• Individual pay is determined by many factors, including experience, relevant education or training, and organizational needs. The mid-range to maximum of the salary range is generally reserved for individuals who are highly experienced in the role.
• We invite you to stay connected with us by subscribing to our monthly job openings alert.
Responsibilities
• Promote and enforce awareness of key information security practices, including acceptable use of information assets, malware protection, and password security protocols.
• Identify, assess, and report security risks, focusing on how these risks impact the confidentiality, integrity, and availability of information assets.
• Understand and evaluate how data is stored, processed, or transmitted, ensuring compliance with data privacy and protection standards (GDPR, CCPA, etc.).
• Ensure data protection measures are integrated throughout the information lifecycle to safeguard sensitive information.
• Architect and lead the implementation of an enterprise lakehouse on Databricks (Delta Lake, Unity Catalog, Photon, Workflows) across one or more major clouds (AWS, Azure, or GCP).
• Design scalable batch and streaming data pipelines using PySpark, Spark SQL, Structured Streaming, and Delta Live Tables; establish patterns for ingestion from operational systems, event streams, and third-party APIs.
• Define and enforce platform standards for data modeling (medallion architecture), CI/CD, code quality, testing, observability, and cost optimization.
• Lead the governance strategy using Unity Catalog (fine-grained access control, data lineage, audit, and PII handling) in partnership with security and compliance.
• Optimize Spark workloads for performance and cost: cluster sizing, Photon, autoscaling, file layout, Z-ordering, caching, and query tuning.
• Partner with ML engineers and data scientists to operationalize models using MLflow, feature stores, and model serving on Databricks.
• Own the cloud infrastructure footprint for the platform: networking, IAM, secrets, encryption, and Terraform/IaC for Databricks workspaces and supporting services.
• Mentor a team of data engineers; lead architecture reviews, code reviews, and technical design sessions; raise the bar on engineering practices.
• Engage with stakeholders across analytics, product, and finance to translate business needs into a roadmap for the data platform.