Data Engineer (Contract)
Requirements
• 10+ years of data engineering experience with enterprise-scale systems
• Expertise in Apache Spark and Delta Lake, including ACID transactions, time travel, Z-ordering, and compaction
• Deep knowledge of Databricks (Jobs, Clusters, Workspaces, Delta Live Tables, Unity Catalog)
• Experience building scalable ETL/ELT pipelines using tools like Airflow, Glue, Dataflow, or ADF
• Advanced SQL for data modeling and transformation
• Strong programming skills in Python (or Scala)
• Hands-on experience with data formats such as Parquet, Avro, and JSON
• Familiarity with schema evolution, versioning, and backfilling strategies
• Working knowledge of at least one major cloud platform:
  • AWS (S3, Athena, Redshift, Glue Catalog, Step Functions)
  • GCP (BigQuery, Cloud Storage, Dataflow, Pub/Sub) – nice to have
  • Azure (Synapse, Data Factory, Azure Databricks) – nice to have
• Experience designing data architectures with real-time or streaming data (Kafka, Kinesis)
• Consulting or client-facing experience with strong communication and leadership skills
• Experience with data mesh architectures and domain-driven data design
• Knowledge of metadata management, data cataloging, and lineage tracking tools
• Familiarity with healthcare standards (e.g., HL7, FHIR, DICOM) is a plus
• Awareness of international data privacy regulations and compliant system design
• Master's degree in Computer Science, Data Engineering, or related field
• MLOps experience or experience integrating machine learning models into data pipelines
• Relevant certifications in cloud platforms or data engineering

• This is a contract position, 100% remote within LatAm. Strong verbal and written communication skills in English are required.
• The contract period is 4-6 months, starting in August 2025. Candidates are expected to work 40 hours per week during this period and to be available during normal business hours as needed on this project.
• A contract extension is possible, pending our client partnership and individual performance. Interest in a future long-term contract position would be considered an asset.

Able's Values
• Put People First: We're caring, open, and encouraging. We respect the richness that we each bring into our work.
• Imagine Better: We are optimistic in our outlook, as well as creative and proactive in delivering the highest quality.
• Expect Excellence: We commit to each other to always strive to be our best.
• Simplify to Solve: We create better outcomes by reducing complexity.
• We are all Builders: We are motivated and empowered to help build Able and our partners' businesses.
• One Able. Many Voices: Our unity is our strength. Our diversity is our energy.

Let’s build together.
Responsibilities
Strategic Architecture Leadership
• Shape large-scale data architecture vision and roadmap across client engagements
• Establish governance, security frameworks, and regulatory compliance standards
• Lead strategy around platform selection, integration, and scaling
• Guide organizations in adopting data lakehouse and federated data models

Client/Partner Value Creation
• Lead technical discovery sessions to understand client needs
• Translate complex architectures into clear, actionable value for stakeholders
• Build trusted advisor relationships and guide strategic decisions
• Align architecture recommendations with business growth and goals

Technical Architecture & Implementation
• Design and implement modern data lakehouse architectures with Delta Lake and Databricks
• Build and manage ETL/ELT pipelines at scale using Spark (PySpark preferred)
• Leverage Delta Live Tables, Unity Catalog, and schema evolution features
• Optimize storage and queries on cloud object storage (e.g., AWS S3, Azure Data Lake)
• Integrate with cloud-native services such as AWS Glue, GCP Dataflow, and Azure Synapse Analytics
• Implement data quality monitoring, lineage tracking, and schema versioning
• Build scalable pipelines with tools like Apache Airflow, Step Functions, and Cloud Composer

Business Impact & Solution Design
• Develop cost-optimized, scalable, and compliant data solutions
• Design POCs and pilots to validate technical approaches
• Translate business requirements into production-ready data systems
• Define and track success metrics for platform and pipeline initiatives