Skills needed: Experience with big data tools such as Hadoop and Spark, plus the ability to write SQL queries with complex joins across multiple tables, is required. Familiarity with ETL processes using Python scripts is preferred but not mandatory. Experience designing scalable, performant databases that handle large volumes of data efficiently is desired (a brief illustrative sketch follows the requirements list below).
Years of experience: 3+ years in big data engineering or data science roles preferred, or equivalent hands-on professional experience.
Education: Bachelor's degree preferred but not mandatory; a Master's in Data Science, Computer Science, Engineering, Statistics, Mathematics, or Information Systems is highly desirable for candidates with less experience. A strong foundation in programming languages such as Python or R is required.
Certifications: None explicitly required. However, a certification in data engineering or data science from a recognized institution can strengthen your application; we encourage you to pursue relevant credentials if they align with your career goals and experience level.
Must-haves: Experience working remotely or on a distributed team is required under the company's remote work policy for this role, which may involve collaborating across time zones within Argentina and, depending on project needs, globally.
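To make the SQL and Python expectations above concrete, here is a minimal sketch of an ETL step that joins three tables and loads an aggregate into a reporting table. It uses Python's standard-library sqlite3 module purely so the example runs anywhere; the tables (orders, customers, products), their columns, and the reporting table are all hypothetical, not drawn from this posting.

    import sqlite3

    # Illustrative in-memory database; in practice this would be a
    # production warehouse connection (e.g., SQL Server or PostgreSQL).
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical source tables with a few sample rows.
    cur.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE products  (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE orders    (id INTEGER PRIMARY KEY,
                            customer_id INTEGER REFERENCES customers(id),
                            product_id  INTEGER REFERENCES products(id),
                            amount REAL);
    INSERT INTO customers VALUES (1, 'AMER'), (2, 'EMEA');
    INSERT INTO products  VALUES (1, 'hardware'), (2, 'software');
    INSERT INTO orders    VALUES (1, 1, 1, 100.0), (2, 1, 2, 250.0),
                                 (3, 2, 2, 75.0);
    """)

    # Extract + transform: a join across all three tables, aggregated
    # by region and category -- the kind of multi-table join this role calls for.
    cur.execute("""
        SELECT c.region, p.category, SUM(o.amount) AS revenue
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
        JOIN products  p ON p.id = o.product_id
        GROUP BY c.region, p.category
    """)
    rows = cur.fetchall()

    # Load: write the aggregate into a reporting table.
    cur.execute("CREATE TABLE revenue_by_region_category "
                "(region TEXT, category TEXT, revenue REAL)")
    cur.executemany("INSERT INTO revenue_by_region_category VALUES (?, ?, ?)",
                    rows)
    conn.commit()

    for row in rows:
        print(row)  # e.g. ('AMER', 'hardware', 100.0)

In production the same pattern would target the warehouse directly, with the join pushed down to the database engine rather than materialized in application code.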
Responsibilities
Architect, design, and build scalable, robust, and efficient data pipelines using Azure Data Factory and other cloud-native tools.
Establish best practices for data ingestion, transformation, storage, and access, optimizing for performance and cost-efficiency.
Design and implement complex data models, schemas, and partitioning strategies for analytical and operational workloads.
Champion modern data modeling strategies such as medallion architecture (an illustrative sketch, which also shows a partitioned write, follows this list).
Define and enforce frameworks for data quality, lineage, and observability.
Enhance data platform infrastructure including CI/CD, monitoring, cost optimization, and failover strategies.
Optimize performance of SQL Server and PostgreSQL databases and cloud services by identifying and resolving bottlenecks (see the planner-output sketch after this list).
Collaborate with product, analytics, and business teams to translate business goals into data architecture and pipeline solutions.
Serve as the technical subject matter expert for data platforms and initiatives.
Identify automation opportunities, streamline operations, and promote data democratization.
Maintain comprehensive technical documentation, architecture diagrams, and operational playbooks.
Ensure architectural consistency and enforce data security across all services and platforms.
Work with architects to design analytical data models, data warehouses, pipelines, and ETL processes.
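Two of the responsibilities above, partition-aware data modeling and medallion architecture, benefit from a concrete illustration. The following is a minimal sketch assuming PySpark with plain Parquet storage; platforms in this stack often use Delta Lake or another lakehouse format on Azure instead, and every path, column, and schema here is hypothetical.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

    # Bronze: ingest raw events as-is, preserving the source payload.
    bronze = (spark.read.json("/data/raw/events/")  # hypothetical landing path
                   .withColumn("ingested_at", F.current_timestamp()))
    bronze.write.mode("append").parquet("/data/bronze/events/")

    # Silver: clean and conform -- drop malformed rows, deduplicate,
    # and derive the columns downstream consumers agree on.
    silver = (spark.read.parquet("/data/bronze/events/")
                   .where(F.col("event_id").isNotNull())
                   .dropDuplicates(["event_id"])
                   .withColumn("event_date", F.to_date("event_ts")))
    silver.write.mode("overwrite").parquet("/data/silver/events/")

    # Gold: business-level aggregate, partitioned by date so analytical
    # queries prune partitions instead of scanning the full table.
    gold = (silver.groupBy("event_date", "event_type")
                  .agg(F.count("*").alias("event_count")))
    (gold.write.mode("overwrite")
         .partitionBy("event_date")  # one partitioning strategy among several
         .parquet("/data/gold/daily_event_counts/"))

Date partitioning suits the append-heavy analytical workloads described here; hash or composite strategies may fit other access patterns.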
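For the bottleneck-identification responsibility, a common first step on PostgreSQL is reading the query planner's output. This is a minimal sketch assuming the psycopg2 driver; the connection string and query are hypothetical, and SQL Server offers analogous tooling through execution plans.

    import psycopg2

    # Hypothetical connection string; supply real credentials in practice.
    conn = psycopg2.connect("dbname=analytics user=report_reader")
    cur = conn.cursor()

    # EXPLAIN (ANALYZE, BUFFERS) executes the query and reports actual
    # timings and buffer usage, exposing bottlenecks such as sequential
    # scans on large tables.
    cur.execute("""
        EXPLAIN (ANALYZE, BUFFERS)
        SELECT c.region, SUM(o.amount)
        FROM orders o JOIN customers c ON c.id = o.customer_id
        GROUP BY c.region
    """)
    for (line,) in cur.fetchall():
        print(line)

    # A "Seq Scan on orders" with high actual time in this output often
    # signals a missing index, e.g. on orders.customer_id.
    cur.close()
    conn.close()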