Cantina

Cantina - Member of Technical Staff, Data & ML Infrastructure for Video Models

Unknown - USA · $200k - $260k + Equity · 3w ago
Remote · Mid · NA · Cloud Computing · Artificial Intelligence · ML Engineer · Data Engineer · Videographer · Training Delivery · Python · AWS · Kubernetes · Quality Control

Requirements

• 3+ years of experience in machine learning, applied ML, data pipelines, or related engineering roles, ideally working on large-scale multimodal, video, or vision-based systems.
• Strong programming skills in Python and solid experience building reliable data processing and preprocessing pipelines for ML workflows.
• Hands-on experience preparing training data for ML models, including parsing, filtering, dataset curation, quality control, and large-scale data handling using tools such as AWS S3 and DynamoDB.
• Familiarity with annotation and labeling workflows, including task design, vendor or crowd-platform orchestration such as MTurk or Prolific, and methods for ensuring label quality.
• Experience working with Kubernetes for orchestrating distributed workloads, including data preprocessing, pipeline execution, and dataset delivery to training clusters.
• Comfort working across cloud and on-demand compute environments such as AWS and RunPod, with the ability to port and optimize pipelines across infrastructure.
• Familiarity with distributed data processing frameworks and experience designing systems that operate reliably at scale across many nodes or workers.
• Working knowledge of PyTorch and the broader deep learning stack, with the ability to read, debug, and optimize research model inference code for use in production preprocessing pipelines.
• Ability to work cross-functionally with research and engineering teams and translate experimental ideas into robust, scalable systems.
• Bachelor's, Master's, or PhD in Computer Science, Machine Learning, Engineering, Mathematics, or a related technical field; experience in generative video, computer vision, or multimodal ML is strongly preferred.
• Bonus: Experience training, evaluating, or fine-tuning smaller ML models used for classification, filtering, ranking, quality assessment, or other supporting tasks in an ML pipeline.

Responsibilities

• Build and maintain data pipelines for large video generation models, including data ingestion, parsing, filtering, preprocessing, and dataset curation at scale, using tools such as AWS S3 and DynamoDB.
• Design and run annotation workflows across platforms such as MTurk (Amazon Mechanical Turk) and Prolific, including task design, quality control, and label validation.
• Train, evaluate, and improve smaller supporting models used for data filtering, quality assessment, preprocessing, or other parts of the ML pipeline.
• Partner closely with research and engineering teams to turn experimental workflows into scalable, repeatable systems that support model training and evaluation.
• Own data quality across the pipeline by identifying bottlenecks, failure modes, and low-quality sources, and continuously improving tooling and processes.
• Build internal tools and automation that make it easier to prepare datasets, launch annotation jobs, monitor outputs, and support model development end to end.
• Drive larger pipeline projects from start to finish, such as new dataset creation efforts or upgrades to labeling and preprocessing infrastructure.
• Work within a Kubernetes-based training infrastructure, ensuring datasets are properly prepared, formatted, and delivered to training clusters.
• Profile and optimize research model inference scripts used in preprocessing steps, ensuring that model-driven filtering and transformation stages run within practical time and cost constraints when applied to large-scale raw data.

Benefits

• The anticipated annual base salary range for this role is $200,000 to $260,000 (€170,000 to €225,000). Compensation is determined by a number of factors, including skills, experience, job scope, location, and competitive market data.
• Competitive salary and generous company equity
• Medical, dental, and vision insurance – 99.99% of premiums covered by Cantina
• 42 days of paid time off, including:
  • 15 company holidays
  • 2 floating holidays
• Generous parental leave & fertility support
• 401(k) retirement savings plan
• Lifestyle spending account – $500/month to use however you’d like
• Complimentary lunch and snacks for in-office employees
• One Medical membership, and more!
