MLOps Engineer
Requirements
• Bachelor's degree in computer science, machine learning, or a related field
• 1+ year of experience working with large datasets in a production environment or academic setting
• Strong command of Python fundamentals and data wrangling (pandas, scikit-learn, matplotlib)
• Basic experience with batch data pipelines and ETL workflows
• Familiarity with cloud object storage (AWS S3 or equivalent) and structured data organization
• Basic understanding of structured data organization and common associated issues
• Ability to follow structured workflows and deliver reproducible results
• Attention to detail and strong ownership of data quality
• Basic experience working with cloud services
• Master's degree in computer science, machine learning, or a related field
• Exposure to vision or audio data processing techniques
• Experience with data lake technologies or distributed processing systems
• Familiarity with Docker or containerized batch jobs
• Understanding of dataset versioning and development vs. training data separation
• Experience with ML-related data pipelines or training workflows
• Experience with our Python data processing tech stack is ideal but not required:
  • uv for project and dependency management
  • polars for dataframe workflows
  • DynamoDB and PostgreSQL for live data management
  • pydantic and pyright for data typing
Responsibilities
• Work with ML engineers and researchers to ensure delivered datasets are usable and correctly scoped
• Build and run batch data generation and preprocessing jobs for image, video, and audio data
• Execute preprocessing pipelines using shared batch orchestration tools
• Design and run ETL jobs to ingest, transform, and organize data in our warehouse
• Validate input and output datasets (schema, metadata, basic quality checks)
• Collect, organize, and deliver processed datasets using established conventions
• Support creation of development and prototype datasets ahead of large-scale backfills
• Maintain version control of data processing repositories following industry best practices
• Debug data pipeline failures, ETL issues, and data quality problems
Benefits
Reality Defender offers the following benefits to all our employees, regardless of location:
• Healthcare plans with 100% premium coverage for employees and partial coverage available for dependents
• Dental and Vision plans with 100% premium coverage for employees and their dependents
• Short/Long-term disability and life insurance plans with 100% premium coverage for employees
• FSA/HSA and 401k programs
• 20 days of PTO per year
• 12 weeks of Parental Leave
• Learning and Development budget
• Annual company-sponsored offsite
For employees working from Reality Defender’s HQ in NYC, we offer the following benefits:
• Daily in-office lunch through UberEats
• Remote Fridays
• Happy Hours and other local events