MultiBank Group - AI DevOps / Cloud Engineer (MLOps)
Requirements
• 5 to 10+ years of experience in DevOps, Cloud Engineering, or Site Reliability Engineering
• Proven experience building cloud infrastructure from scratch, not solely maintaining existing environments
• Expert-level AWS skills are required and non-negotiable
• Hands-on experience with Kubernetes in production environments
• Experience with Infrastructure as Code using Terraform or CloudFormation
• Comfortable being the sole infrastructure owner on a team initially, with a strong ownership mindset
• Able to work directly with data scientists and ML engineers and translate their needs into infrastructure
• Experience in high-growth startups, scale-ups, or product companies is preferred
• Background in AI/ML infrastructure or MLOps is a strong advantage
• Azure familiarity is a plus
• Cloud and Infrastructure: AWS (EC2, EKS, S3, RDS, Lambda, SageMaker, IAM, VPC, CloudWatch), Terraform, CloudFormation
• DevOps and CI/CD: GitHub Actions, GitLab CI, Git, GitOps principles, automated testing and deployment frameworks
• Containerisation and Orchestration: Docker, Kubernetes (EKS), Helm charts
• MLOps: MLflow, Weights and Biases, Airflow, Dagster, Prefect, Evidently AI, Arize, AWS SageMaker
• Monitoring and Observability: Datadog, Prometheus, Grafana, CloudWatch, ELK Stack, PagerDuty
• Security: IAM, Secrets Manager, KMS, VPN, zero-trust principles, encryption and access controls
• Data and Integration: Databricks, Spark, PySpark, Delta Lake, Apache Iceberg, S3-based Lakehouse, Metabase, OpenMetadata, Segment, Amplitude, Adjust, Firebase, MoEngage, JourneyFi, Kafka, RabbitMQ
Responsibilities
• Design, build, and own the full cloud infrastructure for the AI and Data function on AWS, including VPCs, IAM, networking, security groups, compute, storage, and cost management. Ensure the foundation is solid before any model goes near production.
• Build and maintain CI/CD pipelines for data engineers, ML engineers, and data scientists using GitHub Actions or GitLab CI, implementing GitOps principles and automated deployment workflows. Every release should be automated, auditable, and repeatable.
• Own general technology operations for the AI team in the absence of a dedicated TechOps function, including environment setup, access management, developer tooling, system monitoring, incident response, and vendor and license management for all tools used by the team.
• Own Docker and Kubernetes across all AI workloads, building scalable, reliable container environments for model training, batch processing, and real-time inference. Manage cluster health, resource allocation, and cost efficiency.
• Work closely with ML engineers to deploy models into production, building and maintaining model serving infrastructure, inference endpoints, and batch scoring pipelines. Own the deployment side of the ML lifecycle, including packaging, versioning, rollout, and rollback strategies.
• Implement end-to-end observability across infrastructure, application performance, and ML model health, including alerting, dashboards, and on-call processes.
• Set up and manage workflow orchestration tools such as Airflow, Dagster, or Prefect, ensuring pipelines are reliable, retryable, and observable.
• Evolve the MLOps practice over time, implementing drift detection, data quality checks, performance tracking, experiment tracking, and automated retraining triggers.
• Manage integrations into the central AWS data infrastructure from CDPs and product analytics platforms, including Segment and Amplitude, and from mobile attribution and engagement tools, including Adjust, Firebase, MoEngage, and JourneyFi.
• Support deployment, access management, and integration of metadata management and BI tools, including OpenMetadata and Metabase, within the cloud environment.
• Ensure all cloud infrastructure and AI systems meet security and compliance standards, including secrets management, encryption, network security, and access controls.
• Maintain clear, up-to-date documentation for all infrastructure, deployment processes, and operational runbooks. Set engineering standards that the AI team follows as it grows.
Benefits
• Work with one of the world’s leading financial derivatives institutions.
• Competitive salary plus performance-based incentives.
• Access to a dynamic, international, and fast-growing environment.
• Strong opportunities for career progression within a global financial group.
• Be part of a business committed to innovation, excellence, and long-term growth.
• Become part of our international community at MultiBank Group, dedicated to excellence, innovation, and shaping the future of finance.