pragmatike - ML Ops Engineer (EMEA Remote)
Requirements
• Industry: Cloud Computing / AI / European Deep-Tech SaaS
• 4+ years of experience in ML Ops, Platform Engineering, SRE, or similar infrastructure roles focused on ML systems
• Hands-on experience with model serving frameworks such as vLLM, TGI, Triton, or equivalent
• Strong background in container orchestration and operating GPU-based workloads in production
• Experience with MLOps tooling, including model registries, experiment tracking, and automated deployment pipelines
• Proficiency in Python and infrastructure-as-code tools (e.g., Terraform, Helm, or similar)
• Strong understanding of distributed systems, performance tuning, and production reliability engineering
• Ability to effectively use AI coding assistants to accelerate development and debugging workflows
• Ownership mindset with the ability to operate independently in a remote-first environment
• Experience with ML platforms such as Kubeflow, MLflow, or KubeAI
• Knowledge of GPU scheduling, CUDA/ROCm optimization, or multi-tenant inference systems
• Experience with cost optimization across different GPU types and inference workloads
• Background in early-stage startups or greenfield infrastructure projects
• Proven experience building production systems from scratch rather than maintaining legacy platforms
Responsibilities
• Build and operate production-grade model serving infrastructure using frameworks such as vLLM, TGI, Triton, or equivalent
• Design and implement robust deployment pipelines with blue/green and canary rollout strategies for ML models
• Develop and maintain auto-scaling systems, multi-model serving architectures, and intelligent request routing layers
• Optimize GPU utilization, memory efficiency, network throughput, and model artifact storage performance
• Design observability systems for tracking inference latency, throughput, GPU usage, cost metrics, and system health
• Manage model registries and CI/CD pipelines enabling automated and reproducible model deployments
• Define engineering best practices and contribute to platform scalability in a fast-moving startup environment
Benefits
• Take ownership of critical infrastructure powering a rapidly scaling AI-native cloud platform
• Build foundational ML inference systems from the ground up in a high-growth, well-funded startup
• Work at the intersection of distributed systems, GPU computing, and sustainable cloud architecture
• Gain deep expertise in next-generation AI infrastructure and large-scale model serving systems
• Influence core engineering decisions and define best practices that will scale with the company

Pragmatike is committed to a fair, transparent, and inclusive recruitment process. We do not discriminate based on age, disability, gender, gender identity or expression, marital or civil partner status, pregnancy or maternity, race, religion or belief, sex, or sexual orientation.

In accordance with GDPR, your personal data will be processed lawfully, fairly, and securely, and used solely for recruitment purposes, including sharing it with our client(s) for employment consideration.