Tripledot Studios - Senior DevOps Engineer
Requirements
• 5+ years in the industry as a DevOps, SRE, or Platform Engineer, ideally in gaming, mobile apps, or other high-scale digital products.
• Strong hands-on experience with Kubernetes in production: not just running workloads on it, but operating it. Cost-aware infrastructure decision-making.
• Solid Terraform (or OpenTofu) experience, with a track record of keeping IaC sustainable as it grows.
• Proven experience delivering data and AI/ML solutions in production on AWS, plus a working knowledge of GCP or a willingness to get up to speed quickly. Bonus if this experience is within the gaming industry.
• Comfortable owning CI/CD pipelines with common tools (GitHub Actions, GitLab CI, ArgoCD, Jenkins, or similar).
• Hands-on experience with cloud and Kubernetes security fundamentals: IAM/RBAC, secrets management (e.g. Vault, AWS Secrets Manager, External Secrets), network policies, and integrating security checks into CI/CD pipelines.
• Strong instincts for observability, monitoring, and alerting. You've built dashboards and alerts that teams actually rely on, and you know the difference between a useful page and noise. Hands-on with tools like Prometheus, Grafana, Datadog, CloudWatch, or similar. Solid incident response experience.
• The current data and AI/ML stack uses open source tools like Airflow, Trino, Spark, and Kubeflow. Familiarity with deploying these tools, as well as tuning them for improved performance, is a bonus. Understanding of MLOps best practices and common architectures is also a bonus.
• Hands-on knowledge of Python and/or other scripting languages.
• Experience creating infrastructure for both traditional and modern agentic data-intensive systems is a bonus.
• Focus on innovation, coupled with a mindset of continuous learning and curiosity to explore emerging AI technologies. The successful candidate will have an agile, hands-on approach to prototyping and validation, and the ability to Get Stuff Done in a fast-paced environment.
• Excellent communication and collaboration skills for working effectively with both technical and non-technical teams, and an understanding of how to drive results with key business stakeholders.
Responsibilities
• Improve and maintain a scalable, fast, and reliable data and ML platform to support AI/ML initiatives within group AI, ensuring models move seamlessly from research to production.
• Support group IT in providing reliable access to open source AI models and safe, reliable access to AI productivity tools.
• Create and maintain proper monitoring and alerting tools to ensure our systems meet the SLAs and SLOs defined by the stakeholders.
• Implement and advocate for engineering best practices, including CI/CD, infrastructure as code (e.g. Terraform), version control, testing, and observability, while keeping costs in mind.
• Ensure standardized cross-studio access and security to enable timely data access and ingestion (AWS and Google Cloud).
• Provide teams with separate environments for testing new setups and tools without disrupting the team's day-to-day operations or production workflows.
• Track usage for all our deployed applications and identify areas for improvement, making the best use of resources.
• Keep up with relevant technologies and best practices continuously emerging in the industry, especially those related to AI productivity tools.
Benefits
• You will be part of a fun mobile gaming company aiming to embrace the future of AI-driven creativity and exploring where the industry is moving.
• You will be instrumental in shaping the backbone of the AI/ML and IT systems that will power solutions across the whole group.
• You will operate in an environment that values an experimental mindset, focusing on learning opportunities and pioneering generative game creation.
Working at Tripledot (in London)
• 25 days holiday: Enjoy 25 days of paid holiday, in addition to bank holidays, to relax and refresh throughout the year.
• Hybrid Working: We work in the office 3 days a week: Tuesdays, Wednesdays, and a third day of your choice.
• 20 days fully remote working: Work from anywhere in the world, 20 days a year.
• Regular company events and rewards: Join in regular events and rewards that celebrate cultural events, our achievements and our team spirit. Recent highlights have been: Thames River Cruise, London Dungeon and Summer Parties in Regent's Park.
• Continuous Professional Development: Propel your career with continuous opportunities for professional development.
• Private Medical Cover: Have peace of mind with private medical cover, ensuring your health is in good hands.
• Life Assurance & Critical Illness Cover: Financial protection for you and your loved ones.
• Health Cash Plan: Benefit from a health cash plan that contributes to your medical expenses.
• Dental Cover: Flash your best smile with our dental cover.
• Family Forming Support: Receive vital support on your family forming/fertility journey with our support program [subject to policy].
• Employee Assistance Program: Anytime you need it, tap into confidential, caring support with our Employee Assistance Program, always here to lend an ear and a helping hand.
• Cycle to Work Scheme: Make the most of our Cycle to Work Scheme for a green and healthy commute.
• Pension Plan: Secure your future with our contributory pension plan.