DevOps / Platform Engineer (Fintech + AI Infrastructure)

OnHires · Remote - European Union, Ukraine · 1w ago
Remote · Senior · EMEA · Cryptocurrency · Fintech · Payments · Platform Engineer · Jenkins · Vault · Performance Management · Linux · Kubernetes

Requirements

  • Experience: 5+ years in DevOps, SRE, or Platform Engineering (Fintech experience is mandatory).
  • Core Systems: Deep expertise in Linux, networking (TCP/IP, DNS, TLS, routing), and complex troubleshooting.
  • Kubernetes: Production experience with K8s, Helm, Ingress, autoscaling, network policies, and resource management.
  • CI/CD: Proficiency in GitHub Actions, GitLab CI, or Jenkins.
  • Observability: Hands-on experience with Prometheus + Grafana, logging (Loki/ELK), and tracing (OpenTelemetry/Jaeger).
  • AI Infrastructure: Experience with GPU clusters and ML stacks (NVIDIA drivers, CUDA, MIG, GPU monitoring).
  • Data Stores: Production-level operation of Postgres, Redis, Kafka, or RabbitMQ.
  • Security First: Practical knowledge of Vault, KMS, RBAC, OPA/Gatekeeper/Kyverno, Trivy, and SBOM.

Technology Stack

  • Cloud: AWS, Hetzner, DigitalOcean
  • Orchestration: Docker, Kubernetes
  • IaC: Terraform, Ansible
  • CI/CD: GitHub Actions / GitLab CI
  • Observability: Prometheus, Grafana, Loki / ELK, OpenTelemetry
  • Security: HashiCorp Vault, KMS, RBAC, Policy-as-Code
  • AI Serving: Triton, vLLM, custom inference services

Responsibilities

1. AI / MLOps (Production for Models)
  • GPU Infrastructure: Deploy and maintain high-performance GPU clusters.
  • AI Lifecycle: Manage the full lifecycle of AI services, including inference deployment (Triton, vLLM, custom services), autoscaling, and rollout/rollback strategies.
  • Data Management: Manage model storage, artifact versioning, caching, and high-speed data access via S3-compatible storage.
  • Observability: Monitor performance metrics including latency, throughput, error budgets, resource limits, and cost/performance ratios.

2. PSP / Fintech Reliability
  • High Availability: Ensure fault tolerance for payment services (SLA/SLO management, redundancy, Disaster Recovery planning, and regular recovery testing).
  • Fintech-Grade Security: Implement secrets management, HSM/managed KMS integration, infrastructure hardening, and audit logging.
  • Secure CI/CD: Build secure pipelines featuring artifact signing, vulnerability scanning, policy gates, and isolated environments.

3. Crypto Infrastructure
  • Node Operations: Deploy and maintain crypto nodes (Full, Archive, RPC) across various networks.
  • Automation: Automate node updates, synchronization monitoring, and health checks.
  • Storage & Performance: Manage disk I/O (IOPS/RAID), protect RPC endpoints, and manage access controls.
  • Metrics: Monitor for sync lag, chain forks, and consensus issues.

