kraken.com - Senior AI Compute Infrastructure Engineer

Remote - Canada, Portugal, Spain... · $104k - $104k · 1mo ago
Remote · Senior · EMEA · Cryptocurrency · Fintech · Infrastructure Engineer · Senior DevOps Engineer · CUDA · Rust · C++ · Go · Python


Requirements

• 5+ years of infrastructure engineering experience, with significant time spent on GPU compute, ML infrastructure, distributed systems, high-performance computing, or large-scale production platforms.
• Hands-on experience operating GPU clusters or accelerator-backed infrastructure in production or production-like environments, including scheduling, orchestration, utilization monitoring, and cost optimization.
• Strong systems engineering fundamentals across Linux, networking, storage, containers, Kubernetes, distributed runtimes, and production debugging.
• Experience with ML serving frameworks such as vLLM, Triton Inference Server, TensorRT, TorchServe, KServe, Ray Serve, or equivalent systems.
• Proficiency in Python for infrastructure automation, tooling, debugging, integration, and operational workflows.
• Practical understanding of performance tradeoffs across batching, concurrency, memory usage, GPU utilization, model size, latency, throughput, availability, and cost.
• Track record of optimizing compute costs while maintaining clear performance, reliability, and availability expectations.
• Experience building observable systems with useful metrics, logs, traces, dashboards, alerts, and incident workflows.
• Comfortable working in high-stakes, always-on environments where uptime, throughput, correctness, and operational discipline are critical.
• Clear communicator who can translate infrastructure tradeoffs for researchers, product teams, platform engineers, security stakeholders, and engineering leadership.
• Experience at a frontier AI lab, hyperscaler, high-frequency trading firm, research platform, or high-scale ML organization.
• Familiarity with custom silicon or specialized accelerators such as TPUs, AWS Trainium, Gaudi, or similar platforms.
• Background in capacity planning, procurement input, reserved capacity strategy, cloud accelerator economics, or GPU fleet cost management.
• Experience with distributed training frameworks such as DeepSpeed, Megatron-LM, FSDP, Ray, or equivalent systems.
• Experience debugging CUDA, NCCL, kernel, driver, runtime, memory, networking, or low-level performance issues.
• Experience with Rust, C++, Go, CUDA, or other systems languages used for performance-critical infrastructure.
• Crypto, financial services, trading infrastructure, or security-sensitive production infrastructure experience.
• Unless a specific application deadline is stated in the job posting, applications are accepted on an ongoing basis.
• Please note, applicants are permitted to redact or remove information on their resume that identifies age, date of birth, or dates of attendance at or graduation from an educational institution.
• We consider qualified applicants with criminal histories for employment on our team, assessing candidates in a manner consistent with the requirements of the San Francisco Fair Chance Ordinance.
• Kraken is powered by people from around the world and we celebrate all Krakenites for their diverse talents, backgrounds, contributions and unique perspectives. We hire strictly based on merit, meaning we seek out the candidates with the right abilities, knowledge, and skills considered the most suitable for the job. We encourage you to apply for roles where you don't fully meet the listed requirements, especially if you're passionate or knowledgeable about crypto!
• We may ask candidates to complete job-related skills or work-style assessments as part of our hiring process. These assessments are designed to evaluate competencies relevant to the role and are applied consistently across candidates for similar positions. Assessment results are considered alongside other relevant information, such as experience and interviews, and are not the sole basis for any employment decision.
