Sr. DevOps
Requirements
• 5+ years of professional experience in a DevOps, SRE, or infrastructure engineering role.
• Deep expertise in containerization and orchestration, specifically Kubernetes (design, deployment, and troubleshooting) and Docker.
• Strong proficiency in managing infrastructure in both Cloud (e.g., AWS, GCP, Azure) and On-Premise environments.
• Expert-level administration skills in Linux and strong working knowledge of Windows Server environments.
• Proven experience with Infrastructure as Code (IaC) and Configuration Management tools (e.g., Terraform, Ansible).
• High proficiency in scripting and automation using Python and Bash.
• Extensive experience with monitoring and observability platforms, especially Datadog (or comparable tools like Prometheus/Grafana, New Relic).
• Hands-on experience deploying and managing technologies related to Large Language Models (LLMs), such as utilizing LiteLLM, OpenRouter, or setting up and managing LLM serving endpoints.
• Experience with specific Kubernetes distributions (e.g., K3s, Rancher, OpenShift).
• Familiarity with network configuration, firewalls, and security best practices for hybrid environments.
• Experience in MLOps workflows and related tools (e.g., MLflow, Kubeflow).
• Certifications such as CKA, CKAD, or relevant cloud provider certifications.

Please note that this job description is intended to provide a general overview of the position and does not include an exhaustive list of responsibilities and qualifications.

At Archer we aim to attract, retain, and motivate talent that possesses the skills and leadership necessary to grow our business. We drive a pay-for-performance culture and reward performance that supports the Company’s business strategy. For this position we are targeting a base pay between 133,400 and 200,000. Actual compensation offered will be determined by factors such as job-related knowledge, skills, and experience.
Responsibilities
• Design, deploy, and manage highly available, scalable infrastructure using Kubernetes and Docker across public cloud (e.g., AWS, GCP, Azure) and on-premise data centers.
• Develop and maintain robust Configuration Management solutions for consistent environment provisioning and management with tools like Ansible or Terraform.
• Implement and manage CI/CD pipelines to facilitate rapid, reliable, and automated software releases.
• Administer and troubleshoot operating systems in both Linux and Windows environments.
• Implement and optimize observability practices using monitoring tools such as Datadog for logging, tracing, and alerting.
• Deploy, scale, and maintain LLM infrastructure using technologies like LiteLLM or OpenRouter.
• Automate repetitive tasks and system operations, primarily with Bash and Python scripts.
• Collaborate closely with development, MLOps, and security teams to ensure the infrastructure meets product requirements and compliance standards.
• Participate in an on-call rotation for service reliability and incident response.
Benefits
• Equity options as part of the benefits package
• Paid Time Off (PTO) included in the compensation plan
• Insurance coverage provided to employees
• Perks such as remote work options for eligible team members