Vynca - Site Reliability Engineer
Requirements
• Experience: Three to five (3–5) years of experience in Site Reliability Engineering, DevOps Engineering, Platform Engineering, Cloud Infrastructure Engineering, or similar infrastructure-focused roles, preferably within healthcare, SaaS, or high-growth technology environments.
• Education: Bachelor's degree in Computer Science, Information Systems, Software Engineering, or a related technical field; equivalent professional experience will also be considered.
• Strong hands-on experience operating production workloads within AWS environments.
• Proven experience managing infrastructure as code using Terraform, including module development, state management, and deployment automation.
• Experience operating and supporting production Kubernetes environments.
• Hands-on experience deploying and managing applications using Helm.
• Experience working with distributed systems, event-driven architectures, or event-sourcing platforms, including concepts such as partitioning, event ordering, replay, and fault tolerance.
• Experience establishing and managing observability practices including monitoring, logging, tracing, alerting, and incident response.
• Strong understanding of Linux systems administration, networking, cloud architecture, and distributed systems fundamentals.
• Experience designing, implementing, and maintaining CI/CD pipelines and deployment automation.
• Strong problem-solving skills with the ability to troubleshoot complex infrastructure and application issues.
• Excellent written and verbal communication skills with the ability to collaborate effectively across technical and non-technical teams.
• High level of ownership, accountability, and initiative with a proactive approach to reliability and operational excellence.
• Ability and willingness to participate in an on-call rotation supporting production systems.
• Strong programming or scripting experience with Python, Go, or similar languages.

• Experience with observability platforms such as Prometheus, Grafana, Datadog, CloudWatch, SigNoz, or OpenTelemetry.
• Experience with GitOps tools such as ArgoCD or Flux.
• Experience managing databases such as PostgreSQL, MySQL, Redshift, or ClickHouse.
• Experience implementing secrets management solutions such as AWS Secrets Manager or HashiCorp Vault.
• Experience supporting healthcare technology platforms or other highly regulated environments.
• Familiarity with data infrastructure technologies including Snowflake, Redshift, and ETL/ELT pipelines.
• Experience with database performance tuning and optimization.
• At this time we are only considering applicants in the following states: Arizona, California, Colorado, Florida, Georgia, Illinois, Nevada, North Carolina, Oregon, Texas, and Washington.
Responsibilities
• Design, provision, and manage AWS infrastructure using Terraform as the source of truth.
• Operate, maintain, and scale production workloads running on Kubernetes.
• Package, deploy, and manage applications using Helm and infrastructure automation tools.
• Build, operate, and improve distributed and event-driven systems, including event sourcing, partitioning, event ordering, replay, and failure recovery mechanisms.
• Define, monitor, and maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets to balance reliability and engineering velocity.
• Develop automation for deployment, scaling, monitoring, incident response, and operational workflows to reduce manual effort and improve system resilience.
• Own platform observability by implementing and maintaining metrics, logging, tracing, monitoring, and alerting solutions.
• Lead incident response efforts, facilitate blameless postmortems, and drive long-term corrective actions that improve system reliability.
• Partner with Product and Engineering teams on capacity planning, performance optimization, and resilient system design.
• Implement and maintain security best practices to support HIPAA, SOC 2, and other compliance requirements.
• Participate in an on-call rotation and provide operational support for production systems.
• Vaccination Requirement: Employees in patient, client, or customer-facing roles must be vaccinated against influenza. Requests for religious or medical accommodations will be considered but may not always be approved.
• Employment Eligibility: Compliance with federal law requires identity and work eligibility verification using E-Verify upon hire.