Valency Systems Inc. - Senior AI-Native DevOps / Operations Engineer (AMER)
Requirements
• 8+ years of progressively increasing responsibility operating important production systems
• Demonstrated success shipping and running high-reliability systems in production
• Deep AWS experience in real production environments
• Strong background in software engineering and testing, not just infrastructure administration
• Experience designing or significantly improving CI/CD systems and release processes
• Experience building or operating logging, monitoring, alerting, and observability systems
• Experience improving production reliability, performance, and operational response
• Comfort with container-based systems and orchestration platforms
• Strong hands-on ability in at least some of: Python, Go, Elixir, CDK
• Strong judgment around guardrails, operational safety, and change management
• Ability to work in ambiguity and build systems that do not yet fully exist
Strongly Preferred
• Startup experience, especially in fast-scaling environments
• Experience at high-scale SaaS companies that have gone through periods of rapid growth
• Experience owning or materially influencing platform engineering functions
• Experience with cost engineering / FinOps in AWS-heavy environments
• Experience designing systems for compliance-oriented environments
• Experience with SOC 2, ISO 27001, or FedRAMP-related operational requirements
• Experience evaluating or implementing modern observability and workflow tracing stacks
• Experience creating human-in-the-loop approval systems for sensitive production workflows
Responsibilities
• Own and improve CI/CD pipelines, release controls, and deployment workflows
• Build and maintain highly reliable AWS-based production systems
• Improve observability across logs, metrics, traces, events, and workflow state
• Instrument platform behavior so system issues, regressions, and slowdowns are quickly visible and actionable
• Create operational analytics that help close the loop between engineering, product, and customer experience
• Drive cost engineering and infrastructure efficiency as the system scales
• Build safer operating patterns for agent-assisted code changes and operational actions
• Implement testing, validation, approval, and rollback mechanisms that reduce operational risk
• Improve batch, queue, cache, and job-processing reliability and monitoring
• Support incident response, root cause analysis, postmortems, and follow-through
• Partner with external vendors and partners when needed
• Help define platform standards, reliability practices, and operational maturity across the company
Benefits
• You will help define how an AI-native research platform is actually operated in production
• You will work on systems that connect agents, researchers, product behavior, and infrastructure reality
• You will have broad scope across infrastructure, reliability, analytics, and operational guardrails
• You will help build the production foundation for a category-defining company at an early stage
• You will not inherit a frozen stack; you will help choose and build the right one
• Compensation, Benefits & Equity: We offer a competitive salary, benefits, and meaningful equity in a company building something important from the ground up.
• Work Authorization: Candidates must be legally authorized to work in the United States.