Clara - Senior Platform Engineer - Remote
Requirements
• 5+ years of experience in DevOps/SRE/Platform Engineering
• Mastery of cloud providers (preferably AWS)
• Solid experience with Kubernetes and microservices architectures
• Expertise in CI/CD tools (GitHub Actions, GitLab CI, Jenkins, ArgoCD)
• Proficiency in Infrastructure as Code (Terraform, Pulumi, CloudFormation)
• Experience with containers (Docker, Kubernetes, ECS/EKS)
• Advanced scripting skills (Python, Bash, Go)
• Knowledge of observability tools (Prometheus, Grafana, ELK, Datadog, New Relic)
• Experience with Backstage.io or similar developer portal platforms
• FinOps knowledge and cloud cost optimization
• Experience in FinTech organizations or highly regulated environments
• Familiarity with AI Ops tools (AIOps platforms, ML-based monitoring)
• Cloud certifications (AWS Solutions Architect, CKA, etc.)
• Experience with service mesh (Istio, Linkerd)
• Compliance and security knowledge (PCI-DSS, SOC2)
• Automation obsession: If something is done twice, it should be automated
• Product mindset: Treat internal platform as a product with "customers" (developers)
• Ability to abstract complexity: Make the complex simple for end users
• Effective communication: Document clearly, create runbooks, and educate teams
• Problem solving: Systems thinking to solve problems at their root
• Continuous improvement mindset: Constantly seek ways to optimize and simplify
What You'll Build
Self-Service Portal
• Create new services from templates
• Provision ephemeral environments in seconds
• Configure alerts and dashboards with clicks
• Request access and permissions with automated approvals
Intelligent Pipelines
• Detects changes and runs only relevant tests
• Auto-deploys to production with quality gates
• Auto-rollback on failures
• Provides instant feedback to developers
On-Demand Environments
• Creates complete environments per PR in <5 minutes
• Sanitized copy of production data
• Unique URLs for testing and demos
• Automatic cleanup when PR is closed
Proactive Observability
• Alerts only when action is required
• Automatically suggests root causes
• Auto-remediation of known issues
• Customized dashboards per team/service
Responsibilities
• Design and maintain a developer portal (Backstage.io or similar) as the central hub for resource management
• Build abstractions and APIs that enable developers to provision resources without manual intervention
• Implement self-service workflows for environment creation, configurations, and permissions
• Create reusable templates and blueprints for services, repositories, and pipelines
CI/CD & Automation
• Design, implement, and optimize highly automated CI/CD pipelines
• Reduce build and deployment times through intelligent caching, parallelization, and optimizations
• Implement GitOps and continuous deployment with automated rollback capabilities
• Automate testing (unit, integration, e2e) in pipelines with clear reporting
• Create advanced deployment strategies (blue-green, canary, feature flags)
Ephemeral Environments
• Design and implement ephemeral/preview environment solutions for each PR/branch
• Automate the complete lifecycle: creation, configuration, and cleanup
• Optimize costs through auto-scaling, scheduling, and garbage collection of unused resources
• Integrate ephemeral environments with code review and testing workflows
Observability & Alerting
• Implement intelligent alerting systems with noise reduction and event correlation
• Configure dashboards and SLI/SLO metrics for critical services
• Establish automated runbooks and auto-remediation for common incidents
• Integrate observability (logs, metrics, traces) into the developer portal
Infrastructure as Code & Security
• Maintain and evolve infrastructure as code (Terraform, CloudFormation, etc.)
• Implement automated security controls (policy as code, security scanning)
• Manage secrets, configurations, and access securely and with full auditability
AI/ML Ops Integration
• Explore and implement AI tools for resource optimization and failure prediction
• Automate operational tasks using ML (anomaly detection, capacity planning, incident classification)
• Evaluate and adopt emerging AI Ops tools
Benefits
At Clara, you’ll have the autonomy, speed, and support to make meaningful impact, not just on your team, but on how organizations are run across Latin America.
• Competitive salary and stock options (ESOP) from day one
• Multicultural team with daily exposure to Portuguese, Spanish, and English (our corporate language)
• Annual learning budget and internal accelerated development paths
• High-ownership environment: we move fast, learn fast, and raise the bar together
• Smart, ambitious teammates: low ego, high impact
• Flexible vacation and hybrid work model focused on results
If you’re ready for growth, ownership, and impact, apply now and help us redefine B2B finance in Latin America.
Clara’s Hybrid Policy
Claridians in a hybrid mode split their time between working from the office, talking to or visiting customers, and working from home. This strikes a balance between bringing people together for in-person collaboration and learning from each other, and giving each individual and team the flexibility to do so in a way that makes sense for them.
We don't enforce a minimum number of days for most roles, but you're expected to spend time at the office organically, and to be at the office most days during your ramp-up or when required by your leader.