Grafana Labs

Grafana Labs - Staff Backend Engineer - Databases Tempo | Canada | Remote

Remote - USA · $186k - $224k · 6d ago
Remote · Staff · NA · Artificial Intelligence · Software · Backend Engineer · Staff Engineer · Rust · C++ · Team Leadership · Go · SQL


Requirements

• Strong software craftsmanship. You write clean, robust, performant software that others can maintain, and you know when to optimize vs. when to ship.
• Strong Go, or a path to it. We write Tempo in Go. Deep experience in other systems languages (Rust, C, C++) translates well.
• Operational mindset. You’ve owned production services, carried a pager, reduced toil, and treated SLOs as a product feature, not a chore.
• Customer focus and pragmatism. You break complex problems into short feedback loops: analyze, design, deliver an MVP, learn, iterate.
• Leadership through writing and collaboration. You lead through design docs, reviews, and shipped code, not hierarchy. You communicate clearly in a fully remote, asynchronous environment.
• Experience with tracing, OpenTelemetry, or large-scale observability systems.
• Experience designing query languages, SQL/TraceQL-like engines, or APIs intended to be consumed programmatically (by services or agents).
• Experience with columnar storage formats (e.g., Parquet) or purpose-built on-disk formats for analytical workloads.
• Experience operating multi-tenant, multi-cell SaaS infrastructure at scale on Kubernetes.
• Experience building for AI/LLM consumers: structured APIs, metadata/discovery endpoints, deterministic outputs, evaluation harnesses.
• Open-source contribution or maintainership, and comfort engaging a community in the open.
• Experience as an on-call user of Grafana, Prometheus, Loki, or Tempo in a previous role (or on a homelab).
• Experience in a fully remote, globally distributed team.

Responsibilities

• Make Grafana Cloud Traces “just work” for customers by eliminating rough edges, confusing limits, and hidden failure modes.
• Achieve operational excellence at scale as we grow from close to 50 cells today into triple digits this year, with autoscaling, parameterized rollouts, and aggressive toil reduction.
• Evolve Tempo into a platform enabler: higher-density APIs, trace aggregation, TraceQL metrics math, and machine/LLM-friendly interfaces that downstream products and agents can build on.
• Push performance further: faster query latency at hundreds of MB/s ingestion and performant 30-day query ranges to match competitors.
• Prepare Tempo for an agent-driven world: larger, burstier, higher-cardinality workloads, and new categories of AI-powered workflows, such as assistant-driven triage and “why is this slow?”-style investigations.
• As a Staff Engineer on Tempo, you will set technical direction on the hardest problems in our roadmap and raise the bar across the team.
• Lead multi-quarter technical initiatives from problem framing through rollout, e.g., trace aggregation APIs, Limitless Tempo, autoscaling cells and customer limits, or query engine improvements.
• Own the architecture of core Tempo components: ingestion, storage, query, and metrics generation. Drive design reviews, make sharp trade-offs on performance, cost, and complexity, and document the “why” for the team.
• Design APIs for humans and agents. Shape the next generation of Tempo’s interfaces (structured, deterministic, discoverable) so that Act 3 products, LLM-driven assistants, and external integrators can build on Tempo reliably.
• Drive operational excellence. Own outcomes against concrete SLOs (P99 write latency, incident recurrence, TCO per ingested GB) and push the team toward Zero Ops through automation, parameterized rollouts, and actionable alerts.
• Partner with Product and sibling teams. Work closely with PMs and with App Observability, Asserts, Drilldown, and Grafana Assistant teams to understand how Tempo gets consumed and to ship what unblocks them.
• Mentor engineers. Raise the engineering bar through code review, design feedback, pairing on hard problems, and writing that leaves the team smarter than you found it.
• Participate in on-call for the services you help build, and be a force multiplier in incident response and post-incident learning.
• Contribute to open source. Tempo is OSS. You will engage the community, review external contributions, and help steer the project in the open.
• We invest heavily in developer productivity. You can use modern AI coding assistants as part of your daily workflow (your choice of tools, within security guidelines), backed by a company-funded usage budget so you can iterate quickly without unnecessary friction.
• We encourage pragmatic AI-assisted development: faster prototyping, test generation, refactors, documentation, and incident follow-ups, always paired with strong code review and quality standards.
• You’ll also have access to frontier models (e.g., GPT-Codex 5/3, Claude Opus 4.6, Gemini 3 Pro).

Example problems you could work on

These are the kinds of projects landing in 2026. Any one of them is a Staff-sized problem:

• Trace aggregation and higher-density APIs: extend TraceQL metrics, design LLM-friendly response types, and make Tempo a first-class data source for Grafana’s AI assistant.
• Autoscaling end to end: customer limits and Tempo cells, with hysteresis, predictive scaling for spikes, and safe scale-down.
• Agent-scale ingestion and query: guardrails for bursty, high-cardinality, agent-generated workloads.
• Query performance: new data formats, smarter query pipelines, targeted optimizations for common Drilldown and Traces workflows, and 30-day query ranges.
• Rollouts and multi-cell operations: parameterized rollouts, push-button deploys, and the tooling to grow safely into triple-digit cell counts without a proportional increase in alert noise.
• Limits and self-service: drive customer-facing configuration and observability so escalations trend toward zero.

What Makes You a Great Fit

• Technical leadership. A track record of leading complex, multi-quarter initiatives that spanned design, delivery, and operations, and made the teams around you better.
• Deep systems experience. Substantial hands-on experience building and operating distributed data systems in production: ingestion pipelines, storage engines, query execution, or similar.

Benefits

• 100% Remote, Global Culture - As a remote-only company, we bring together talent from around the world, united by a culture of collaboration and shared purpose.
• Scaling Organization - Tackle meaningful work in a high-growth, ever-evolving environment.
• Transparent Communication - Expect open decision-making and regular company-wide updates.
• Innovation-Driven - Autonomy and support to ship great work and try new things.
• Open Source Roots - Built on community-driven values that shape how we work.
• Empowered Teams - High trust, low ego culture that values outcomes over optics.
• Career Growth Pathways - Defined opportunities to grow and develop your career.
• Approachable Leadership - Transparent execs who are involved, visible, and human.
• Passionate People - Join a team of smart, supportive folks who care deeply about what they do.
• In-Person onboarding - We want you to thrive from day 1 with your fellow new ‘Grafanistas’ to learn all about what we do and how we do it.
• Balance is Key - We operate a global annual leave policy of 30 days per annum. 3 days of your annual leave entitlement are reserved for Grafana Shutdown Days to allow the team to really disconnect. *We will comply with local legislation where applicable.
