Grafana Labs - Senior AI Engineer
Requirements
• Experience with vector databases or retrieval pipelines (Pinecone, Weaviate, ChromaDB, Qdrant, pgvector)
• Familiarity with marketing or sales platforms (Salesforce, Customer.io, HubSpot, Marketo, Outreach)
• Experience with frontend frameworks (React, Slack Block Kit) for building user-facing AI tool interfaces
• Observability tooling for AI systems (LangSmith, Weights & Biases, custom evaluation frameworks)
• Experience with workflow orchestration platforms (n8n, Temporal, Prefect, Airflow)
• Familiarity with Model Context Protocol (MCP) or similar standards for connecting AI systems to data sources
• Prior work automating marketing, sales, or customer success workflows in a B2B SaaS environment
• Active in open-source communities. Grafana is built on OSS and we value engineers who share that DNA.
Responsibilities
Agentic Systems & AI Infrastructure
• Own end-to-end development of multi-agent AI systems, from architecture and implementation through testing, deployment, and ongoing operation
• Build modular, composable agentic systems using orchestration frameworks (LangChain, CrewAI, Anthropic MCP, or similar) that operate 24/7 across teams
• Develop reusable agentic skills that agents invoke across interfaces (Slack, dashboards, internal apps, CLIs)
• Implement observability and feedback loops including logging, performance metrics, prompt iteration, model evaluation, and cost management
• Establish governance and compliance standards for AI workflows including access controls, audit trails, PII handling, and human-in-the-loop escalation paths
Systems Integration & Backend Services
• Build MCP servers, APIs, CLIs, and microservices connecting AI models to business systems (BigQuery, Slack, CRMs, email, calendars, analytics tools)
• Architect data flows for retrieval-augmented generation (RAG), connecting LLMs to internal knowledge bases, customer data, and real-time business context
• Build serverless or containerized services (GCP Cloud Functions, Cloud Run) that scale with usage and integrate with Grafana's cloud infrastructure
Automation & Workflow Enablement
• Partner with RevOps, Demand Generation, Regional Marketing, and SDR teams to scope high-impact automation problems, identify bottlenecks, and build solutions with measurable business outcomes
• Design and deploy workflows using orchestration tools (n8n, Workato, or custom platforms) with CI/CD, testing, and production reliability standards
• Build systems designed for self-service with documentation, playbooks, and enablement materials that let partner teams operate independently
We invest heavily in developer productivity. You'll have access to AI coding assistants (Claude Code, Gemini CLI, OpenAI Codex, and others of your choice within security guidelines). We encourage pragmatic AI-assisted development paired with strong code review and quality standards.
What Makes You a Great Fit
• 8+ years of software engineering experience with depth in backend development, systems integration, or data/analytics engineering
• 2+ years hands-on experience applying LLMs/AI to production workflows, not just prototypes
• Strong proficiency in Python and JavaScript/Node.js with Git-based workflows, code review practices, and testing discipline
• Hands-on experience with LLM frameworks and patterns including prompt engineering, RAG, function calling/tool use, structured output parsing, and evaluation
• Experience building and operating multi-agent systems at scale including agent decomposition, orchestration patterns (sequential chains, router/dispatcher, parallel fan-out), state management, and production monitoring
• You diagnose business problems before writing code. You think in workflows and outcomes, not just functions.
• Deep familiarity with Google Cloud Platform, BigQuery, and serverless/containerized services (Cloud Functions, Cloud Run)
• Understanding of LLM failure modes and production mitigations including confidence thresholds, fallback logic, human escalation, and cost/latency management
• Proven ability to identify high-leverage problems, push back on low-impact requests, and deliver end-to-end with minimal direction
• Fluent with AI-assisted development tools (GitHub Copilot, Cursor, Claude Code). You use AI to build AI systems.
• Clear technical communicator who can explain complex systems in simple terms to both engineers and business stakeholders
Benefits
• 100% Remote, Global Culture - As a remote-only company, we bring together talent from around the world, united by a culture of collaboration and shared purpose.
• Scaling Organization - Tackle meaningful work in a high-growth, ever-evolving environment.
• Transparent Communication - Expect open decision-making and regular company-wide updates.
• Innovation-Driven - Autonomy and support to ship great work and try new things.
• Open Source Roots - Built on community-driven values that shape how we work.
• Empowered Teams - High trust, low ego culture that values outcomes over optics.
• Career Growth Pathways - Defined opportunities to grow and develop your career.
• Approachable Leadership - Transparent execs who are involved, visible, and human.
• Passionate People - Join a team of smart, supportive folks who care deeply about what they do.
• In-Person onboarding - We want you to thrive from day 1 with your fellow new ‘Grafanistas’ to learn all about what we do and how we do it.
• Balance is Key - We operate a global annual leave policy of 30 days per annum. 3 days of your annual leave entitlement are reserved for Grafana Shutdown Days to allow the team to really disconnect. *We will comply with local legislation where applicable.