Komodo Health - Senior Data Engineer, Knowledge & Information
Requirements
• Strong hands-on experience building, operating, and debugging production-grade data pipelines at scale in AWS, with sharp instincts for data quality, reliability, root-cause analysis, and production troubleshooting.
• Advanced Python and SQL skills; experience with Airflow or similar orchestration tools and Spark or comparable distributed processing frameworks.
• Ability to communicate technical trade-offs clearly and collaborate across engineering, product, and data teams.
• Comfort using AI-assisted engineering tools for productivity, debugging, documentation, and technical exploration.
• AI-Augmented Engineering Expectations:
• You will be expected to leverage AI-augmented engineering tools, such as ChatGPT, Gemini, or Claude, to improve productivity and technical decision-making. This may include using AI to generate and refine code, accelerate documentation, automate test case creation, debug complex issues, explore unfamiliar technical concepts, and assess architectural trade-offs and risks.
• Additional skills and experience we’d prioritize (nice to have)…
• Healthcare data experience is a plus, but not required.
• Ability to optimize high-scale data architectures for performance, cost, versioning, and large-volume productization; experience applying AI or agentic workflows to engineering, data quality, delivery, or operations.
• Proven success in high-growth or ambiguous environments that require balancing architecture, speed, and quality.
• Open to NYC/SF Hybrid or Remote
• The pay range for each job posting reflects a minimum and maximum range of annual base pay that we reasonably expect to pay for this position within the US. We carefully consider multiple business-related factors when determining compensation, including job-related skills, work experience, geographic work location, relevant training and certifications, business needs and market demands.
• The starting annual base pay for this role is listed below. This position may be eligible for performance-based bonuses as determined in the Company’s sole discretion and in accordance with a written agreement or plan. This role may also be eligible for equity awards. In addition, this role is eligible for benefits including, but not limited to, comprehensive health, dental, and vision insurance; flexible time off and holidays; 401(k) with company match; disability insurance and life insurance; and leaves of absence in accordance with applicable state and local laws and regulations and company policy.
• San Francisco Bay Area and New York City: $176,000 - $238,000 USD
• All Other US Locations: $153,000 - $207,000 USD
• Komodo's AI Standard
• At Komodo, we're not just witnessing the AI revolution; we're leading it. This is a pivotal moment in time, where being first to market with AI transforms industries and sets the bar. We've already established industry leadership in leveraging AI to revolutionize healthcare, and we expect every team member to contribute. AI here isn't optional; it's foundational. We expect you to integrate AI into your daily work, from summarizing documents to automating workflows and uncovering insights. This isn't just about efficiency; it's about making every moment more meaningful, building on trust in AI, and driving our collective success.
• Komodo Health has a hybrid work model with hubs in San Francisco, New York City, and Chicago. Roles vary: some can be performed from anywhere in the country, others are scoped to a specific region, and some are based near one of our hubs. For hub-based Dragons, we're building intentional in-office rhythms alongside the flexibility that's core to how we work. Whatever your setup, expectations will always be clear before you join.
• Equal Opportunity Statement
Responsibilities
• Build, operate, and optimize large-scale production data pipelines using Python, SQL, Airflow, cloud infrastructure, and distributed processing frameworks, including robust data quality checks, validation, lineage, observability, monitoring, and alerting.
• Design and scale agentic data acquisition and extraction systems for complex, unstructured public data sources; develop LLM-powered Human-in-the-Loop (HITL) pipelines for data extraction and curation.
• Transform healthcare claims, EHR, non-claims-based, and reference datasets into trusted, performant Healthcare Map data products and serving-ready data assets.
• Contribute to system design, architecture, code quality, testing, documentation, CI/CD, and rotational production support, including debugging complex data, system, and performance issues across computationally intensive workflows.
• Partner with Data Product Quality, Product, Platform, and Engineering teams to translate healthcare data needs into scalable technical solutions that enable downstream analytics, product, and AI/ML use cases.