Software Data Engineer
Requirements
• A degree in Computer Science/Engineering or a related field within science
• 3+ years of experience working as a software developer in the industry
• Proficient with Python
• Proficient with SQL
• Experience using LLMs for structured data extraction
• Experience with event-driven architecture with Pub/Sub
• A track record of building high-quality, maintainable code
• Experience with cloud computing (for example: GCP, Azure, AWS)
• ML/Data science exposure
• Experience working with Auth0 and Terraform
• Experience with data warehouse solutions like BigQuery, and databases including AlloyDB and Spanner
• Experience with agentic driven development and AI-based tools like Cursor or Claude Code
• Experience building Conversational AI solutions
Responsibilities
• Collaborate with Machine Learning, Full-stack engineers, and Science to solve complex document mining challenges, helping us capture and model additional scientific experiments
• Scale data pipelines to allow our data to go from research to platform quickly and reliably
• Work with sources that contain both semi-structured and unstructured data
• Use your experience to help define and apply best practices for a broad platform of technologies in a cloud-based environment
• Architect and maintain robust data pipelines that ingest diverse sources and utilize LLMs for high-fidelity entity extraction into structured formats
• Implement evaluation frameworks to monitor the accuracy, drift, and hallucination rates of extraction models within the production pipeline
• Lead or consult on the authoring of engineering design proposals following the unified Platform Stream roadmap at BenchSci
• Leverage a deep understanding of the business context and the team’s goals to make independent technical decisions in the face of open-ended requirements
• Proactively identify new opportunities (from both internal and external sources) and advocate for and implement improvements to the current state of projects
• Respond to operational issues with urgency and drive urgency within your own team, owning resolution within your sphere of responsibility
• Challenge the status quo and propose newer technologies or ways of working
Benefits
• We know compensation is an important part of choosing your next role. The range shown reflects our target hiring range, informed by market data, internal equity, and the role’s current scope. Often the mid-range is where we tend to fall, but individual offers may vary based on experience, skills, and the role scope.
• A great compensation package that includes BenchSci equity options
• A robust vacation policy plus an additional vacation day every year
• Company closures for 14 more days throughout the year
• Flex time for sick days, personal days, and religious holidays
• Comprehensive health and dental benefits
• Annual learning & development budget
• A one-time home office set-up budget to use upon joining BenchSci
• An annual lifestyle spending account allowance
• Generous parental leave benefits with a top-up plan or paid time off options
• The ability to save for your retirement coupled with a company match!