Neo.Tax - Senior Data Scientist + Machine Learning Engineer

Remote - Palo Alto, California, United States | $190k - $210k + Equity | 1mo ago


Requirements

• MS/PhD in Computer Science, Statistics, Mathematics, or a related quantitative field, or equivalent practical experience.
• 6+ years of industry experience as a Data Scientist / Applied Scientist / ML Engineer shipping ML to production (or equivalent).
• Strong proficiency in Python and the modern data/ML ecosystem (NumPy/Pandas, scikit-learn, PyTorch or TensorFlow).
• Strong understanding of statistical modeling, experimentation, and evaluation (metrics, confidence intervals, A/B testing, bias/variance, error analysis).
• Experience building data pipelines and working with SQL and relational databases.
• Experience deploying and maintaining models in production (batch or real-time), including monitoring and iteration; comfortable owning operational concerns (reliability, latency, cost).
• Ability to operate with high ownership in ambiguous environments; strong communication and collaboration skills.
• Ability to effectively design and implement solutions without the help of AI (more info on how we use AI at Neo.Tax below).
• Experience with LLM evaluation, synthetic data generation, RAG, or tool-augmented agents.
• Experience with information extraction and document understanding.
• Experience with distributed data processing (e.g., Spark, Beam) and/or workflow engines.
• Experience working at early-stage, venture-backed startups.
• Ownership-oriented: You want autonomy and responsibility. You're not looking for someone to hand you a detailed spec and check your work.
• Proactive communicator: You identify and raise risks early, summarize what you've heard, and ask clarifying questions rather than making assumptions.
• Pragmatic over idealistic: You evaluate solutions based on trade-offs, not dogma.
• Product-minded: You care about shipping improvements that move customer outcomes, not just training models.
• Comfortable with ambiguity: You can dive into unfamiliar data and systems and figure out what needs to happen.

What It’s Like to Work Here

• The data science + ML engineering team consists of four full-time team members (including you) and one part-time employee. You’ll work closely together and collaborate daily with product and engineering to ship new features.
• We're early adopters of AI tooling. Everyone on the team uses Claude Code or OpenAI Codex daily (including in sales and GTM), and we actively experiment with new AI workflows. We're looking for someone who sees AI as a force multiplier for skilled practitioners, not a replacement for fundamentals.
• 9:00am - You start work. Check your email and Slack. Review yesterday's notes on where you left off.
• 9:15am - You post an asynchronous update in Slack covering the Linear tickets you worked on yesterday and what you plan to focus on today. You highlight progress and blockers. You raise a risk related to the estimated timeline of your current project. You at-mention your counterpart on the Engineering team to notify them of a change you’re planning to make to a particular model in the prediction-service and ask if they’d like to discuss further.
• 9:45am - Join the weekly Data Science Tea Time to discuss technical topics relevant to the whole team.
• 10:45am - Review two pull requests from your teammates.
• 12:00pm - Lunch break.
• 1:00pm - Meet with Firas, the CTO who’s also your manager, for your bi-weekly 1:1. Ask some questions about the quarterly goals and the new enterprise customers we’re onboarding. Give him feedback on a new evaluation methodology we just implemented for a class of AI models in production.
• 1:30pm - Work on your assigned project. After 30 minutes of investigation, start a Slack huddle with another data scientist to discuss some unfamiliar code.
• 5:00pm - Take some notes on where you should pick up tomorrow. End the day.
• If you’re starting a new project soon, you’ll participate in a kick-off meeting with the relevant stakeholders to confirm that you understand the requirements and that the project spec has all the necessary information before you start implementing, so that you can deliver what’s expected.

What success looks like in 90 days

• You understand the fundamentals of our technology stack and how ML fits into the product end to end.
• You understand our business domain (R&D tax credits and software capitalization), customer workflows, and what “good” looks like.
• You’ve shipped multiple measurable quality improvements to production models/pipelines, with monitoring in place.
• You’ve established or improved evaluation infrastructure (datasets, metrics, dashboards) that makes iteration faster.

Who should not apply

• People who prefer research-only roles and do not want to ship production systems.
• People who avoid ambiguity and require complete, detailed specs before making progress.
• People who cannot evaluate solutions based on trade-offs and real-world constraints.
• People who rely on AI as a substitute for core statistical/ML fundamentals.

Responsibilities

• Own ML/AI problem spaces end-to-end: Define success metrics, create baselines, iterate on approaches, and drive projects from prototype to production.
• Model development: Build and improve models spanning classification, information extraction, entity resolution, clustering, ranking, anomaly detection, and forecasting.
• LLM systems: Design and evaluate prompt + retrieval + tool-calling pipelines; improve quality through datasets, labeling, and systematic evaluation.
• Data foundations: Define datasets, labeling strategies, and data quality checks; build features that generalize across customer contexts.
• Experimentation and evaluation: Design offline evaluations and online experiments; build dashboards and monitoring to detect regressions.
• Production ML engineering: Build and operate training/inference pipelines (batch and/or online), model serving, feature/data pipelines, and monitoring/alerting for quality, latency, and cost.
• Partner with engineering: Collaborate on productionization, scalability, reliability, latency, and cost; contribute directly to model-serving or batch pipelines as needed.
• Cross-functional collaboration: Work with product, engineering, and customer-facing teams to understand workflows and translate real customer pain into ML deliverables.
• Technical communication: Write clear specs and postmortems, document trade-offs, and communicate progress, risks, and decisions.

Benefits

• $190K – $210K
• Offers Equity
• Stock Option Plan (Equity)
• Health Care Plans (Medical, Dental, Vision, Short-term Disability)
• 90% coverage for individual + family
• Health & Wellness subsidy
• Retirement Plan (401k)
• Paid Time Off (Vacation, Sick & Public Holidays)
• Family Leave (Maternity, Paternity)
• Work From Home option

Additional Details

Still interested? Read on for more information!

• Series B preparation underway: You'd be joining at a pivotal stage where early employees have meaningful impact on the company's trajectory.
• Real traction: Multiple profitable months and 4x revenue growth year-over-year. Our Q1 was the most successful quarter in the company’s history! This isn't a speculative bet.
• Big Customers: Adobe, Brex, CapitalOne, Mercury, Notion, Thomson Reuters, and Whoop, to name a few.
• Small team, big ownership: You will own meaningful parts of the ML system that directly impact customers and revenue.
• Greenfield problems: Model diverse customer data, scale pipelines, and automate an industry that's barely been touched by software.