Learning accuracy and conceptual understanding assessment
Benchmarking and comparative model analysis
Red-teaming for misleading or harmful educational content
Ongoing model monitoring and regression testing
What We Look For
Deep domain expertise in education, instructional design, or learning sciences
Strong judgment and ability to apply criteria consistently
Comfort working with structured evaluation workflows
Ability to explain reasoning clearly, especially in instructional or learner-facing scenarios
Reliability, professionalism, and respect for quality standards
Engagement Model
Contract-based, flexible participation
Project-based work with clear expectations and timelines
Opportunities for recurring work based on performance and demand
Compensation communicated upfront per project or task type
Native or professional fluency in one or more supported languages is required
Supported languages span 30+ global languages
Language-specific nuance is assessed through screening and task-based evaluation, not separate job descriptions
English fluency is required for guidelines, feedback, and collaboration
AI is changing how the world communicates — and LILT is leading that transformation.
LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join our global community of people who thrive on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to Enterprises, Governments, and AI Developers around the world.
Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community—all through a streamlined application process tailored to your expertise.
Information collected and processed as part of your application process, including any job applications you choose to submit, is subject to LILT's Privacy Policy at https://lilt.com/legal/privacy.
At LILT, we are committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis. These tools are designed to support human decision-making and help us identify qualified candidates efficiently and objectively. All final hiring decisions are made by people. If you have any concerns, require accommodations, or would like to opt out of the use of AI in our hiring process, please let us know at [email protected].
Responsibilities
Evaluate AI outputs related to educational, instructional, and learning content
Perform structured scoring, comparison, classification, and judgment tasks
Assess pedagogical accuracy, clarity, appropriateness, and learning effectiveness
Identify hallucinations, misleading explanations, factual errors, or unsafe educational guidance
Ideal Background
Educators, instructional designers, curriculum developers, or learning professionals
Experience with teaching, curriculum design, assessment, or educational technology
Strong attention to detail and comfort working with structured evaluation criteria
Track B: EdTech AI Evaluator (Senior Track)
Senior-track evaluators provide higher-level domain oversight and help shape how evaluation is performed.
Validate and refine evaluation rubrics and edge-case handling
Perform adjudication where raters disagree
Conduct error analysis and qualitative reviews of model behavior
Partner with LILT research, product, and customer teams on evaluation design
Support red-teaming, educational quality review, and model readiness assessments