LILT (Production)

LILT (Production) - Software Engineering & DevOps AI Rater/Evaluator

Remote - Argentina, Brazil, Mexico... · Posted 2 months ago
Remote · LATAM · Cloud Computing · Artificial Intelligence · AI Engineer


Requirements

Types of AI Evaluation Work

Depending on project demands, work may include:

• Software engineering and infrastructure content evaluation
• Code correctness and reasoning assessment
• DevOps, CI/CD, and cloud architecture evaluation
• Security and reliability-focused red-teaming
• Ongoing model monitoring and regression testing

What We Look For

• Deep domain expertise in software engineering, DevOps, or infrastructure
• Strong technical judgment and the ability to apply criteria consistently
• Comfort working with structured evaluation workflows
• Ability to explain reasoning clearly, especially in complex or high-risk technical scenarios
• Reliability, professionalism, and respect for quality standards

Engagement Model

• Contract-based, flexible participation
• Project-based work with clear expectations and timelines
• Opportunities for recurring work based on performance and demand
• Compensation communicated upfront per project or task type

Language Requirements

• Native or professional fluency in one or more supported languages is required
• Supported languages span 30+ global languages
• Language-specific nuance is assessed through screening and task-based evaluation, not separate job descriptions
• English fluency is required for guidelines, feedback, and collaboration

About LILT

AI is changing how the world communicates, and LILT is leading that transformation. LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join a global community that thrives on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to Enterprises, Governments, and AI Developers around the world.

Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community, all through a streamlined application process tailored to your expertise.

Information collected and processed as part of your application, including any job applications you choose to submit, is subject to LILT's Privacy Policy at https://lilt.com/legal/privacy.

At LILT, we are committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis. These tools are designed to support human decision-making and help us identify qualified candidates efficiently and objectively. All final hiring decisions are made by people. If you have any concerns, require accommodations, or would like to opt out of the use of AI in our hiring process, please let us know at [email protected].

Responsibilities

• Evaluate AI outputs related to software engineering, DevOps, and infrastructure topics
• Perform structured scoring, comparison, classification, and judgment tasks
• Assess technical correctness, completeness, security implications, and best-practice alignment
• Identify hallucinations, incorrect code, unsafe recommendations, or misleading system guidance

Ideal Background

• Software engineers, site reliability engineers, DevOps engineers, or platform engineers
• Experience with production systems, CI/CD pipelines, cloud infrastructure, or distributed systems
• Strong attention to detail and comfort working with structured evaluation criteria

Track B: Software Engineering & DevOps AI Evaluator (Senior Track)

Evaluators provide higher-level technical oversight and help shape how evaluation is performed.

• Validate and refine evaluation rubrics and edge-case handling
• Perform adjudication where raters disagree
• Conduct error analysis and qualitative reviews of model behavior
• Partner with LILT research, product, and customer teams on evaluation design
• Support red-teaming, security review, and model readiness assessments

Ideal Background

• Senior software engineers, DevOps leads, SREs, or technical architects
• Experience defining technical standards, reviewing complex edge cases, or advising on system design and reliability
• Ability to clearly explain nuanced technical reasoning and tradeoffs

Benefits

Your expertise helps ensure that AI systems:

• Provide accurate and safe technical guidance
• Align with real-world engineering and DevOps best practices
• Are reliable, secure, and trustworthy across languages
