telus-digital - AI Research Engineer

Remote - Brazil - 3w ago

Remote, Mid, LATAM, Artificial Intelligence, Research Engineer, AI Engineer, JAX, Python, Instructional Design, DVC, JSON

Requirements

• Educational Background: PhD or Master's degree in Computer Science, Computational Linguistics, Machine Learning, or a related quantitative field.
• Industry Experience: 3+ years of hands-on experience in applied NLP research or ML engineering, ideally within a research lab or a data-centric AI environment.
• Ambiguity Tolerance: The ability to operationalize subjective concepts (e.g., "Creativity," "Safety," "Truthfulness") into concrete, annotatable guidelines.
• Research Communication: Proven track record of translating complex technical requirements into clear instructions for non-technical stakeholders (e.g., explaining "reasoning traces" to domain expert annotators).
• Deep theoretical and practical understanding of Transformer architectures (Decoder-only GPT styles, Encoder-Decoder T5 styles), Attention mechanisms, Positional Embeddings, and Tokenization strategies (BPE, SentencePiece).
• Extensive experience with the post-training stack: Supervised Fine-Tuning (SFT) and Preference Alignment techniques including RLHF (PPO) and DPO (Direct Preference Optimization).
• Experience with noisy label handling, crowd-sourcing aggregation models (e.g., Dawid-Skene), active learning sampling strategies, and identifying semantic bias in large-scale datasets.
• You understand Function/Tool Calling, ReAct frameworks, and how to evaluate "trajectory" quality (reasoning steps) rather than just final output accuracy.
• You have experience designing LLM-as-a-Judge pipelines, pairwise comparison (Side-by-Side) systems, and reference-free metrics to measure Faithfulness, Coherence, and Safety.
• Proven ability to design "Data Evolution" pipelines (e.g., Evol-Instruct, Self-Instruct). You understand techniques for Knowledge Distillation and how to mitigate "Model Collapse" when training on synthetic data.
• Familiarity with adversarial testing. You can design prompts to stress-test safety filters (jailbreaking) and understand the trade-offs between helpfulness and harmlessness (False Refusal rates).
• Expert-level fluency in Python and deep learning libraries (PyTorch, TensorFlow, JAX).
• Ability to write robust scripts for processing massive text corpora (JSONL manipulation, RegEx, deduplication at scale). Experience with data versioning tools (DVC, LakeFS) is a plus.
• Experience running controlled ablation studies.

Responsibilities

• Experiment Design: Design and implement complex experiments to test new hypotheses, including defining evaluation protocols and baseline comparisons.
• Independent Research: Independently manage a research sub-task from start to finish, analyzing and interpreting results to draw clear conclusions.
• Code Contribution: Contribute high-quality, reusable code to the team's "reference implementation" repository.
• Benchmark Contribution: Actively contribute to improving the internal benchmark by identifying data gaps, proposing new evaluation metrics, or adding new models for comparison.