Graphcore - Senior Machine Learning Engineer (Large Systems)
Responsibilities
• Implement the latest machine learning models and optimise them for performance and accuracy, scaling to 1000s of accelerators.
• Test and evaluate new internal software releases, provide feedback to software engineering teams, make necessary code fixes, and conduct code reviews.
• Benchmark models and key ML techniques to identify performance bottlenecks and improve model efficiency.
• Design and conduct experiments on novel AI methods, implement them and evaluate results.
• Collaborate with Research, Software, and Product teams to define, build, and test Graphcore’s next generation of AI hardware.
• Engage with the AI community and keep in touch with the latest developments in AI.
Candidate Profile
Essential:
• Bachelor's/Master's/PhD or equivalent experience in Machine Learning, Computer Science, Maths, Data Science, or a related field.
• Proficiency in deep learning frameworks like PyTorch/JAX.
• Strong Python or C++ software development skills.
• Expertise in deep learning from model training to optimisation and evaluation.
• Experience in distributed training or inference of ML models across 64+ accelerators.
• Capable of designing, executing and reporting from ML experiments.
• A deep understanding of performance bottlenecks and how to overcome them.
• Ability to move quickly in a dynamic environment.
• Enjoy cross-functional work collaborating with other teams.
• Strong communicator, able to explain complex technical concepts to different audiences.
Desirable:
• MLOps for Kubernetes-based clusters.
• Building production systems with large language models.
• Efficient computing based on low-precision arithmetic.
• Experience writing C++/Triton/CUDA kernels for performance optimisation of ML models.
• Familiarity with HPC systems and networking, including InfiniBand, NVLink and RoCE technologies.
• Have contributed to open-source projects or published research papers in relevant fields.
• Knowledge of cloud computing platforms.
• Keen to present, publish and deliver talks in the AI community.
Benefits
• In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar!
• We welcome people of different backgrounds and experiences; we’re committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.
• Applicants for this position must hold the right to work in the UK. Unfortunately, we are unable at this time to provide visa sponsorship or support for visa applications.