Graphcore - Principal Reliability Scientist

Austin, Texas, United States; Hsinchu City, Taiwan; US - Milpitas; Taipei, Taiwan
Posted 1mo ago
In Office · Principal · EMEA · Semiconductors · Manufacturing · Site Reliability Engineer · Data Analysis · Liquid

Requirements

• Strong background in reliability engineering or reliability science within semiconductor, hardware or complex systems environments
• Experience of physics-of-failure approaches in high-performance computing, AI hardware or related domains
• Experience with reliability modelling, experimental design and statistical data analysis
• Proven ability to work with and interpret experimental reliability data to drive engineering decisions
• Experience with key reliability metrics such as MTBF, MTTR, RAS and failure rate analysis
• Ability to operate effectively in complex, cross-functional environments with multiple stakeholders
• Strong problem-solving skills with the ability to lead technically challenging investigations independently
• Excellent communication skills, with the ability to influence design and operations teams using data-driven insights

Preferred Qualification:

• Experience with liquid cooling systems, fluid dynamics or thermally complex hardware environments
• Knowledge of soft error mechanisms and SER modeling
• Experience contributing to reliability strategy, processes or tooling improvements

Responsibilities

• Define and refine reliability requirements across silicon, board and system levels, working in partnership with research and design teams
• Apply advanced reliability methodologies to highly innovative systems, including challenges associated with liquid-cooled architectures and fluid dynamics
• Design and execute experiments to generate high-quality reliability and performance data, ensuring statistical rigour and relevance
• Analyse experimental, field and manufacturing data to quantify reliability metrics such as MTBF, MTTR, RAS characteristics and soft error rates (SER)
• Use data-driven insights to inform product design trade-offs, reliability targets and spares provisioning strategies
• Collaborate with chip, board and system design teams to influence architecture and component selection based on reliability considerations
• Support development of system-level reliability models incorporating thermal, mechanical and fluid behaviour
• Lead complex root cause investigations into reliability issues, driving corrective and preventative actions across teams
• Contribute to the evolution of reliability tools, processes and best practices within the organisation
• Communicate complex reliability concepts, risks and recommendations clearly to a wide range of stakeholders
