Director of Systems Reliability
A company is looking for a Director of Systems Reliability & Field Resilience.
Key Responsibilities
Identify root causes of service issues across system components and coordinate their resolution
Build and lead a multidisciplinary global systems reliability team to investigate failures and drive improvements
Manage the on-call process, define SLAs, and establish a robust incident management system
Required Qualifications
8+ years of experience in a technical engineering or operations role, with at least 3 years in a leadership position
Deep experience with complex distributed systems, including debugging, triage, and root cause analysis
Strong understanding of hardware/software integration, especially in cloud-connected devices
Proven success in leading incident response or SRE functions and managing on-call teams
Strong data and dashboarding skills to translate operational data into actionable insights