Cluster Manager

A company is looking for a Member of Technical Staff (Cluster Manager).

Key Responsibilities
- Ensure reliability, performance, and scalability of compute infrastructure
- Design, build, and maintain tools for system operations
- Monitor system performance and implement solutions for issues

Required Qualifications
- Experience managing large-scale distributed systems
- Strong scripting and automation skills (e.g., Python, Bash)
- Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes)
- Understanding of cloud computing platforms (e.g., AWS, GCP, Azure)
- Experience with HPC/GPU cluster management tools is strongly desired

Mar 6, 2025 - 00:38