Cluster Manager
A company is looking for a Member of Technical Staff (Cluster Manager).
Key Responsibilities
Ensure reliability, performance, and scalability of compute infrastructure
Design, build, and maintain tools for system operations
Monitor system performance and resolve issues as they arise
Required Qualifications
Experience managing large-scale distributed systems
Strong scripting and automation skills (e.g., Python, Bash)
Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes)
Understanding of cloud computing platforms (e.g., AWS, GCP, Azure)
Experience with HPC/GPU cluster management tools is strongly desired