Model Performance Engineer
A company is looking for a Model Performance Engineer to optimize inference performance for AI models on its platform.
Key Responsibilities
Optimize inference performance by minimizing latency and maximizing throughput
Experiment continuously to achieve industry-leading performance for various models
Impact the performance of applications serving millions of users globally
Required Qualifications
Experience with state-of-the-art inference stacks such as PyTorch, TensorRT, or vLLM
Open to candidates with any level of experience, including new graduates
Ability to work in a fast-paced environment and adapt to new challenges
Willingness to work in person in New York City; remote work is possible for exceptionally qualified candidates
Visa sponsorship available for qualified candidates