Model Performance Engineer

A company is looking for a Model Performance Engineer to optimize inference performance for AI models on their platform.

Key Responsibilities
- Optimize inference performance by minimizing latency and maximizing throughput
- Experiment continuously to achieve industry-leading performance for various models
- Impact the performance of applications serving millions of users globally

Required Qualifications
- Experience with state-of-the-art inference stacks such as PyTorch, TensorRT, or vLLM
- Open to candidates with any level of experience, including new graduates
- Ability to work in a fast-paced environment and adapt to new challenges
- Willingness to work in-person in New York City, or remotely if exceptionally qualified
- Visa sponsorship available for qualified candidates

May 29, 2025 - 07:40
