March 12
🏢 In-office - San Francisco
• Iterate quickly with product teams to ship the latest optimizations to the Anyscale platform, Anyscale Endpoints, and various open source offerings.
• Work closely with research teams on LLM engines such as vLLM and TensorRT-LLM.
• Follow the latest state of the art in the open source and research communities, implementing and extending best practices.
• Prior experience working on GPUs / CUDA
• Solid understanding of operating systems and/or networking fundamentals and experience in such optimizations
• Familiarity with deep learning and deep learning frameworks (e.g. PyTorch)
• ML Systems knowledge
• Experience training deep learning models
• Contributions to deep learning frameworks (PyTorch, TensorFlow)
• Contributions to deep learning compilers (Triton, TVM, MLIR)
• Experience using Ray
• Stock Options
• Healthcare plans, with premiums covered by Anyscale at 99%
• 401k Retirement Plan
• Wellness stipend
• Education stipend
• Paid Parental Leave
• Flexible Time Off
• Commute reimbursement
• 100% of in-office meals covered