We are a small, friendly and multi-disciplinary AI studio creating a personal AI for everyone.
July 24
🏢 In-office - Bay Area
• As part of Inflection’s commitment to deploying high-performance models for enterprise applications, our inference team ensures that these models run efficiently and effectively in real-world scenarios.
• Research engineers in this role focus on optimizing model inference processes, reducing latency, and improving throughput without compromising model performance, ensuring robust deployment in enterprise environments.
• Have experience deploying and optimizing LLMs for inference, in both cloud and on-prem environments.
• Are adept at using tools and frameworks for model optimization and acceleration, such as ONNX, TensorRT, or TVM.
• Enjoy troubleshooting and solving complex problems related to model performance and scaling.
• Have a deep understanding of the trade-offs involved in model inference, including hardware constraints and real-time processing requirements.
• Are proficient with PyTorch and familiar with infrastructure management tools like Docker and Kubernetes for deploying inference pipelines.
• Unlimited paid time off
• Parental leave and flexibility for all parents and caregivers
• Generous medical, dental, and vision plans for US employees
• Compliance with country-specific benefits for non-US employees
• Visa sponsorship for new hires
• Avenues for personal growth, such as coaching, conference attendance, or specific trainings
Apply Now