August 27
In-office - Bay Area
• Optimize the performance of deep learning training workloads on NVIDIA GPU systems at scale.
• Optimize the latency of model inference and model pre- and post-processing on AV onboard systems.
• PhD in CS/CE/EE, or equivalent industry experience.
• 4+ years of experience with C++ and Python programming.
• Knowledge of computer architecture and operating systems.
• Background in parallel programming, preferably on GPUs.
• Ability to analyze and profile workloads and do detective work to reduce system latency.
• A track record of delivering CPU or GPU latency optimizations for real-world engineering systems.
• A fun, supportive, and engaging environment.
• Infrastructure and computational resources to support your work.
• The opportunity to work on cutting-edge technologies with top talent in the field.
• The opportunity to make a significant impact on the transportation revolution by advancing autonomous driving.
• Competitive compensation package.
• Snacks, lunches, dinners, and fun activities.
Apply Now