GenAI Staff Machine Learning Engineer, Performance Optimization at Databricks

Explore and analyze performance bottlenecks in ML training and inference, design and implement libraries, and build tools for performance profiling.

You will:

Explore and analyze performance bottlenecks in ML training and inference
Design, implement, and benchmark libraries and methods to overcome these bottlenecks
Build tools for performance profiling, analysis, and estimation for ML training and inference
Balance the tradeoff between performance and usability for our customers
Support our community through documentation, talks, tutorials, and collaborations
Collaborate with external researchers and leading AI companies on various efficiency methods

We look for:

Hands-on experience with the internals of deep learning frameworks (e.g., PyTorch, TensorFlow) and deep learning models
Experience with high-performance linear algebra libraries such as cuDNN, CUTLASS, Eigen, and MKL
General experience with the training and deployment of ML models
Experience with compiler technologies relevant to machine learning
Experience with distributed systems development or distributed ML workloads
Hands-on experience writing CUDA code and knowledge of GPU internals (Preferred)
Publications in top-tier ML or systems conferences such as MLSys, ICML, ICLR, KDD, NeurIPS (Preferred)