Member of Technical Staff - Machine Learning Researcher/Engineer, Performance Optimization
We are looking for an ML Research/Engineer (Performance Optimization) to design and optimize our Liquid Foundation Models (LFMs) on GPUs. This is a highly technical role that will expose you to state-of-the-art foundation model technology.
Desired Experience
- CUDA
- CUTLASS
- C/C++
- PyTorch/Triton
You're A Great Fit If
- You have experience writing high-performance, custom GPU kernels for training or inference.
- You understand low-level GPU profiling tools and how to use them to tune kernels.
- You have experience integrating GPU kernels into frameworks like PyTorch, bridging the gap between high-level models and low-level hardware performance.
- You have a solid understanding of memory hierarchy and have optimized for compute and memory-bound workloads.
- You have implemented fine-grained optimizations for specific target hardware, e.g., targeting tensor cores.
What You'll Actually Do
- Write high-performance GPU kernels for inference workloads.
- Optimize alternative architectures used at Liquid across all model parameter sizes.
- Implement the latest techniques and ideas from research into low-level GPU kernels.
- Continuously monitor, profile, and improve the performance of our inference pipelines.
What You'll Gain
- Hands-on experience with state-of-the-art technology at a leading AI company.
- Deeper expertise in machine learning systems and performance optimization.
- Opportunity to bridge the gap between theoretical improvements in research and real-world performance gains in practice.
- A collaborative, fast-paced environment where your work directly shapes our products and the next generation of LFMs.