Paper "Neptune" ♆ accepted at PLDI'26!
Our paper “Neptune: Advanced ML Operator Fusion for Locality and Parallelism on GPUs” has been accepted at PLDI 2026.
Neptune is a new AI compiler framework that introduces a new approach to operator fusion and combines loop and tile optimizations for higher performance. Give Neptune a mathematical description of attention, and it rediscovers FlashAttention and generates high-performance implementations of it, all automatically.
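To give a feel for the gap between a mathematical description of attention and a FlashAttention-style implementation, here is a minimal sketch in plain Python. It is not Neptune's API, input format, or generated code; the function names `attention_reference` and `attention_tiled` and the `tile` parameter are made up for illustration. The first function follows the textbook formula softmax(QK^T / sqrt(d)) V directly; the second streams over keys and values in tiles while keeping a running maximum and running sum (the online-softmax idea behind FlashAttention), so a full row of scores is never held at once.

```python
import math

def attention_reference(Q, K, V):
    """Textbook attention: softmax(Q K^T / sqrt(d)) V, one full score row per query."""
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)                          # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / total
                    for j in range(len(V[0]))])
    return out

def attention_tiled(Q, K, V, tile=2):
    """Same result, computed tile by tile over K/V with an online softmax."""
    d = len(Q[0])
    dv = len(V[0])
    out = []
    for q in Q:
        m = -math.inf        # running max of the scores seen so far
        l = 0.0              # running softmax denominator
        acc = [0.0] * dv     # running unnormalized output row
        for start in range(0, len(K), tile):
            k_tile = K[start:start + tile]
            v_tile = V[start:start + tile]
            scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in k_tile]
            m_new = max(m, max(scores))
            scale = math.exp(m - m_new)          # rescales earlier partial results (0.0 on the first tile)
            p = [math.exp(s - m_new) for s in scores]
            l = l * scale + sum(p)
            acc = [a * scale + sum(pi * v[j] for pi, v in zip(p, v_tile))
                   for j, a in enumerate(acc)]
            m = m_new
        out.append([a / l for a in acc])
    return out

# Small illustrative inputs: 2 queries, 5 keys/values (so the last tile is ragged).
Q = [[1.0, 0.0], [0.5, -1.0]]
K = [[0.2, 0.1], [1.0, -0.5], [-0.3, 0.8], [0.0, 0.0], [2.0, 1.0]]
V = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.5], [3.0, -1.0, 0.0], [0.5, 0.5, 0.5], [-2.0, 4.0, 1.0]]

ref = attention_reference(Q, K, V)
tiled = attention_tiled(Q, K, V, tile=2)
max_diff = max(abs(a - b) for r, t in zip(ref, tiled) for a, b in zip(r, t))
print(f"max difference between the two versions: {max_diff:.2e}")
```

The two functions agree up to floating-point rounding. The tiled version is the kind of restructuring that is normally written by hand for GPUs.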
See our GitHub repository, and try it out!