Introduction to Streaming Multiprocessors

Briefly introduce streaming multiprocessors and explain what happens when a kernel is launched, from the hardware's point of view.
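
As a rough illustration of the hardware's view of a launch, the sketch below models how a grid of thread blocks might be spread over SMs. It is a toy model under stated assumptions, not a description of the real block scheduler: the function names, the round-robin placement, and the example parameters are hypothetical. Only the 32-thread warp size and the idea that a grid larger than the number of resident block slots runs in several "waves" reflect actual NVIDIA hardware behavior.

```python
import math


def warps_per_block(threads_per_block, warp_size=32):
    """Number of warps an SM must track for one thread block.

    A partially filled last warp still occupies a full warp slot.
    """
    return math.ceil(threads_per_block / warp_size)


def schedule_blocks(num_blocks, num_sms, max_blocks_per_sm):
    """Toy model of distributing thread blocks over SMs.

    Assumption: blocks are handed out round-robin. Real hardware assigns
    a block to whichever SM has free resources, so the actual order differs.
    Returns (number of waves, mapping from SM index to its block indices).
    """
    # Blocks that can be resident on the whole GPU at the same time.
    slots_per_wave = num_sms * max_blocks_per_sm
    num_waves = math.ceil(num_blocks / slots_per_wave)

    placement = {sm: [] for sm in range(num_sms)}
    for block in range(num_blocks):
        placement[block % num_sms].append(block)
    return num_waves, placement


if __name__ == "__main__":
    # Hypothetical GPU: 4 SMs, at most 2 resident blocks per SM.
    waves, placement = schedule_blocks(num_blocks=10, num_sms=4, max_blocks_per_sm=2)
    print("waves:", waves)
    for sm, blocks in placement.items():
        print(f"SM {sm}: blocks {blocks}")
    print("warps in a 256-thread block:", warps_per_block(256))
```

With these example numbers, ten blocks do not fit into the eight resident slots, so the launch needs a second, partially filled wave.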

Gluon: Linear Layouts and Improvements over Triton

Gluon is a tool recently released by OpenAI, built on Triton to fit Blackwell's warp-based execution model.