1 post tagged #nvidia.
#nvidia
NVIDIA CUDA 13.3 (May 26) adds C++ tile programming: declarative tile abstractions replace manual shared-memory management, synchronization, and indexing. CompileIQ autotuning uses evolutionary algorithms to tune tile sizes and memory layout per kernel, with speedups of up to 15% on GEMM and attention kernels. It works on Hopper and all other supported architectures.
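To give a feel for the hand-written index bookkeeping that tile abstractions are meant to hide, here is a minimal sketch in plain CPU C++. It is not the CUDA 13.3 tile API and does not use CompileIQ; the names `tiled_matmul` and `naive_matmul` and the `tile` parameter are hypothetical, chosen only for illustration. The tile size is the kind of knob an autotuner would search over.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference: C = A * B for row-major n x n matrices, no blocking.
std::vector<float> naive_matmul(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    return c;
}

// Hypothetical blocked version: the same product, computed tile by tile.
// All tile origins, bounds, and flat offsets are managed by hand here.
std::vector<float> tiled_matmul(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t n, std::size_t tile) {
    assert(a.size() == n * n && b.size() == n * n && tile > 0);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i0 = 0; i0 < n; i0 += tile) {
        for (std::size_t j0 = 0; j0 < n; j0 += tile) {
            for (std::size_t k0 = 0; k0 < n; k0 += tile) {
                // Clamp tile bounds so sizes that do not divide n still work.
                const std::size_t i1 = std::min(i0 + tile, n);
                const std::size_t j1 = std::min(j0 + tile, n);
                const std::size_t k1 = std::min(k0 + tile, n);
                for (std::size_t i = i0; i < i1; ++i)
                    for (std::size_t k = k0; k < k1; ++k) {
                        const float aik = a[i * n + k];
                        for (std::size_t j = j0; j < j1; ++j)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
        }
    }
    return c;
}
```

On a GPU the same pattern additionally involves staging tiles in shared memory and synchronizing threads between stages, which is the boilerplate the post says the declarative tile abstractions replace.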