"Finally, we get better cudaMallocAsync handling," wrote a prominent contributor on a popular LLM repository. "The memory management in 12.5 was getting tight with these new 400-billion-parameter models. 12.6 looks like it buys us another year of scaling."
The toolkit introduces a simplified set of Range Profiling APIs that shield users from low-level complexities.
CUDA 12.6 is available for download starting today for Windows 11 and major Linux distributions. NVIDIA has confirmed that the update is fully backward compatible with the RTX 30 and 40 series consumer cards, meaning gaming enthusiasts and independent AI researchers will have immediate access to the new memory optimizations.
NVIDIA has released the latest update to its CUDA toolkit, version 12.6, bringing a host of new features, improvements, and bug fixes. Here's a rundown of the key changes:
Version 12.6 was the first to use NVIDIA's open drivers by default, which improved transparency but required specific considerations for older Pascal-based cards.
However, some Linux kernel maintainers raised eyebrows at the new driver architecture, noting that the 12.6 kernel module changes are significant. "They are pushing a lot of scheduling logic into user space," noted one kernel developer on a mailing list. "It improves performance, but it’s going to be a headache for distro maintainers."