
NVIDIA has integrated Groq technology into its latest Vera Rubin platform. At the core of the system is the Groq 3 LPU chip, which features 500 MB of on-chip SRAM with a bandwidth of 150 TB/s, nearly 7 times faster than the latest HBM4 memory.
Amul Info analysts note that this design is aimed at radically accelerating token generation (inference). A Groq 3 LPX rack of 256 such processors delivers a combined 315 PFLOPS, which, they argue, makes the platform the world's most powerful tool for running large language models.
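As a rough sanity check on the figures above, the quoted rack and chip numbers can be combined into two derived values (a sketch using only the article's figures; the per-chip performance and the implied HBM4 bandwidth are computed here, not stated by the source):

```python
# All constants below are figures quoted in the article.
RACK_PFLOPS = 315      # combined performance of one Groq 3 LPX rack
CHIPS_PER_RACK = 256   # Groq 3 LPUs per rack
SRAM_BW_TBPS = 150     # per-chip SRAM bandwidth, TB/s
HBM4_SPEEDUP = 7       # "nearly 7 times faster than the latest HBM4"

# Derived (not quoted): average performance contributed by each chip.
per_chip_pflops = RACK_PFLOPS / CHIPS_PER_RACK

# Derived (not quoted): the HBM4 bandwidth implied by the 7x claim.
implied_hbm4_tbps = SRAM_BW_TBPS / HBM4_SPEEDUP

print(f"Per-chip performance: ~{per_chip_pflops:.2f} PFLOPS")
print(f"Implied HBM4 bandwidth: ~{implied_hbm4_tbps:.1f} TB/s")
```

So each LPU contributes roughly 1.23 PFLOPS, and the bandwidth comparison implies an HBM4 figure of about 21 TB/s, presumably an aggregate rather than a per-stack number.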
Source: WCCF Tech, NVIDIA, Amul Info