Why high-bandwidth memory is a bottleneck for AI chips
High-bandwidth memory keeps powerful AI chips fed with data, and demand for it helped Boise-based Micron briefly top $1 trillion
For decades, Micron Technology made one of computing’s less glamorous essentials: memory chips. Then the artificial intelligence boom made that hardware one of the industry’s most sought-after components. Technology companies are now scrambling for high-bandwidth memory, or HBM; Micron specializes in it. This week, the Boise-based company became the first U.S. memory-chip company to briefly top $1 trillion in market value, a milestone that points to a larger shift in the AI supply chain.
AI systems depend on fast processors, but also on how quickly data can reach them and remain accessible. HBM is designed to do just that. “The reason HBMs are in such high demand is that they have pretty good storage, and they’re extremely, extremely fast,” says Keren Bergman, an electrical engineering professor at Columbia University.
HBM chips are built differently from the memory inside a laptop or phone. Instead of spreading memory chips across a board, HBM stacks layers of memory vertically and places them close to the processor. The arrangement gives AI accelerators a much wider path to the data they need. Micron says its HBM4 chips can reach more than 2.8 terabytes per second of bandwidth and are designed for Nvidia’s next-generation Vera Rubin GPUs.
Within a computer, memory chips and processors are like buildings connected with highways. There are only so many ways to widen those roads. Engineers can make memory faster only to a point, and they can add only so many physical connections between memory and processors, says Hadi Esmaeilzadeh, a computer architecture researcher at UC San Diego. The innovation of high-bandwidth memory is to stack the buildings 12 or even 16 layers high, with the layers connected by through-silicon vias, or TSVs, so that GPU processors and other accelerators can reach more memory in a given time. “Now there’s higher connectivity between the two, providing higher bandwidth. It’s like adding more lanes on highways,” Esmaeilzadeh says.
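The highway picture maps onto simple arithmetic: peak bandwidth is roughly the number of data lines multiplied by how fast each line signals. The Python sketch below is a back-of-envelope illustration, not a vendor specification. The interface widths and per-line signaling rates are assumed values chosen for illustration, and the function name is hypothetical.

```python
def peak_bandwidth_tb_per_s(data_lines: int, gbits_per_line: float) -> float:
    """Rough peak bandwidth in terabytes per second (1 TB = 1,000 GB)."""
    gigabits_per_second = data_lines * gbits_per_line
    gigabytes_per_second = gigabits_per_second / 8  # 8 bits per byte
    return gigabytes_per_second / 1000


# Assumed, illustrative numbers (not taken from the article):
# a narrow 64-line interface versus a wide 2,048-line stacked interface.
narrow = peak_bandwidth_tb_per_s(data_lines=64, gbits_per_line=8.0)
wide = peak_bandwidth_tb_per_s(data_lines=2048, gbits_per_line=11.0)

print(f"narrow interface: {narrow:.3f} TB/s")
print(f"wide interface:   {wide:.3f} TB/s")
```

Under these assumptions, most of the gain comes from the number of lanes rather than the speed of each one, and the wide case lands in the same range as the 2.8 terabytes per second Micron cites.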
The demand is coming from both sides of the AI business. Training large models requires huge clusters of accelerators. Running those models for users, whether in chatbots, coding tools, or future AI agents, also requires moving enormous amounts of data, again and again. And a GPU waiting for data is wasted hardware.
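A rough calculation shows why the waiting matters. In a simplified picture, each pass a model makes over its weights requires reading them from memory, so bandwidth sets a floor on how fast that pass can go no matter how quick the processor is. The Python sketch below assumes a hypothetical model of 70 billion parameters stored at 2 bytes each and treats Micron's 2.8 terabytes per second as the total available bandwidth. It ignores caching, batching and compute time, and a real accelerator may combine several memory stacks.

```python
def min_seconds_per_pass(params: float, bytes_per_param: float,
                         bandwidth_bytes_per_s: float) -> float:
    """Lower bound on the time to stream every model weight from memory once."""
    return params * bytes_per_param / bandwidth_bytes_per_s


# Hypothetical model size and precision (assumptions, not from the article).
PARAMS = 70e9
BYTES_PER_PARAM = 2
# 2.8 TB/s is the HBM4 figure Micron cites, treated here as one memory pool.
BANDWIDTH = 2.8e12

floor = min_seconds_per_pass(PARAMS, BYTES_PER_PARAM, BANDWIDTH)
print(f"at least {floor * 1000:.0f} ms per pass over the weights")
print(f"at most {1 / floor:.0f} passes per second")
```

With these assumed numbers the floor is about 50 milliseconds per pass. Doubling the bandwidth would halve it, while a faster processor alone would not change it.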
Bandwidth is only part of the problem. As large language models expand, capacity becomes a challenge too, even with top-of-the-line HBM chips. “Because of the growing size of AI models, the available memory capacity you have close by is one or two orders of magnitude less than what you need,” Bergman says. Memory has become one of the central limits on advanced AI hardware. (Micron declined Scientific American’s requests for comment.)

