How microfluidics could solve AI’s overheating crisis in power-hungry data centers

One of the major reasons why artificial intelligence data centers are sucking up so much power is the need to cool processors that run very hot. But Microsoft Corp. is trying out a possible solution: sending fluid directly through tiny channels etched into the chips.
The technology is called microfluidics, and it’s being used in prototype systems in test conditions at the company, said Husam Alissa, who oversees Microsoft’s systems technology. The technique has been applied to server chips used for Office cloud apps and the graphics processing units that handle AI tasks, he said.
Because the cooling fluid is applied directly to chips, it can be a relatively high temperature — as much as 158 degrees Fahrenheit in some cases — while still being effective. The company demonstrated the technology under a microscope last week at its campus in Redmond, Wash., saying that testing so far has shown significant improvements over conventional approaches. Cooling in this way could also let Microsoft develop more powerful chips by stacking them on top of each other.
The technology is part of a broader effort to customize hardware in Microsoft’s data centers, which are expanding rapidly. In the last year, the company has added more than 2 gigawatts’ worth of capacity.
“When you are operating at that scale, efficiency is very important,” Rani Borkar, vice president for hardware systems and infrastructure at Microsoft’s Azure data center arm, said in an interview.
The new cooling technology could also let Microsoft deliberately run chips faster than their rated speeds to get better performance. Called overclocking, this can be useful for handling brief surges in demand. For example, Microsoft’s Teams conferencing software experiences spikes in use on the hour and the half hour because that’s when most meetings begin.
Instead of adding more chips, the company could simply overclock existing ones for a few minutes, said Jim Kleewein, a Microsoft technical fellow who works with the hardware team on meeting the needs of its Office software products.
The company is also more widely deploying hollow-core fiber for networking to increase data transmission speeds. This approach sends light through an air-filled core rather than the traditional solid glass core.
At the Microsoft lab, a piece of the material just a few inches long can be stretched into several kilometers of fiber, said Jamie Gaudette, who works on cloud network engineering. The software giant has teamed up with Corning Inc. and Heraeus Covantics to boost production of the material.
Microsoft is also aiming to develop hardware for memory chips, but hasn’t yet unveiled any plans, Borkar said.
“There are things happening on the memory side, but they are not to a point where we want to discuss it,” she said. “Memory is something I can’t ignore.”
A key industry focus is high-bandwidth memory, or HBM, a component used in AI computing that is made by companies such as Micron Technology Inc. Right now, Microsoft’s Maia AI chip — overseen by Borkar — relies on commercially available HBM. It’s a vital technology, she said.
“Today, HBM is the end-all and be-all,” Borkar said. “What’s going to happen next? We are looking at all of that.”