Since artificial intelligence is remaking industries at a rapid clip, it makes sense to ask: what are the core frameworks powering this sort of change, and what do they need to function?

At a fundamental level, these systems need compute. They need specific kinds of compute, and so they require specific kinds of hardware. We can talk a lot about the digital capabilities of these systems, and in the conferences and events where the experts gather, we do, but it’s also incumbent on the community to better understand the actual physical “brains” of AI models. Ours are made of biological “grey matter”; theirs are made of silicon.

GeeksforGeeks explains the role of AI hardware this way: “The parallel processing in AI processors enables LLMs to speed up neural network operations, making applications such as chatbots and generative AI more efficient. This training would take much longer and cost orders of magnitude more, if done with general-purpose processors, like CPUs or even earlier AI chips. So, staying on the frontier of research and deployment is impossible.”

Beyond that, all sorts of advanced logic gating, compute colocation and sophisticated engineering allow developers to put more and more into LLM systems.

The Market Responds

It’s not a coincidence that Nvidia has eclipsed both Apple and Microsoft within the last year to become the most valuable company in America’s tech market. The company made its bones at the forefront of the AI hardware revolution. More recently, Qualcomm announced its plan to rival the front-runner in AI chip production, and the company’s stock went up 11%.

So the market certainly acknowledges how central AI-specific hardware is for tomorrow’s world. So does the geopolitical community, as Taiwan Semiconductor Manufacturing Company has held the cards for decades, doling out fab capacity to companies around the world.

Experts Weigh In

I listened to a recent panel where Forbes’ Randall Lane interviewed four experienced innovators on chip design, bottlenecks, opportunities and future forecasting when it comes to AI chips. Lane asked a very useful question, which is this: given that blockchain technologies also required vast amounts of compute, how are innovators using the lessons of the blockchain era to develop AI infrastructure? (To answer this question, you have to have been in business for a while.)

Rajiv Khemani of Auradine explained his firm’s core designs this way: “Power is a function of voltage squared, among a few other things, (like) capacitance and switching and so forth,” he said. “We run these chips at near threshold voltage. So your typical AI chips run at something like .75 volts. We actually run (them) below .3 volts, so we get a v-squared benefit. Now … it's very hard to do, because you have yield issues, reliability issues, and your frequency drops quite dramatically. The only way to do it is through custom circuit design.”

That, he said, is part of the focus. “We build our own libraries, arithmetic libraries for it,” Khemani added. “We work very closely with foundries, and we are applying that to AI.”
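To make the v-squared arithmetic concrete, here is a minimal sketch of how dynamic switching power scales with supply voltage. Only the 0.75-volt and 0.3-volt operating points come from Khemani’s remarks; the capacitance and frequency values are illustrative assumptions, not Auradine’s figures.

```python
# Minimal sketch: dynamic CMOS switching power scales roughly as P = C * V^2 * f.
# Only the two voltages below come from the quote; C and f are assumed placeholders.

def dynamic_power(capacitance_f: float, voltage_v: float, frequency_hz: float) -> float:
    """Approximate dynamic switching power in watts: P ~ C * V^2 * f."""
    return capacitance_f * voltage_v**2 * frequency_hz

C = 1e-9          # switched capacitance in farads (assumed)
f_nominal = 1e9   # 1 GHz at a nominal 0.75 V (assumed)
f_ntv = 3e8       # frequency drops sharply near threshold, as Khemani notes (assumed)

p_nominal = dynamic_power(C, 0.75, f_nominal)
p_ntv = dynamic_power(C, 0.30, f_ntv)

# The V^2 term alone gives (0.75 / 0.30)^2 = 6.25x lower switching power.
print(f"V^2 factor alone: {(0.75 / 0.30) ** 2:.2f}x")
print(f"Power at 0.75 V: {p_nominal:.3f} W; at 0.30 V: {p_ntv:.4f} W")
```

The trade-off Khemani describes shows up directly in the numbers: the voltage savings are quadratic, but the frequency penalty is real, which is why near-threshold operation demands the custom circuit design and libraries he mentions.

Faraj Aalaei of Cognichip framed the problem of semiconductor advancement simply.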
“Chips take too damn long to design, and they're too damn expensive,” he said, noting that there used to be something like 200 startups working on chips, and that’s now in the single digits each year.

So what’s to be done? “What we set out to do here is to collapse that time, and democratize it,” Aalaei said. “So we are developing a … foundation model ourselves, and crossing that with software development capabilities, kind of like what people talk about in the software industry.”

And what people talk about in the software industry, he added, is vibecoding.

“You can't vibecode chips,” he noted. “So what we need to do is to create something very similar, but for chip designers. So it's, if you will, an intersection of being something that Anthropic would build, but only for the vertical semiconductors, and something like Cursor would build, but for chip designers. And we're hoping to be able to bring value in that way.”

For panelist Seshu Madhavapeddy of Frore Systems, it’s all about heat.

“If you don't have a very efficient means of removing heat, then you're not going to be able to actually run your data centers at that level of performance,” he said. “This problem is not just in the data centers, it's also at the edge. If you take any consumer device, they have a certain amount of performance, and very often that's dictated by how much heat you can actually remove from the device.”

That, he revealed, led to research within his company about solving the heat problem.

“What we discovered is that there's a great opportunity to bring in new thinking, into both cooling edge devices, as well as data center devices,” Madhavapeddy said. “And all of us in our company come from a silicon background, semiconductor background. So we've actually seen how the semiconductor industry has completely changed the landscape, in terms of the speed at which you innovate, and also the manufacturing processes and techniques that you have.”

He summarized his narrative this way: “Our company is bringing new thinking and creating incredibly new disruptive technologies, but improving cooling both at the edge, and in the data center.”

As for Dinesh Maheshwari, an advisor to various companies and former CTO of Groq, he suggested that there is a lot of change to be made. “The compute paradigm is archaic,” he said, suggesting that stakeholders need to minimize overhead and improve memory bandwidth.

The Landscape

In going over the context of all of these recommendations, the panel talked about go-to-market, buy-in, demand and supply, and the dominance of Nvidia in this space.

“When you look at the chip industry, with Nvidia, right when Nvidia came in as the startup, there were already people doing graphic chips, Intel was still the king of the jungle, right?” said Aalaei. “And they found their way of creating that discontinuity. If you look back in the last, I would say, since 2015 that I've been watching Nvidia, this did not happen by accident. They had a thought process about what a GPU could do, and what the future held. And they spent an enormous amount of money as a public company doing this, and they had the courage to do it and stick to it until it's gotten to this point. But I think the dominance of Nvidia in training is going to be hard to beat, so I don't recommend any startups out there trying to unseat them.”

Into the Future

At the end, Lane asked each panelist to talk about the continuity of Moore’s law, first articulated by Gordon Moore in 1965 and revised a decade later, which holds that the number of transistors on a chip doubles roughly every two years.
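For context, here is a back-of-the-envelope sketch of what that doubling cadence implies, and of the power wall the panelists discuss below. The time spans are illustrative, not figures from the panel.

```python
# Illustrative arithmetic only: Moore's observation, as revised in 1975,
# projects transistor counts doubling roughly every two years.

def transistor_multiple(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor in transistor count over a given span of years."""
    return 2.0 ** (years / doubling_period_years)

print(f"Over 10 years: {transistor_multiple(10):.0f}x the transistors")  # 32x
print(f"Over 20 years: {transistor_multiple(20):.0f}x the transistors")  # 1024x

# Madhavapeddy's point below: density can keep doubling, but without the
# voltage scaling that used to accompany each node shrink (Dennard scaling),
# twice the transistors in the same area can mean roughly twice the
# switching power to dissipate.
```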
“I've been in the business for over 40-some odd years, which, for the rest of you, means I started when I was in kindergarten,” Aalaei said, noting the enduring nature of change. “I’ve always thought, whatever I was in, ‘this is the end of it.’ And then, of course, there's innovation, right, ongoing.”

The future, he suggested, is with bespoke chips that are more agile and versatile, with fewer overhead requirements. “I think people need to do chips faster, and therefore when you can make (them) the most bespoke, lower power and more directly to the point, I think that's where it's going,” he said.

“The slowing in Moore’s law will be compensated with, say, the packaging I mentioned, also with other kinds of computing units, optical, for example, playing into that as well,” Maheshwari said. “Moore’s law was a brute force way of pushing things for a fixed architecture paradigm, or a fixed packaging. It is forcing us to look at the other dimension, and there's enough room to innovate.”

“Moore's law has not slowed down, because you're still able to pack twice the number of transistors in the same die area as you could every two years,” Madhavapeddy said. “You can keep going. But what's changed is you're not getting power reduction. So … you can pack (in) twice the number of transistors, but now they consume twice the amount of power. That was not the case previously.”

He covered some ideas on solving this.

“How do you make sure that you can continue to deliver higher performance?” Madhavapeddy asked. “By innovating in cooling solutions, so that all that extra heat generated by the extra power that's been consumed will be dissipated effectively, without compromising on that density.”

“We’ve done chiplet interface standardization to expand … and build bigger and bigger chips to address some of the challenges that we've seen,” Khemani added. “Cooling is another aspect to it, and as people know, we've moved from general purpose computing to accelerated computing, and then acceleration for specific applications is the way to go, and software will have to adjust to take advantage of all of this.”

All of this shows us what people are thinking behind the scenes as we head toward another ground-breaking year for hardware and everything else. Stay tuned.