Exascale computing infrastructure insights on theCUBE

🕒︎ 2025-11-06

Copyright SiliconANGLE News

Exascale computing systems are no longer theoretical milestones on research roadmaps. They’re operational infrastructure handling workloads that stress every layer of the stack, from interconnects to observability tooling. The challenge isn’t just building faster processors or denser storage; it’s ensuring that the infrastructure meant to secure and monitor these exascale platforms doesn’t become the bottleneck.

As SC25 brings together researchers and industry leaders, the gap between raw computational capability and usable scientific output remains a critical challenge. Building the fastest networks, massive storage tiers and exascale computing infrastructure is only part of the equation. Organizations must also create workflows and platforms that convert that capability into operational discoveries, according to theCUBE Research’s Paul Nashawaty.

“When we talk about platform engineering … we’re really trying to understand how organizations are fostering this culture of developer innovation to accelerate software delivery but also stay competitive in today’s fast-paced market,” he said. “At SC25, research and industry must not only build the fastest networks, such as SCINet, storage tiers and exascale compute, but also build workflows and platforms that let users convert that capability into real discoveries. That means focusing on platform engineering, agile workflows and enabling innovation.”

Join theCUBE, SiliconANGLE Media’s livestreaming studio, from Nov. 18–20, for exclusive coverage of SC25. TheCUBE’s interviews will explore how researchers and technology leaders are turning exascale computing breakthroughs into operational reality, from optimizing infrastructure for performance and efficiency to scaling AI-driven discovery across scientific domains. (* Disclosure below.)

How storage innovation addresses exascale computing demands

Exascale computing infrastructure demands storage systems capable of supporting massive GPU clusters while maintaining the throughput and low latency required for AI training and simulation workloads. DataDirect Networks Inc. anchors high-performance storage for these environments with its EXAScaler platform, which recently completed Nvidia Corp.’s certification program for scalable data storage. The platform supports graphics processing unit deployments exceeding 100,000 units, with proven throughput characteristics that address the time-to-first-token performance critical for large-scale AI operations, according to Balaji Venkateshwaran, vice president of AI product management at DataDirect Networks, who spoke with theCUBE.

A growing number of enterprises are also turning to WekaIO Inc., whose software-defined Weka Data Platform is designed for AI-native performance at scale across on-premises, cloud and hybrid environments.

“We’re future-proofing the environment by looking at where customers are actually going to be in the next two, three, four years and making sure that Weka can actually accommodate that,” said Shimon Ben-David, chief technology officer of Weka, in an interview with theCUBE.

Built around a unified, low-latency file system optimized for GPUs, Weka’s architecture supports both structured and unstructured data and addresses one of the central bottlenecks in exascale AI workflows: data movement and orchestration. Ben-David noted that Weka has developed a reference architecture to help customers navigate “a lot of moving parts, a lot of frameworks, orchestration, [and] data challenges, whether you are scaling or not,” as AI workloads evolve.

Storage density and power efficiency challenges will likely feature prominently at SC25 as exascale systems push infrastructure to its limits.
Solidigm, a trademark of SK Hynix NAND Products Solutions Corp., addresses these constraints with its 122TB solid-state drives through high-density NAND technology designed to deliver higher endurance, power efficiency and density at the drive level, according to Ace Stryker, director of market development at Solidigm. The company’s SSDs also deliver measurable performance gains in retrieval-augmented generation pipelines, achieving a 70% increase in queries per second compared to traditional memory-based solutions while reducing memory footprint by 50%, added Tahmid Rahman, director of product and partner marketing at Solidigm, during an interview with theCUBE.

“With AI workloads surging and network traffic spiking by over 200% in just a year, traditional security and observability tools are failing to keep pace,” Nashawaty said. “As exascale platforms and ultra-high-speed networks … become operational, the massive I/O and interconnect demands mean that ‘classic’ monitoring tools won’t suffice. Converged high-performance computing/AI infrastructure requires fresh approaches to visibility, telemetry and security.”

System integration challenges will be central to SC25 discussions as organizations work to translate storage density into operational performance. Dell’s PowerScale storage platform recently added support for Nvidia’s GB200 and GB300 NVL72 rack-scale configurations, with the PowerScale F710 achieving Nvidia Cloud Partner certification for high-performance storage. The F710 delivers GPU-scale performance with up to 72% lower power usage and five times greater space efficiency than competitors, addressing the physical constraints of exascale computing deployments, according to Dell.

“The walls between HPC, cloud and AI are collapsing,” said theCUBE Research’s Chief Analyst Dave Vellante.
“What’s emerging is an accelerated infrastructure stack that’s engineered for manufacturing intelligence.”

Platform engineering converts capability into discovery

Disaggregated storage architectures will be a key discussion point at SC25 as organizations seek to support massive GPU clusters. Vast Data Inc.’s shared-everything architecture separates storage media from processors, enabling the platform to support hundreds of thousands of GPUs in a single cluster while handling exabyte-scale deployments, according to Phil Manez, GTM execution lead at Vast Data. The company’s AI Operating System allows users to capture both structured and unstructured data and contextualize it for AI workloads.

“We founded our whole company on this disaggregated, shared-everything architecture,” Manez told theCUBE. “We have to be able to support hundreds of thousands of GPUs in a single cluster, and we are literally seeing exabyte-scale deployments in single clusters now as we’re bringing together all of this data.”

Multicloud and hybrid infrastructure strategies will be part of the SC25 conversation as organizations balance performance with compliance requirements. Cloud infrastructure providers such as Vultr, a registered trademark of The Constant Company, are positioning security as a foundational requirement for AI workloads at scale. Vultr achieved FedRAMP compliance after a multi-year effort, establishing government-grade security standards for its AI infrastructure, according to Kevin Cochrane, chief marketing officer of Vultr. The company’s platform now reaches 90% of the world’s population within two milliseconds and supports early AI infrastructure investments.

“We’ve been working on our AI infrastructure since the early days,” Cochrane told theCUBE. “We were the first to fractionalize the [Nvidia] A100. We were one of the first to market with the H100.
And we’ve been building the most resilient, secure, compliant AI infrastructure at global scale ever since.”

As these capabilities expand, attention at SC25 also turns toward how teams translate scale into outcomes. Platform engineering has emerged as the discipline that converts infrastructure capability into operational discoveries, according to Nashawaty. It’s a shift that focuses on workflow autonomy, developer satisfaction and the ability to move discoveries from research to deployment.

“Ninety-two percent of developers need modern tools and platforms to innovate, and teams with high developer satisfaction and autonomy can deploy 23% more frequently,” Nashawaty said. “When simulation, large-scale analytics and AI models integrate on exascale systems, the developer and researcher experience becomes critical. Deployment frequency, tooling and workflow autonomy matter for turning raw compute and storage into operational discovery.”

TheCUBE event livestream

Do not miss theCUBE’s coverage of the SC25 event from Nov. 18-20. Plus, you can watch theCUBE’s event coverage on-demand after the event.

How to watch theCUBE interviews

We offer you various ways to watch theCUBE’s coverage of SC25, including theCUBE’s dedicated website and YouTube channel. You can also get all the coverage from this year’s events on SiliconANGLE.

TheCUBE podcasts

SiliconANGLE’s “theCUBE Pod” is available on Apple Podcasts, Spotify and YouTube, which you can enjoy while on the go. During each podcast, SiliconANGLE’s John Furrier and Dave Vellante unpack the biggest trends in enterprise tech, from AI and cloud to regulation and workplace culture, with exclusive context and analysis.

SiliconANGLE also produces our weekly “Breaking Analysis” program, where Dave Vellante examines the top stories in enterprise tech, combining insights from theCUBE with spending data from Enterprise Technology Research, available on Apple Podcasts, Spotify and YouTube.

Guests

During the SC25 event, theCUBE’s coverage will feature leaders from DataDirect Networks, Solidigm, Vast Data, Dell and Vultr, alongside other industry experts advancing exascale computing and AI infrastructure. They’ll share insights on how organizations are optimizing performance, balancing efficiency and security, and pushing the boundaries of scientific discovery at scale.

Watch our live coverage to hear from industry experts, including: Steen Graham, chief executive officer of Metrum AI Inc.; Ace Stryker, director of AI and ecosystem marketing at Solidigm; Jeff Denworth, co-founder of Vast Data Inc.; Jonathan Ballon, CEO of Iceotope Technologies Ltd.; Alex Bouzari, CEO of DataDirect Networks Inc.; Lauren Witter, vice president of sales at CoolFlow, a Division of Omni Services Inc.; and Andy Pernsteiner, field chief technology officer of Vast Data.

(* Disclaimer: TheCUBE is a paid media partner for the SC25 event. The sponsors of theCUBE’s event coverage do not have editorial control over content on theCUBE or SiliconANGLE.)

Image: SiliconANGLE
