Cloudera’s ‘AI-in-a-Box’ gives enterprises a new way to build private AI

With a trio of new services today, big-data company Cloudera Inc. says it’s striving to help enterprises access their structured and unstructured information more easily to power their artificial intelligence workloads.
The first is a novel “AI-in-a-Box” offering delivered in partnership with Dell Technologies Inc. that gives enterprises a simple solution for storing and accessing all of their data in a secure and compliant way. The second and third come via updates to its flagship data management platform, with the new Cloudera Iceberg REST Catalog and Cloudera Lakehouse Optimizer simplifying the way data is shared and accessed across computing environments.
Cloudera announced the updates at its annual data and AI conference EVOLVE25, where it’s showcasing how its data architecture is evolving to sit at the intersection of AI and hybrid cloud.
Cloudera’s partnership with Dell aims to tackle one of the biggest challenges enterprises face as they race to adopt AI: how to store and access the data they need to fuel it.
In a new report published today, the company revealed that enterprises still use a mishmash of different storage architectures, with 63% of respondents operating private cloud environments, 52% using public cloud and 42% relying on data warehouses. Because their data is all over the place, organizations are struggling to manage it all properly. That’s proving counterproductive for AI, which needs to be able to access 100% of a company’s data to operate effectively.
Cloudera and Dell’s AI-in-a-Box gives enterprises a simple solution for storing all of their data in one place, where they can run all of Cloudera’s compute engines against Dell’s powerful ObjectScale storage environment — a modern, containerized object storage system that’s designed for high-performance AI. It means enterprises can dump all of their structured and unstructured data in one extremely secure location, where it can be accessed quickly under comprehensive governance rules.
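ObjectScale, like most modern object stores, exposes an S3-compatible interface, so pointing standard data tooling at an on-premises store of this kind generally looks something like the rough sketch below. The endpoint, bucket name and credentials are hypothetical placeholders, and nothing here is specific to the Cloudera-Dell bundle.

```python
# Rough sketch: listing objects in an S3-compatible, on-premises object store.
# The endpoint, bucket and credentials below are hypothetical placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectscale.internal.example.com",  # on-prem S3-compatible endpoint (placeholder)
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Structured and unstructured data can sit side by side in the same buckets,
# with access mediated by whatever governance layer fronts the store.
response = s3.list_objects_v2(Bucket="enterprise-data", Prefix="documents/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```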
Such a solution is going to appeal to enterprises, because there is a growing sense in the industry that any data used for AI ought to be stored on-premises, for reasons including cost, control and security, Steve McDowell of NAND Research Inc. told SiliconANGLE. “Even though we’re comfortable running workloads in the cloud, there’s still a lot of concern about where our most sensitive data runs,” the analyst explained. “This is especially true in regulated industries where compliance and governance are top concerns.”
The offering enables companies to run various AI development tools securely on-premises, including the Cloudera AI Workbench, which provides a secure environment for creating, training and fine-tuning AI models on governed data, plus the Cloudera Inference Service, which handles the deployment of those models after they’ve been trained. Customers can also access the Cloudera Agent Studio, where they can harness the models they’ve built to drive autonomous AI agents that can perform work unsupervised in place of humans.
According to Cloudera, what customers are getting is a comprehensive “Private AI system” that allows them to start building trusted AI workloads faster and with lower operating costs, sidestepping the challenges associated with moving data between different storage environments.
Win-win partnership
McDowell said the partnership makes sense for Dell, because it’s not trying to own the entire AI software stack, but is instead leaning on its longstanding strategy of letting partners like Nvidia Corp., Hugging Face Inc. and now Cloudera provide the AI and ML tooling its customers need.
“Dell is focused on supplying the infrastructure foundation only, and in practical terms this is good for its enterprise customers as they get more choice and lower risk through pre-integrated solutions,” the analyst explained. “Customers can pick and choose best-in-class pieces to fit their needs. It’s an approach that allows IT teams to remain in control and keeps Dell relevant, no matter which AI platforms win.”
Cloudera benefits too, McDowell said, because the partnership gives it access to Dell’s dominant enterprise customer base. “It’s just as much of a go-to-market acceleration as it is a technology integration,” he pointed out. “It plants Cloudera firmly in the private AI conversation, where it can now credibly compete with IBM, VMware and even the hyperscalers’ sovereign AI narratives.”
Cloudera Chief Strategy Officer Abhas Ricky said the AI-in-a-Box offering should especially interest organizations in regulated industries that need AI systems with ironclad security and clear and predictable costs. “Bringing Dell ObjectScale together with Cloudera enables organizations to industrialize AI use cases using secure data, deploy them efficiently, and create smart agents, all with predictable economics, and without hidden fees,” Ricky said. “This is the quickest and most reliable way for large companies to put AI to work and create intelligent agents.”
Streamlined data access for Iceberg
Of course, not every enterprise wants to stuff all of its AI data into a simple little box. Some would rather keep that information in its existing environments, and that’s where Cloudera’s data management platform updates should come in handy.
With the new Iceberg REST Catalog, Cloudera is facilitating “open interoperability” for Apache Iceberg-based data architectures, enabling the data they hold to be shared across different platforms, systems and applications. Meanwhile, the Lakehouse Optimizer does the job of optimizing that information for the different models and engines that need to access it.
Cloudera stressed that moving data between different environments is a major headache for AI builders, because it means transforming that data to fit different data architectures and governance policies. Whenever an organization moves data, it can introduce new security risks, and it almost always means higher costs and delays in actually accessing that information.
The new offerings are designed to ease these headaches, serving as the basis of an open, secure and interoperable data architecture that allows information to live anywhere. The Iceberg REST Catalog provides the open interoperability, facilitating secure, zero-copy data sharing with unified governance across any cloud or on-premises data center environment, the company said. AI systems can access this data directly, wherever it resides and with consistent policy enforcement, without needing to copy it or move it first.
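To illustrate what that open interoperability looks like from a client’s perspective, here is a minimal sketch of connecting to an Apache Iceberg REST catalog using the open-source PyIceberg library. The endpoint URI, warehouse, token and table names are hypothetical placeholders rather than Cloudera-specific values.

```python
# Minimal sketch: reading an Iceberg table through a REST catalog endpoint.
# The URI, warehouse, token and table names below are hypothetical placeholders.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "analytics",                                   # logical catalog name
    **{
        "type": "rest",                            # Iceberg REST catalog protocol
        "uri": "https://catalog.example.com/api",  # REST catalog endpoint (placeholder)
        "warehouse": "sales_warehouse",            # warehouse identifier (placeholder)
        "token": "BEARER_TOKEN",                   # credentials checked by the catalog's governance layer
    },
)

# Any engine that speaks the REST catalog protocol can read the same table
# definition without copying or moving the underlying files first.
table = catalog.load_table("sales.orders")
df = table.scan(limit=10).to_pandas()  # requires the pyarrow/pandas extras
print(df.head())
```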
Meanwhile, the Cloudera Lakehouse Optimizer is a new service that’s meant to automate data optimization and table maintenance for Iceberg environments. It uses AI to optimize data tables intelligently by rewriting manifests and position delete files, so companies don’t need to worry about finding human data engineers who can perform these laborious yet essential tasks.
It’s compatible with any public or private cloud. Cloudera says it enables “unmatched observability and control” over data management, with internal benchmarks showing that it can improve query performance by up to 13 times while reducing storage costs by as much as 36%.
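For a sense of the chores the Lakehouse Optimizer is meant to take off engineers’ plates, the sketch below shows the manual equivalents using standard Apache Iceberg maintenance procedures invoked from PySpark. It assumes a Spark session already configured with the Iceberg runtime and a catalog named my_catalog; the table name is a hypothetical placeholder, and this is generic Iceberg tooling rather than Cloudera’s service.

```python
# Minimal sketch of manual Iceberg table maintenance using standard Spark
# procedures -- the chores an automated optimizer handles on its own.
# Assumes the session is configured with the Iceberg Spark runtime and a
# catalog named "my_catalog"; the table name is a hypothetical placeholder.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-maintenance").getOrCreate()

# Compact small data files into larger ones to speed up queries.
spark.sql("CALL my_catalog.system.rewrite_data_files(table => 'db.events')")

# Rewrite manifests so table metadata stays compact and query planning stays fast.
spark.sql("CALL my_catalog.system.rewrite_manifests(table => 'db.events')")

# Clean up position delete files left behind by row-level deletes and updates.
spark.sql("CALL my_catalog.system.rewrite_position_delete_files(table => 'db.events')")

# Expire old snapshots to reclaim storage no longer needed for time travel.
spark.sql(
    "CALL my_catalog.system.expire_snapshots("
    "table => 'db.events', older_than => TIMESTAMP '2025-01-01 00:00:00')"
)
```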
Image: SiliconANGLE/Dreamina AI