Technology

Researchers uncover hidden ingredients behind AI creativity

By Webb Wright

27 September 2025

Image generators are designed to mimic their training data, so where does their apparent creativity come from? A recent study suggests that it’s an inevitable by-product of their architecture.


(Image credit: Adrián Astorgano for Quanta Magazine)

We were once promised self-driving cars and robot maids. Instead, we’ve seen the rise of artificial intelligence systems that can beat us in chess, analyze huge reams of text and compose sonnets. This has been one of the great surprises of the modern era: physical tasks that are easy for humans turn out to be very difficult for robots, while algorithms are increasingly able to mimic our intellect.

Another surprise that has long perplexed researchers is those algorithms’ knack for their own, strange kind of creativity.
Diffusion models, the backbone of image-generating tools such as DALL·E, Imagen and Stable Diffusion, are designed to generate carbon copies of the images on which they’ve been trained. In practice, however, they seem to improvise, blending elements within images to create something new — not just nonsensical blobs of color, but coherent images with semantic meaning. This is the “paradox” behind diffusion models, said Giulio Biroli, an AI researcher and physicist at the École Normale Supérieure in Paris. “If they worked perfectly, they should just memorize,” he said. “But they don’t — they’re actually able to produce new samples.”


To generate images, diffusion models use a process known as denoising. They convert an image into digital noise (an incoherent collection of pixels), then reassemble it. It’s like repeatedly putting a painting through a shredder until all you have left is a pile of fine dust, then patching the pieces back together. For years, researchers have wondered: If the models are just reassembling, then how does novelty come into the picture? It’s like reassembling your shredded painting into a completely new work of art.
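
To make the noising-and-denoising loop concrete, here is a minimal one-dimensional sketch in Python. It assumes a variance-exploding noise schedule and a tiny three-point “training set”; every name and parameter value is illustrative, not from the paper. Because this toy uses the exact, global score function rather than a learned, local one, reverse diffusion simply recalls the training points verbatim, which is the “perfect memorization” the article says real models escape.

```python
import numpy as np

# Toy 1-D diffusion with the *exact* score function for a tiny dataset.
# Illustrative sketch only: real image models learn an approximate score
# with a neural network instead of computing it in closed form.
rng = np.random.default_rng(0)
data = np.array([-2.0, 0.5, 3.0])         # three "training images"
sigmas = np.geomspace(5.0, 0.01, 500)     # noise levels, large -> small

def score(x, sigma):
    """Gradient of the log-density of the data blurred with N(0, sigma^2)."""
    d = data[None, :] - x[:, None]        # distance to each training point
    logw = -d**2 / (2 * sigma**2)
    logw -= logw.max(axis=1, keepdims=True)   # numerical stabilization
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)     # posterior weight of each point
    return (w * d).sum(axis=1) / sigma**2

x = rng.normal(0.0, sigmas[0], size=1000) # start from pure noise
for hi, lo in zip(sigmas[:-1], sigmas[1:]):
    # deterministic (probability-flow) reverse step from noise level hi to lo
    x = x + 0.5 * (hi**2 - lo**2) * score(x, hi)

print(sorted(set(np.round(x, 1))))        # ~ [-2.0, 0.5, 3.0]: pure recall
```

With the exact score, every sample lands back on a training point; the paper’s claim, as described below, is that the locality and equivariance baked into real architectures break this perfect recall in a predictable way.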

Now two physicists have made a startling claim: It’s the technical imperfections in the denoising process itself that lead to the creativity of diffusion models. In a paper that will be presented at the International Conference on Machine Learning 2025, the duo developed a mathematical model of trained diffusion models to show that their so-called creativity is in fact a deterministic process — a direct, inevitable consequence of their architecture.
By illuminating the black box of diffusion models, the new research could have big implications for future AI research — and perhaps even for our understanding of human creativity. “The real strength of the paper is that it makes very accurate predictions of something very nontrivial,” said Luca Ambrogioni, a computer scientist at Radboud University in the Netherlands.
Mason Kamb, a graduate student studying applied physics at Stanford University and the lead author of the new paper, has long been fascinated by morphogenesis: the processes by which living systems self-assemble.

One way to understand the development of embryos in humans and other animals is through what’s known as a Turing pattern, named after the 20th-century mathematician Alan Turing. Turing patterns explain how groups of cells can organize themselves into distinct organs and limbs. Crucially, this coordination all takes place at a local level. There’s no CEO overseeing the trillions of cells to make sure they all conform to a final body plan. Individual cells, in other words, don’t have some finished blueprint of a body on which to base their work. They’re just taking action and making corrections in response to signals from their neighbors. This bottom-up system usually runs smoothly, but every now and then it goes awry — producing hands with extra fingers, for example.
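
Turing-style patterning is easy to demonstrate in code. The sketch below runs a Gray-Scott reaction-diffusion simulation, a standard toy model of this kind of self-organization; the parameter values are common demo choices, not anything from the study. Every update is strictly local (a cell sees only its four neighbors), yet coherent global structure emerges with no blueprint anywhere.

```python
import numpy as np

# Gray-Scott reaction-diffusion: two chemicals U and V diffuse and react.
# Purely local rules, run long enough, produce global spot/stripe patterns.
n, Du, Dv, f, k = 128, 0.16, 0.08, 0.035, 0.065   # common demo parameters
U = np.ones((n, n)); V = np.zeros((n, n))
U[54:74, 54:74], V[54:74, 54:74] = 0.50, 0.25     # small central seed

def lap(Z):
    """5-point Laplacian with wraparound edges: each cell vs. its neighbors."""
    return (np.roll(Z, 1, 0) + np.roll(Z, -1, 0)
            + np.roll(Z, 1, 1) + np.roll(Z, -1, 1) - 4 * Z)

for _ in range(5000):
    uvv = U * V * V                       # local reaction term
    U += Du * lap(U) - uvv + f * (1 - U)  # U is fed in, consumed by reaction
    V += Dv * lap(V) + uvv - (f + k) * V  # V is produced, then decays

# A nonuniform V field means a pattern formed out of local interactions.
print(f"V ranges from {V.min():.3f} to {V.max():.3f} after 5,000 steps")
```
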
When the first AI-generated images started cropping up online, many looked like surrealist paintings, depicting humans with extra fingers. These immediately made Kamb think of morphogenesis: “It smelled like a failure you’d expect from a [bottom-up] system,” he said.
AI researchers knew by that point that diffusion models take a couple of technical shortcuts when generating images. The first is known as locality: They only pay attention to a single group, or “patch,” of pixels at a time. The second is that they adhere to a strict rule when generating images: If you shift an input image by just a couple of pixels in any direction, for example, the system will automatically adjust to make the same change in the image it generates. This feature, called translational equivariance, is the model’s way of preserving coherent structure; without it, it’s much more difficult to create realistic images.
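
Translational equivariance is simple to verify numerically. The snippet below is a self-contained demo, not anything from the paper: it checks that filtering a shifted image gives the same result as shifting the filtered image, using periodic edges so the identity holds exactly.

```python
import numpy as np

# Check: a 3x3 filter (a purely local operation) commutes with translation.
rng = np.random.default_rng(1)
img = rng.normal(size=(32, 32))           # stand-in "image"
filt = rng.normal(size=(3, 3))            # arbitrary 3x3 filter

def conv(x, w):
    """3x3 filtering with periodic (wraparound) edges."""
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += w[i, j] * np.roll(x, (i - 1, j - 1), axis=(0, 1))
    return out

shift = lambda x: np.roll(x, (2, 5), axis=(0, 1))   # move 2 down, 5 right
print(np.allclose(conv(shift(img), filt), shift(conv(img, filt))))  # True
```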


In part because of these features, diffusion models don’t pay any attention to where a particular patch will fit into the final image. They just focus on generating one patch at a time, then automatically fit the patches into place using a mathematical model known as a score function, which can be thought of as a digital Turing pattern.
Researchers long regarded locality and equivariance as mere limitations of the denoising process, technical quirks that prevented diffusion models from creating perfect replicas of images. They didn’t associate them with creativity, which was seen as a higher-order phenomenon.
They were in for another surprise.
Made locally
Kamb started his graduate work in 2022 in the lab of Surya Ganguli, a physicist at Stanford who also has appointments in neurobiology and electrical engineering. OpenAI released ChatGPT the same year, causing a surge of interest in the field now known as generative AI. As tech developers worked on building ever-more-powerful models, many academics remained fixated on understanding the inner workings of these systems.

Mason Kamb (left) and Surya Ganguli found that the creativity in diffusion models is a consequence of their architecture. (Image credit: Charles Yang)
To that end, Kamb eventually developed a hypothesis that locality and equivariance lead to creativity. That raised a tantalizing experimental possibility: If he could devise a system to do nothing but optimize for locality and equivariance, it should then behave like a diffusion model. This experiment was at the heart of his new paper, which he wrote with Ganguli as his co-author.
Kamb and Ganguli call their system the equivariant local score (ELS) machine. It is not a trained diffusion model, but rather a set of equations which can analytically predict the composition of denoised images based solely on the mechanics of locality and equivariance. They then took a series of images that had been converted to digital noise and ran them through both the ELS machine and a number of powerful diffusion models, including ResNets and UNets.
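
The paper’s exact equations are beyond the scope of this story, but the core idea, predicting each pixel from its local window alone by pooling matching patches from every position in the training set, can be sketched in a few lines. The toy below is a one-dimensional illustration of that patch-averaging reading, not the authors’ actual ELS machine; the patch size, noise level and training signals are all invented for the demo.

```python
import numpy as np

# Toy local, equivariant denoiser: each output pixel depends only on its
# small window (locality), and the same rule is applied at every position
# by pooling training patches from all locations (equivariance).
# NOT the paper's ELS equations; a hand-rolled illustration of the idea.
rng = np.random.default_rng(2)
train = [np.sin(np.linspace(0, 2 * np.pi, 64)),     # tiny "training set"
         np.sign(np.sin(np.linspace(0, 6 * np.pi, 64)))]
P, sigma = 5, 0.3                   # window size, noise level (both invented)

def windows(x, p=P):
    """All length-p windows of x, with wraparound edges."""
    xp = np.pad(x, p // 2, mode="wrap")
    return np.stack([xp[i:i + p] for i in range(len(x))])

bank = np.concatenate([windows(x) for x in train])  # patches from every image
centers = bank[:, P // 2]                           # and every position

def denoise(noisy):
    out = np.empty_like(noisy)
    for i, w in enumerate(windows(noisy)):
        d2 = ((bank - w) ** 2).sum(axis=1)          # compare window to bank
        wts = np.exp(-(d2 - d2.min()) / (2 * sigma**2))
        out[i] = (wts * centers).sum() / wts.sum()  # weighted patch average
    return out

noisy = train[0] + sigma * rng.normal(size=64)
print(np.abs(denoise(noisy) - train[0]).mean())     # typically well below sigma
```

Because the patch bank pools windows from every training signal and every position, a denoiser like this can stitch together combinations that appear in no single training example, which is the kind of novelty the paper formalizes for real diffusion models.
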
The results were “shocking,” Ganguli said: Across the board, the ELS machine matched the outputs of the trained diffusion models with an average accuracy of 90%, a result he called “unheard of in machine learning.”
The results appear to support Kamb’s hypothesis. “As soon as you impose locality, [creativity] was automatic; it fell out of the dynamics completely naturally,” he said. The very mechanisms that constrain diffusion models’ window of attention during the denoising process — forcing them to focus on individual patches, regardless of where they’ll ultimately fit into the final product — are the same ones that enable their creativity, he found. The extra-fingers phenomenon seen in diffusion models is similarly a direct by-product of the model’s hyperfixation on generating local patches of pixels without any kind of broader context.
Experts interviewed for this story generally agreed that although Kamb and Ganguli’s paper illuminates the mechanisms behind creativity in diffusion models, much remains mysterious. For example, large language models and other AI systems also appear to display creativity, but they don’t harness locality and equivariance.
“I think this is a very important part of the story,” Biroli said, “[but] it’s not the whole story.”
Creating creativity
For the first time, researchers have shown how the creativity of diffusion models can be thought of as a by-product of the denoising process itself, one that can be formalized mathematically and predicted with an unprecedentedly high degree of accuracy. It’s almost as if neuroscientists had put a group of human artists into an MRI machine and found a common neural mechanism behind their creativity that could be written down as a set of equations.

The comparison to neuroscience may go beyond mere metaphor: Kamb and Ganguli’s work could also provide insight into the black box of the human mind. “Human and AI creativity may not be so different,” said Benjamin Hoover, a machine learning researcher at the Georgia Institute of Technology and IBM Research who studies diffusion models. “We assemble things based on what we experience, what we’ve dreamed, what we’ve seen, heard or desire. AI is also just assembling the building blocks from what it’s seen and what it’s asked to do.” Both human and artificial creativity, according to this view, could be fundamentally rooted in an incomplete understanding of the world: We’re all doing our best to fill in the gaps in our knowledge, and every now and then we generate something that’s both new and valuable. Perhaps this is what we call creativity.
Original story reprinted with permission from Quanta Magazine, an editorially independent publication supported by the Simons Foundation.

Webb Wright

Webb Wright is a journalist based in Brooklyn, New York, who writes about technology and the mind. He’s an alumnus of the Columbia University Graduate School of Journalism and a former Ferriss – UC Berkeley Psychedelic Journalism fellow.
