AI Can Make Pictures. It Still Can’t Hold a World | Opinion

Despite what many Silicon Valley AI founders keep hyping, you can’t auto-generate a hit show or bottle cinematic magic. My offer: story structure first, generation second. We need tools that codify human intent: the why behind a character, the cause and effect behind plot choices, and the rules of a world, the kind of design human ingenuity does best. We need tools that organize and structure creative data to preserve and empower the strongest engine in storytelling: the human mind.
Luma AI CEO Amit Jain recently called today’s video models (including his own) “good pixel generators” and “quite dumb,” going on to say that real cinema needs a built-in narrative engine that knows how tension builds and where jokes land. Just a week before, Everything Everywhere All at Once filmmaker Daniel Kwan called for a unified, unprecedented push to set the terms for AI’s role in filmmaking. Together they mark the bottleneck: quality picture generation is sprinting ahead while story sense lags far, far behind. Governance will help, but it isn’t enough. We need tools that understand and hold the nuance of great narrative: human-defined story design, codified as a blueprint that tools must work within, not a playground for hallucinatory liberties. That is the difference between quick output and durable worlds.
I’ve spent decades inside primetime animation, running Bento Box Entertainment as we produced for every major network and streamer. On real productions, story is not just visual scenery. It’s the plan the entire crew follows. If a network exec suggests changing a character’s defining attributes, shifting a relationship dynamic or adjusting the series tone, you don’t just tweak a line in the dialogue. You rewrite, re-animate, and re-cut across departments. What looks like a small fix turns into weeks of work and millions in overages. Humans can feel when a beat is wrong and pull it back on course. Models like ChatGPT or Claude do not, and at scale they turn drift into a habit. Protect the blueprint, and you protect the business.
Screenwriters have been saying the same thing: generative AI models can spit out scenes that look plausible, but they don’t really understand plot, motivation, or theme. Recent research confirms that these models mimic patterns without internalizing world rules and falter as complexity rises. That is exactly why a human-authored blueprint is needed. The risk is the same whether you’re protecting a studio’s crown jewel franchise or a writer’s new original idea: without a clear, enforceable story map, the world unravels.
New “AI TV” tools let users spin up scenes and whole episodes on demand, but they struggle to carry a story beyond a single episode. If you want scenes to add up across a season or a film arc, you cannot rely on vibes. The rules of the world can’t live in a PDF. They have to be structured like data, with relationships and constraints the system can check and scale over time. Tone and safety guardrails are useful, but they are not a narrative engine. A machine-readable story bible is needed.
Set the Blueprint Standard
What if we could treat story like an API contract? What if we could write the rules as data? Studios and creators should set IP-specific, tool-neutral blueprint standards and require any AI system to check against them before work moves forward. The rules must be machine-readable and applied consistently.
The blueprint covers non-negotiables: character voice, limits, wants, fears. It defines the world so everyone knows what can and cannot happen. It maps causality by stating which beats unlock later beats and which promises and payoffs cannot be skipped. And it carries the theme forward by naming what the story is about and what each plot owes to that idea.
This isn’t just theory. In production, a small break in the story or series bible forces rework across writing, design, performance, edit, and marketing. Add fast AI generation without a map into the mix and you increase not only speed, but the chance of quiet rule breaks that compound over time.
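To make "rules as data" concrete, here is a minimal sketch in Python of what a machine-readable blueprint and a pre-flight check might look like. Every name in it (Blueprint, Beat, SceneProposal, check_scene, the action tags) is a hypothetical illustration, not an existing standard or any vendor's API. It assumes a generated scene has already been labeled with the beat it serves and the actions each character takes, which is itself a hard problem this sketch does not address.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Character:
    name: str
    voice: str                                      # short description of how they speak
    limits: set[str] = field(default_factory=set)   # action tags this character never performs
    wants: list[str] = field(default_factory=list)
    fears: list[str] = field(default_factory=list)


@dataclass
class Beat:
    beat_id: str
    requires: set[str] = field(default_factory=set)  # beats that must land before this one


@dataclass
class Blueprint:
    theme: str
    characters: dict[str, Character]
    beats: dict[str, Beat]


@dataclass
class SceneProposal:
    beat_id: str
    actions: list[tuple[str, str]]  # (character name, action tag), assumed pre-labeled


def check_scene(bp: Blueprint, scene: SceneProposal, completed: set[str]) -> list[str]:
    """Return a list of blueprint violations; an empty list means the scene passes."""
    violations: list[str] = []

    # Causality: a beat cannot run until the beats it depends on have landed.
    beat = bp.beats.get(scene.beat_id)
    if beat is None:
        violations.append(f"unknown beat: {scene.beat_id}")
    else:
        for missing in sorted(beat.requires - completed):
            violations.append(f"beat {scene.beat_id} requires {missing} first")

    # Character non-negotiables: no action may cross a stated limit.
    for name, action in scene.actions:
        character = bp.characters.get(name)
        if character is None:
            violations.append(f"unknown character: {name}")
        elif action in character.limits:
            violations.append(f"{name} never does: {action}")

    return violations
```

In this sketch, a scene proposed for a payoff beat is rejected if its setup beat has not yet landed, and any action that crosses a character's stated limit is flagged before the work moves downstream. A real blueprint would need far richer rules for voice, tone, and theme; the point is only that the rules live as data a system can check, not as prose in a document.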
Good Craft, Good Business
Franchise IP is the pension plan of this business. Value compounds only if the world holds over time and across formats. A structured blueprint makes universes portable. You can move a property from live-action to animation to games without losing its spine.
This is not an argument against AI. Quite the opposite. It is an argument about sequence. Give the tools an intelligence layer and they become reliable collaborators. Skip that step and speed becomes a liability. The outputs will look convincing, right up until the audience sees the world doesn’t add up.
I’ve seen what happens when late notes break the map. I’ve also seen large multi-department teams move fast when the blueprint is clear. The choice is not abstract. Let models improvise within structured worlds, or watch them erode years of work.
Scott Greenberg is an Emmy Award-winning producer and the co-founder and executive chairman of Othelia Technologies. Previously, he was co-founder and CEO of Bento Box Entertainment, where he oversaw production of hit shows including Bob’s Burgers, Hazbin Hotel, and Central Park, and president/COO of Film Roman, where he oversaw the animation studio that produced The Simpsons, The Simpsons Movie, and King of the Hill. He is a member of The Producers Guild of America and The Television Academy.
The views expressed in this article are the writer’s own.