
Trump’s dangerous mix of low-grade and high-grade deepfakes

Welcome to AI Decoded, Fast Company’s weekly newsletter that breaks down the most important news in the world of AI. I’m Mark Sullivan, a senior writer at Fast Company, covering emerging tech, AI, and tech policy.
This week, I’m focusing on Donald Trump’s recent AI-generated videos, which he (or his staff) posts on Truth Social. I also look at world models, the successor to large language models, and at OpenAI’s new Sora 2 model, which is also the anchor for a new social app.
Sign up to receive this newsletter every week via email here. And if you have comments on this issue and/or ideas for future ones, drop me a line at sullivan@fastcompany.com, and follow me on X (formerly Twitter) @thesullivan.
Trump’s dangerous mix of low-grade and high-grade deepfakes
AI-generated videos are becoming a routine part of politics. That’s probably not a good thing. In the past, it was fairly easy to distinguish the real from the generated, but as video-generation models have improved, it’s gotten harder. We’re in a period when both high-quality AI videos and “AI slop” are common.
The Trump administration appears to be using both types. Last weekend, Trump posted an AI video to Truth Social that looked a lot like a real Fox News segment describing a new healthcare program about to be rolled out to Americans. The video featured the president’s daughter-in-law Lara Trump describing “medbeds,” or futuristic health pods (think the health pod in Prometheus) that can do anything from curing cancer to growing back lost limbs. The beds would be offered in new hospitals across the country, and people would use a medbed “card” to access them.
Except magical medbeds are a fantasy, a myth spun up over years of QAnon blather on sites like 4chan. And there’s no new hospital system and no membership card. For most rational people, the sheer unlikelihood of such a thing suddenly coming into existence was the first tip-off that the video was AI-generated. Somebody—maybe Trump, maybe a staffer—deleted the strange post a few hours later, and the White House offered no explanation.
A few days later, Trump posted another, lower-quality AI-generated video. This one depicted Senate minority leader Chuck Schumer (D-NY) and House minority leader Hakeem Jeffries (D-NY) standing at a podium talking about the looming government shutdown. The AI had Schumer insulting his own party and using profanity. The AI dropped a cartoon sombrero and a mustache on Jeffries, in a nod to a Republican lie that Democrats will let the government shut down if they can’t give healthcare benefits to undocumented immigrants (who are, by law, ineligible for such benefits).
The Schumer video is obviously AI-generated, meant to troll Democrats during the shutdown impasse. The medbeds video is more troubling because the AI is high-quality and shows an intent to mislead. When one politician routinely uses both “slop” and high-realism AI to make their points, will their constituents or supporters always know the difference? As the AI improves, whether viewers can tell becomes almost entirely a choice made by the video’s creator. For Trump, a politician with authoritarian tendencies and a reliance on propaganda, that could be a dangerous mix.
AI-based video may wind up becoming the most powerful propaganda tool ever invented. After all, seeing is believing—especially when the viewer wants to believe.
‘World models’ are likely the future of AI
The AI models we’ve been talking about for the past three years are fundamentally language models—really big mathematical probability engines that are good at predicting the most likely word in a sequence. But two things have happened since the appearance of ChatGPT in late 2022: We’ve come to understand that a model that reasons primarily on words is limited in its real-life applications. And within the AI industry, a consensus has formed that the AI labs’ main trick of radically scaling up training data and computing power to make models smarter isn’t achieving the big performance gains it once did.
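To make the next-word idea concrete, here’s a toy sketch in Python. The vocabulary and scores are made up for illustration; a real model scores tens of thousands of tokens using billions of learned weights. The mechanics, though, are the same: assign each candidate word a score, convert scores to probabilities with softmax, and favor the most likely word.

```python
import math

# Toy illustration (not any lab's actual model): a language model assigns
# a score (logit) to every word in its vocabulary; softmax turns those
# scores into probabilities, and the most likely next word wins.
vocab = ["cat", "sat", "mat", "dog"]
logits = [2.1, 0.3, 1.4, -0.5]  # made-up scores for some context

# Softmax: exponentiate each score, then normalize so they sum to 1.
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

next_word = vocab[probs.index(max(probs))]
print(dict(zip(vocab, [round(p, 3) for p in probs])), "->", next_word)
```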
None of this should be too surprising. A machine (or a human) can only learn so much about how the world works by reading articles and books. We humans don’t learn primarily that way: we have a unique ability to quickly build a mental “world model” that organizes the diverse information our senses gather about our environment and ourselves. Researchers are working hard to develop synthetic world models, but it’s a hard problem. While language models form a vast, high-dimensional vector space (a map, if you will) representing all the possible combinations of words in various contexts, a world model must form a much bigger space to represent the virtually endless combinations of visual, aural, and motion information. And while language models try to guess the next word in a sequence, a world model must reason in real time about how a given action might affect the real world.
“The models are learning how to reason about physical reality,” says Naeem Talukdar, CEO of the video-generation company Moonvalley. A world model inside a robot might be asked to “imagine” a world in which the robotic arm moves 10 degrees to the right, and then judge whether such a move is productive to the larger task at hand. It’s this kind of reasoning that may allow robots to iteratively learn to complete tasks that they’ve not been explicitly trained to do. “The bigger these models get, and the more modalities that they learn on, just like humans, they start to be able to reason on things that they haven’t seen yet before,” Talukdar says. For instance, in the past robots have struggled with the deceptively simple command: “clean up the dinner table and put the dishes in the dishwasher.” Without the ability to reason in real time about the physical world, the robot wouldn’t be able to experimentally move from one micro-task to the next.
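In code, that “imagine, then judge” loop might look roughly like the sketch below. Everything here is hypothetical: a real world model is a large learned network, not the hand-written physics in `imagine()`. But the planning loop is the core idea: simulate each candidate action, score the imagined outcome, and take the best one.

```python
# A minimal, hypothetical sketch of planning by imagination.
GOAL_ANGLE = 30.0  # degrees; the arm position that completes the task

def imagine(state: float, action: float) -> float:
    """Stand-in for a learned world model: predict the next state
    (arm angle) that would result from taking `action`."""
    return state + action

def score(predicted_state: float) -> float:
    """Judge whether the imagined outcome moves us toward the goal."""
    return -abs(GOAL_ANGLE - predicted_state)

state = 0.0
candidate_actions = [-10.0, 0.0, 10.0]  # rotate left, hold, rotate right

# Plan by imagination: simulate each action, keep the best-scoring one.
while abs(GOAL_ANGLE - state) > 1e-6:
    best = max(candidate_actions, key=lambda a: score(imagine(state, a)))
    if best == 0.0:
        break  # no action improves on standing still
    state = imagine(state, best)
    print(f"took action {best:+.0f}, arm now at {state:.0f} degrees")
```

The point of the loop is that the robot never needs to have been explicitly trained on this exact task: it tries actions in its imagination first, and only executes the ones its world model predicts will help.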
World models are also being used to help self-driving cars train for real-world driving, and to iteratively manage unexpected or untrained-for events that occur on the roadway in real time. Additionally, augmented reality glasses such as Meta’s Orion may eventually rely on world models to organize all the data collected from the various sensors in the device, such as motion and orientation sensors, depth and tracking sensors, microphones, and light sensors.
OpenAI’s Sora 2 video generation model comes with a social network
Would you like to open an app and spend your free time watching AI-generated videos depicting real people—including yourself and your friends—doing silly things? OpenAI and its CEO Sam Altman think you will. The company just announced a new version of its impressive Sora video generation model—Sora 2—and a social networking app (iOS only) to go with it.
OpenAI calls Sora 2 the “GPT-3.5 moment for video” (GPT-3.5 was when the output of OpenAI’s language models became noticeably more coherent and relevant). The major breakthrough is native audio-video generation: creating synchronized dialogue, sound effects, and soundscapes that naturally match the visuals. The company says that Sora 2’s understanding of physics is vastly improved, and that it can now accurately model complex phenomena like buoyancy in water and intricate movements such as gymnastics routines. One person on X demonstrated how Sora 2 can simulate water being poured into a glass with realistic reflections, ripples, and refraction. The model generates 10-second clips (20 seconds for Pro users) with better consistency across multi-shot sequences, OpenAI says, and it is good at producing realistic, cinematic, and anime styles.
A standout feature, Cameos, lets users insert themselves (or their friends) into videos after a onetime recording. When you set up the Sora app, you’re asked to record a live video of yourself repeating random numbers or phrases and turning your head through certain poses. This gives the app a way to authenticate that it’s really you, so that only you can use your own likeness, and anybody else who wants to put it in a video must first get your permission.
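OpenAI hasn’t published how the Cameos verification works under the hood, but the random-numbers-and-poses step is a classic challenge-response liveness check. As a hedged illustration only (every name and rule below is an assumption, not OpenAI’s API), a generic version might look like this:

```python
import secrets

# Hypothetical sketch of a challenge-response liveness check. The idea:
# demand content that cannot exist in a pre-recorded video, so a replayed
# clip of the user fails while a live recording passes.

def make_challenge() -> dict:
    """Generate a one-time challenge the user must perform on camera."""
    return {
        "digits": " ".join(str(secrets.randbelow(10)) for _ in range(6)),
        "poses": ["turn left", "turn right", "look up"],
    }

def verify(transcript: str, observed_poses: list[str], challenge: dict) -> bool:
    """Accept only if the live recording matches the fresh challenge."""
    return (challenge["digits"] in transcript
            and all(p in observed_poses for p in challenge["poses"]))

challenge = make_challenge()
print("Please read aloud:", challenge["digits"])
# A pre-recorded video would not contain this exact run of digits, so only
# the live account owner can pass the check and enable their own cameo.
```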
Like TikTok, the Sora iOS app (invite-only for now) features an algorithmic feed, content remixing, and social sharing. OpenAI emphasizes “long-term user satisfaction” over engagement metrics, and the company says it has no plans to insert advertising into the feed, as Meta plans to do with its Vibes AI video app.
More AI coverage from Fast Company:
AI’s monstrous energy use is becoming a serious PR problem
ChatGPT can now spend your money for you
Shift the AI conversation from cost cutting to revenue
AI is making your website irrelevant
Want exclusive reporting and trend analysis on technology, business innovation, future of work, and design? Sign up for Fast Company Premium.