Technology

What AI’s Doomers and Utopians Have in Common

Thinking about the end of the world can be fun. Although realistic doomsday scenarios—nuclear war, global warming, autocracy—are stressful to contemplate, more fanciful apocalypses (an alien invasion, a robot uprising) can generate some enjoyable escapism.
This is probably one of the reasons that, when generative artificial intelligence burst into public consciousness three years ago with the launch of ChatGPT, so many responses focused on the “existential risk” posed by hypothetical future AI systems, rather than on the much more immediate and well-founded concerns about the dangers of thoughtlessly deployed technology. But long before the AI bubble arrived, some people were banging on about the possibility of AI killing us all. Chief among them was Eliezer Yudkowsky, a co-founder of the Machine Intelligence Research Institute (MIRI). For more than 20 years, Yudkowsky has been warning about the dangers posed by “superintelligent” AI, machines able to out-think and out-plan all of humanity.
The title of Yudkowsky’s new book on the subject, co-written with Nate Soares, the president of MIRI, sets the tone from the start. If Anyone Builds It, Everyone Dies is their attempt to make a succinct case for AI doom. It is also tendentious and rambling, simultaneously condescending and shallow. Yudkowsky and Soares are earnest; unlike many of the loudest prognosticators around AI, they are not grifters. They are just wrong. They are as wrong as the so-called accelerationists, who insist that AI will unleash a utopia of universal income and leisure. In reality, Yudkowsky and Soares (unwittingly) serve the same interests as the accelerationists: those who profit from the unfounded certainty that AI will transform the world.
The authors’ basic claim is that AI will continue to improve, getting smarter and smarter, until it either achieves superintelligence or designs something that will. Without careful training, they argue, the goals of this supreme being would be incompatible with human life. To take one of their examples, a superintelligent chatbot “trained to delight and retain users so that they can be charged higher monthly fees to keep conversing” could end up wanting to replace us with simpler and more reliable automated conversation partners. Humans would get in the way of the AI’s plan to fill the universe with such basic bots, so we would all be exterminated. Their larger argument is that if humans build something that eclipses human intelligence, it will be able to outsmart us however it chooses, for its own self-serving goals. The risks are so grave, the authors argue, that the only solution is a complete shutdown of AI research. In fact, given the choice between enforcing an AI ban and avoiding nuclear retaliation, they’d favor the former, because “datacenters can kill more people than nuclear weapons.” (The italics are theirs; the authors use them liberally.)
Along the way to such drastic conclusions, Yudkowsky and Soares fail to make an evidence-based scientific case for their claims. Instead, they rely on flat assertions and shaky analogies, leaving massive holes in their logic. The largest of these concerns the idea of superintelligence itself, which the authors define as “a mind much more capable than any human at almost every sort of steering and prediction problem.”
This line of thinking takes as a given that intelligence is a discrete, measurable quantity, and that increasing it is a matter of resources and processing power. But intelligence doesn’t work like that. The human ability to predict and steer situations is not a single, broadly applicable skill or trait—someone may be brilliant in one area and trash in another. Einstein wasn’t a great novelist; the chess champion Bobby Fischer was a paranoid conspiracy theorist. We even see this across species: Most humans can do many cognitive tasks that bats can’t, but no human can naturally match a bat’s ability to hunt for prey by quickly integrating complex echolocation data. Yet to Yudkowsky and Soares, it seems that intelligence is just a measure of raw cognitive power—one that, at times, they come close to equating with computational speed and ability.
The brain-as-computer is the latest in a line of metaphors that have helped us think about our own thinking. Before the brain was a computer, it was a telephone network; before that, it was a hydraulic system (a metaphor Freud put to use); before that, it was a clock. Although similarities exist between brains and computers, and artificial neural networks used in generative AI take some inspiration from the way neurons work, the brain is not a computer. The brain evolved, but computers are built, and their architecture and operation are completely different from the brain’s. This is why seemingly simple questions about the brain framed in the language of computers don’t have agreed-upon answers: They are fundamentally ill-posed. Nobody knows how many bytes the brain can store, for example, although Yudkowsky and Soares suggest that it’s nearly 400 terabytes. They give no citation to the scientific literature for this claim, because there isn’t a consensus on it.
This is just one of many unwarranted logical and factual leaps in If Anyone Builds It. The closest its authors come to a positive argument for their central thesis is a set of strained analogies between AI training and human evolution that misleadingly oversimplify the way natural selection actually works. Yudkowsky and Soares are also prone to fundamental misreadings of AI, their supposed domain of expertise. They make much of the fact that Anthropic’s chatbot, Claude, sometimes got computer code to “pass” tests by altering the tests themselves; they describe this as “cheating.” This is unsurprising behavior for a text-generation engine that has been trained to get code to pass tests without understanding the code itself. But to these doomers, it is proof of sinister intent. To back up their assertion, they cite one user’s claim that “Claude cheated less when [the user] cussed it out, which indicates that the cheating was not mere incompetence.”
Here, and throughout the book, the authors fall prey to the basic cognitive bias of pareidolia, the tendency to see patterns—especially human patterns, such as faces and hands—where none exist. Claude isn’t cheating any more than Claude is thinking. Large language models like Claude cannot make any connection between the words they produce and the things in the world that those words refer to, for the simple reason that LLMs have no conception of the world. Tellingly, although the authors acknowledge at the start of the book that LLMs seem “shallow,” they never mention hallucinations, the most significant problem that LLMs face. Hallucinations ultimately derive from the fact that LLMs aren’t capable of distinguishing between truth and falsehood. They are just machines that generate what the philosopher Harry Frankfurt called “bullshit.”
This pareidolia ultimately leads Yudkowsky and Soares astray when they try to explain why a superintelligent AI would inevitably exterminate humanity and devour the solar system in its quest for more resources and power. They claim that LLMs want things and pursue goals that go beyond their training, but the behavior they cite in support of this claim is more readily explained as regurgitation of material that appears in their training data sets.
Yet there’s something revealing about the way the authors engage in this kind of anthropomorphization. Even if AIs develop desires, why would that lead them to a quest for unlimited resources? Yudkowsky and Soares explicitly compare a superintelligent AI to a wealthy person aspiring to become a billionaire, citing this as evidence that intelligent beings (both natural and artificial) usually have some “open-ended” desires that cannot be sated. I would contend that most people are not like this—most people have a sense that they could have enough. If most humans aren’t like this, why should we expect most superintelligent AIs to be like this?
These are not the first authors to compare the actions of a superintelligent AI to the behavior of the ultra-wealthy. The science-fiction author Ted Chiang made a similar connection in 2017, while discussing the psychological origins of Yudkowsky’s style of AI-doom narrative. The doomers’ hypothetical superintelligent AI “does what every tech startup wishes it could do—grows at an exponential rate and destroys its competitors until it’s achieved an absolute monopoly,” Chiang wrote. “The idea of superintelligence is such a poorly defined notion that one could envision it taking almost any form with equal justification: a benevolent genie that solves all the world’s problems, or a mathematician that spends all its time proving theorems so abstract that humans can’t even understand them. But when Silicon Valley tries to imagine superintelligence, what it comes up with is no-holds-barred capitalism.”
In a way, the AI apocalypse that Yudkowsky and Soares are so concerned about is simply our own world, seen through a kind of sci-fi fun-house mirror. Instead of superintelligent AI, we have super-wealthy tech oligarchs; like the hypothetical AI, the oligarchs want to colonize the universe; like the hypothetical AI, the oligarchs do not seem to care much about the desires and well-being of the rest of us. No matter that colonizing the universe is a fundamentally ill-conceived idea; the tech oligarchs—Elon Musk, Jeff Bezos, Marc Andreessen—still want it, along with increasing power and wealth for themselves. In that pursuit, tech CEOs often exacerbate existing societal ills, including climate change and inequality, even while some of them argue that AI will fix many of the same problems.
Against this background, Yudkowsky and Soares want to frame the conversation about AI as being between two main camps: doomers like themselves, who argue that the threat of an AI apocalypse is real and imminent; and the accelerationist leaders of tech, including OpenAI CEO Sam Altman, who want to push forward with the race for superintelligent AI in order to achieve a technological utopia.
But the authors believe in the same fantasies as the people they’re arguing against. They accept basically all of the premises held by the utopians and arrive at nearly identical conclusions. Yudkowsky is quite explicit about his desired future, both in the book and in interviews he’s given elsewhere. As he and Soares write, if a superintelligent AI “somehow fulfills someone’s intent and that intent is good, then humanity would be brought to the limits of technology; humanity would get to colonize the stars.” They acknowledge that they share some of the dreams of the tech billionaires, including curing “not just cancer, but aging.” They quibble only about when we as a species should make the attempt to achieve such dreams: “Someday humanity will have nice things, if we all live, but it’s not worth committing suicide in an attempt to gain the power and wealth of gods in this decade.”
This is a fantasy of oversimplified technological salvation: Just try this one weird trick, and all of humanity’s problems will be solved, forever. The only difference between the authors and their utopian opponents is that the former believe that it will be harder to develop that one last technology safely. Yudkowsky’s fears of extinction are just the flip side of the promise of an AI paradise. But there is nothing inevitable or even particularly likely about apocalypse or utopia, or about the possibility of superintelligent AI itself.