The year is 1956. You’re a researcher working at International Business Machines, the world’s leading tabulating machine company, which has recently diversified into the brand-new field of electronic computers. You have been tasked with determining for what purposes, exactly, your customers are using IBM’s huge mainframes.
The answer turns out to be pretty simple: computers are for the military, and for the military alone. In 1955, the year before, by far the biggest single revenue source for IBM’s computer division was the SAGE Project, a Defense Department initiative tasking IBM with creating a computer system capable of providing early warning across the United States should nuclear-armed Soviet bombers attack. SAGE brought in $47 million that year, and other military projects brought in another $35 million. Programmable computers sold to businesses, meanwhile, brought in a paltry $12 million.
You send a memo to your boss explaining that computers’ impact on society will primarily be in giving the US an edge on the Soviets in the Cold War. The impact on the private sector, by contrast, seems minor. You lean back in your chair, light a cigarette, and ponder the glorious future of the defense-industrial complex.
You would, of course, be totally wrong — not just in the far future but in the very immediate one. In his book Building IBM, company veteran Emerson Pugh compiled revenue figures for each of IBM’s computing divisions from 1952 through 1964, and they tell a very different story.
A mere two years after 1956, programmable computers sold to private companies had matched SAGE as a revenue source. A year later, the private sector was bringing in as much as the military as a whole. By 1963, not even a decade after the 1955 data you were looking at, military revenue was a rounding error next to IBM’s ballooning private computer revenue, which had grown to account for a majority of the company’s entire US revenue.
Whoops!
What can we learn from how people are using AI right now?
This week, impressive teams of economists at both OpenAI and Anthropic released big, carefully designed reports on how people are using their AI models — and one of my first thoughts was, “I wonder what an IBM report on how people used their first computers would’ve looked like.” (Disclosure: Vox Media is one of several publishers that have signed partnership agreements with OpenAI. Our reporting remains editorially independent. Also, Future Perfect is funded in part by the BEMC Foundation, whose major funder was also an early investor in Anthropic; they don’t have any editorial input into our content.)
To be clear: the level of care the AI firms’ teams put into their work is many, many orders of magnitude greater than that shown by our fictional IBM analyst. Revenue isn’t the best measure of actual customer interest and use; everyone knew, even in 1955, that computers were improving rapidly and that their uses would change; and the AI firms have access to an impressive array of real-time data on how their products are used that would have made the Watson family running IBM salivate.
That said, I think the IBM example is useful for clarifying what, exactly, we want to get out of this kind of data.
The AI firms’ reports are most useful as a point-in-time snapshot, plus a recent history spanning a couple of years, of what kinds of queries are being sent to ChatGPT and Claude. You might have read my colleague Shayna Korol laying out the OpenAI findings in Wednesday’s Future Perfect newsletter, and I also recommend the summary posts by study coauthor and Harvard professor David Deming. But some big-picture, non-trivial things I’ve learned from the two reports are:
Uptake is skyrocketing: ChatGPT has gone from 1 million registered users in December 2022, to 100 million people using it at least weekly by November 2023, to over 750 million weekly active users now. If the number of messages sent to it keeps growing at the current pace, there will be more ChatGPT queries than Google searches by the end of next year (a back-of-the-envelope version of that extrapolation is sketched after this list).
Both OpenAI and Anthropic find that richer countries are using AI more than poor ones (no surprise there), but OpenAI intriguingly finds that middle-income countries like Brazil use ChatGPT nearly as much as rich ones like the US.
The biggest use cases for ChatGPT were “practical advice” like how-tos or tutoring/teaching (28.3% of queries), editing or translating or otherwise generating text (28.1%), and search engine-style information queries (21.3%). Anthropic uses different descriptive categories but finds that people using Claude.ai, the ChatGPT-like interface for its models, most commonly use it for computing and math problems (36.9% of usage), while an increasing share use it for “educational instruction and library” work (12.7%).
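To make the Google-versus-ChatGPT extrapolation concrete, here is a minimal back-of-the-envelope sketch in Python. The daily volumes and the growth multiple below are illustrative assumptions I have plugged in myself, not figures from either report; the point is only to show how a constant-growth crossover estimate works.

```python
import math

# Illustrative assumptions, not figures from the OpenAI or Anthropic reports:
chatgpt_daily = 2.5e9   # assumed ChatGPT messages per day today
google_daily = 14e9     # assumed Google searches per day (treated as flat)
annual_growth = 3.0     # assumed: ChatGPT message volume triples each year

# Under constant exponential growth, ChatGPT volume catches up when
# chatgpt_daily * annual_growth ** t == google_daily. Solving for t:
years_to_crossover = math.log(google_daily / chatgpt_daily) / math.log(annual_growth)

print(f"Crossover in roughly {years_to_crossover:.1f} years under these assumptions")
```

With these made-up inputs the crossover lands in about a year and a half; whether it actually arrives by the end of next year depends entirely on the real volumes and on growth staying this steep.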
What can’t we learn?
But I’m greedy. I don’t just want to know the first-order descriptive facts about how these models are used, even though those are the kinds of questions these papers, and the internal data that OpenAI and Anthropic collect more generally, can answer. The questions I really want answered about AI usage, and its economic ramifications, are more like:
Will human and AI labor be complements or substitutes for each other in five years? Ten years? Twenty?
Will wages go up because the economy is still bottlenecked on things only humans can do? Or will they collapse to zero because those bottlenecks don’t exist?
Will AI enable the creation of “geniuses in data centers” — AI agents doing their own scientific research? Will this lead the stock of scientific knowledge about the world to grow faster than ever before? Will that lead to explosive economic growth?
Many people are asking these questions, and an impressive amount of theoretical work on them has already been done in economics. I’ve found this set of lecture slides and paper citations on these subjects from the economist Philip Trammell very useful.
But that theoretical work mostly takes the form of “what are some concepts that we could use to make sense of what is happening or will shortly happen?” — it’s theory, that’s the point! — and thus leaves a greedy, impatient man like myself without good answers, or even particularly good guesses, to the questions above. This is where I want good empirical research to give me a sense of which theoretical frameworks correspond to reality on the ground.
My fear is that, for reasons the IBM parable explains, empirical details about how AI is being used now can mislead us about how it will be used in the future, and about its most important effects on our lives. If you cryogenically froze our IBM analyst in 1956 and resurrected them today to analyze the OpenAI and Anthropic reports, what would they say about the more speculative questions above?
They might point to the fact that the ChatGPT study found about half of all messages correspond to a pretty small number of “work activities,” as tracked by the Department of Labor, like “documenting/recording information” and “making decisions and solving problems.” Those are big categories for sure, but people have to do a lot else in their work that doesn’t fall under them. Our IBM analyst might conclude that AI is only automating a pretty small share of work tasks, meaning that human and AI labor will complement each other going forward.
Then again, the analyst could look at the Anthropic report, which found that among businesses using Anthropic’s API to build their own specific Claude-enabled routines, “automation” use cases (where you simply tell Claude to do something and it completes the whole task, perhaps with periodic human feedback) are vastly more common than “augmentation” use cases (where you ask Claude for feedback or for learning, and work in concert with it). Augmentation still makes up a bigger share of usage on the Claude.ai website, but the automation share is growing there too. Our analyst might look at this and conclude that AI and human labor will wind up as substitutes, since Claude users are treating it less as a sidekick than as an agent doing work on its own.
All of these conclusions would be, I think, premature to the point of recklessness. That’s why, to their credit, the authors of both the OpenAI and Anthropic reports are very careful about what they do and don’t know, and what they can and cannot infer from their work. They’re not claiming these findings can tell us about the medium- or long-run effects of AI on labor demand, or the distribution of economic growth, or the professions that will be most affected by AI — even though that’s precisely what a lot of outside observers are doing.
Why AI is different from corn (I promise this makes sense)
So let me finish by focusing on something the reports do tell us that is, I think, crucially important. One of the oldest findings in the economics of innovation is that new technologies take time, often a long time, to “diffuse” through the economy.
The classic paper here is Zvi Griliches’s 1957 study of the spread of hybrid corn. Hybrid corn was not one specific product, but a particular approach to breeding corn seeds optimally for the specific soil of specific areas. Once a few farmers in a state adopted hybrid corn, subsequent uptake within that state was unbelievably fast, tracing the classic S-curve of adoption.
But while diffusion within individual states was fast, diffusion between states wasn’t. Why did Texas need a decade after the rise of hybrid corn in Iowa to realize that this could greatly increase yields? Why did adoption there seem to hit a much lower ceiling, of 60 to 80 percent usage, compared to near-universal uptake in Iowa? You also see these kinds of lags in cases like electricity, and in datasets covering a wide array of inventions.
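Griliches captured this pattern by fitting each state’s adoption data to a logistic curve with three parameters: a ceiling (the long-run adoption share), a midpoint (the year adoption reaches half that ceiling), and a rate (how steep the S is). Here is a minimal sketch in Python; the parameters are hypothetical values of my own choosing, loosely inspired by the Iowa-versus-Texas contrast rather than taken from Griliches’s fitted estimates.

```python
import math

def logistic_adoption(year, ceiling, midpoint, rate):
    """Griliches-style logistic diffusion curve: adoption share in a given year.

    ceiling  -- long-run share of acreage that ever adopts
    midpoint -- year adoption reaches half the ceiling
    rate     -- steepness of the S-curve
    """
    return ceiling / (1.0 + math.exp(-rate * (year - midpoint)))

# Hypothetical parameters (not Griliches's fitted values): an early-adopting
# state with a high ceiling, and a laggard that starts a decade later
# and tops out lower.
states = {
    "Iowa-like": {"ceiling": 0.95, "midpoint": 1939, "rate": 0.9},
    "Texas-like": {"ceiling": 0.70, "midpoint": 1950, "rate": 0.5},
}

for year in range(1934, 1961, 4):
    shares = ", ".join(
        f"{name}: {logistic_adoption(year, **params):5.1%}"
        for name, params in states.items()
    )
    print(f"{year}  {shares}")
```

Run it and the Iowa-like curve races from a few percent to near its ceiling within about a decade, while the Texas-like curve barely moves over the same window: fast diffusion within a state, slow diffusion between them.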
Something the Anthropic and OpenAI data tells us pretty clearly is that the diffusion lags for AI are, by historical standards, very short. Adoption of this tech has been rapid, indeed faster than earlier online products like Facebook or TikTok, let alone hybrid corn.
Past general-purpose technologies like electricity or computing took years or decades to diffuse through the economy, which limited their benefits for a time but also gave us time to adapt. We will likely not get that time this go-round.