How to fine-tune AI for prosperity
When Chad Syverson loads the US Bureau of Labor Statistics website these days looking for the latest data on productivity, he does so with a sense of optimism that he hasn’t felt in ages.
The numbers for the last year or so have been generally strong for various financial and business reasons, rebounding from the early days of the pandemic. And though the quarterly numbers are notoriously noisy and inconsistent, the University of Chicago economist is scrutinizing the data to spot any early clues that AI-driven economic growth has begun.
Any effect on the current statistics, he says, will likely still be quite small and won’t be “world-changing,” so he’s not surprised that signs of AI’s impact haven’t been detected yet. But he’s watching closely, with the hope that over the next few years AI could help reverse a two-decade slump in productivity growth that is undermining much of the economy. If that does happen, Syverson says, “then it is world changing.”
The newest versions of generative AI are bedazzling, with lifelike videos, seemingly expert-sounding prose, and other all too humanlike behaviors. Business leaders are fretting over how to reinvent their companies as billions flow into startups, and the big AI companies are creating ever more powerful models. Predictions abound on how ChatGPT and the growing list of large language models will transform the way we work and organize our lives, providing instant advice on everything from financial investments to where to spend your next vacation and how to get there.
But for economists like Syverson, the most critical question around our obsession with AI is how the fledgling technology will (or won’t) boost overall productivity, and if it does, how long it will take. Think of it as the bottom line to the AI hype machine: Can the technology lead to renewed prosperity after years of stagnant economic growth?
Productivity growth is how countries become richer. Technically, labor productivity is a measure of how much a worker produces on average; innovation and technology advances account for most of its growth. As workers and businesses can make more stuff and offer more services, wages and profits go up—at least in theory, and if the benefits are shared fairly. The economy expands, and governments can invest more and get closer to balancing their budgets. For most of us, it feels like progress. It’s why, until the last few decades, most Americans believed their standard of living and financial opportunities would be greater than those of their parents and grandparents.
But when productivity growth is flat or nearly flat, the pie is no longer growing. Even a 1% annual slowdown or speedup can spell the difference between a struggling economy and a flourishing one. In the late 1990s and early 2000s, US labor productivity grew at a healthy rate of nearly 3% a year as the internet age took off. (It grew even faster, well over 3%, in the booming years after World War II.) But since about 2005, productivity growth in most advanced economies has been dismal.
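To see why a single percentage point matters, consider how growth compounds. The sketch below is illustrative arithmetic only: the 25-year horizon, the index value of 100, and the 1% and 2% rates are assumptions chosen for the example, not figures from the productivity data.

```python
def output_per_worker(initial: float, annual_growth: float, years: int) -> float:
    """Compound an initial output-per-worker level at a constant annual growth rate."""
    return initial * (1.0 + annual_growth) ** years


# Hypothetical scenario: index output per worker to 100 and compare two growth paths.
HORIZON_YEARS = 25  # assumed horizon, roughly one generation

slow = output_per_worker(100.0, 0.01, HORIZON_YEARS)  # 1% a year
fast = output_per_worker(100.0, 0.02, HORIZON_YEARS)  # 2% a year

print(f"1% a year for {HORIZON_YEARS} years: {slow:.1f}")
print(f"2% a year for {HORIZON_YEARS} years: {fast:.1f}")
print(f"Gap: {100 * (fast / slow - 1):.0f}% more output per worker")
```

Under these assumed numbers, the faster path ends the period with roughly a quarter more output per worker, which is the kind of gap that separates a generation that feels better off from one that does not.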
There are various possible culprits to blame. But there is a common theme: The seemingly brilliant technologies invented over the last two decades, from the iPhone to ubiquitous search engines to all-consuming social media, have grabbed our attention yet failed to deliver large-scale economic prosperity.
In 2016, I wrote an article titled “Dear Silicon Valley: Forget Flying Cars, Give Us Economic Growth.” I argued that while Big Tech was making breakthrough after breakthrough, it was largely ignoring desperately needed innovations in essential industrial sectors, such as manufacturing and materials. In some ways, it made perfect financial sense: Why invest in these mature, risky businesses when a successful social media startup could net billions?
But such choices came with a cost in sluggish productivity growth. While a few in Silicon Valley and elsewhere became fabulously wealthy, at least some of the political chaos and social unrest experienced in a number of advanced economies over the last few decades can be blamed on the failure of technology to increase financial opportunities for many workers and businesses and expand vital sectors of the economy across different regions.
Some preach patience: The breakthroughs will take time to work through the economy but once they do, watch out! That’s probably true. But so far, the result is a deeply divided country where the techno-optimism—and immense wealth—oozing out from Silicon Valley seem relevant to only a few.
It’s still too early to know how things will shake out this time around—whether generative AI is truly a once-in-a-century breakthrough that will spur a return to financial good times or whether it will do little to create real widespread prosperity. Put another way, will it be like the harnessing of electricity and the invention of the electric motor, which led to an industrial boom, or more like smartphones and social media, which have consumed our collective consciousness without bringing significant economic growth?
For AI, particularly generative models, to have a greater economic impact than other digital advances over the last few decades, we will need to use the technology to transform productivity across the economy—even in how we generate new ideas. It’s a huge undertaking and won’t happen overnight, but we’re at a critical inflection point. Do we start down that path to broadly increased prosperity, or do the creators of today’s breakthrough AI continue to ignore the vast potential of the technology to truly improve our lives?
Cold water on (over)heated speculation
A series of studies over the last year show how generative AI can boost productivity for people doing various jobs. Economists at Stanford and MIT have found that those working in call centers are 14% more productive when using AI conversational assistance; notably, there was a 35% improvement in the performance of inexperienced and low-skilled workers. Another study showed that software engineers could code twice as fast with the technology’s help.
Last year, Goldman Sachs calculated that generative AI would likely boost overall productivity growth by 1.5 percentage points every year in developed countries and increase global GDP by $7 trillion over 10 years. And some predict that the effects will appear soon.
Anton Korinek, an economist at the University of Virginia, says the added growth has not yet shown up in the productivity numbers because it takes time for generative AI to diffuse throughout the economy. But he predicts a 1% to 1.5% boost to US productivity by next year. And if there continue to be breakthroughs in generative AI models—think ChatGPT5—the eventual impact could be “significantly higher,” says Korinek.
Not everyone is so bullish. Daron Acemoglu, an MIT economist, says his calculations are a “corrective against those who say that within five years the entire US economy is going to be transformed.” As he sees it, “generative AI could be a big deal. We don’t know yet. But if it is, we’re not going to see transformative effects within 10 years—it’s too soon. It will take time.”
In April, Acemoglu posted a paper predicting that generative AI’s impact on total factor productivity (TFP)—the portion that specifically reflects the contribution from innovation and new technologies—will be around 0.6% in total over 10 years, far less than Goldman Sachs and others expect. For decades, TFP growth has been sluggish, and he sees generative AI doing little to significantly reverse the trend—at least in the short term.
Acemoglu says he expects relatively modest productivity gains from generative AI because its Big Tech creators have largely had a narrow focus on using AI to replace people with automation and to enable “online monetization” of search and social media. To have a greater impact on productivity, he argues, AI needs to be useful for a far broader portion of the workforce and relevant for more parts of the economy. Critically, it needs to be used to create new types of jobs, not just to replace workers.
Acemoglu argues that generative AI could be used to expand the capabilities of workers by, for example, supplying real-time data and reliable information for many types of jobs. Think of an intelligent AI agent, but one versed on the intricacies of, say, factory-floor production. Yet, he writes, “these gains will remain elusive unless there is a fundamental reorientation of the [tech] industry, including perhaps a major change in the architecture of the most common generative AI models.”
It’s tempting to think that perhaps it’s simply a matter of tweaking today’s large foundation models with the appropriate data to make them widely useful for various industries. But in fact, we will need to rethink the models and how they can be more effectively deployed in a far broader range of uses.
Producing progress
Take manufacturing. For many years, it was one of the important sources of productivity gains in the US economy. It still accounts for much of the country’s R&D. And recent increases in automation and the use of industrial robots might suggest that manufacturing is becoming more productive—but that has not been the case. For somewhat mysterious reasons, productivity in US manufacturing has been a disaster since about 2005, which has played an outsize role in the overall productivity slowdown.
The promise of generative AI in reviving productivity is that it could help integrate everything from initial materials and design choices to real-time data from sensors embedded in production equipment. Multimodal capabilities could allow a factory worker to, say, snap a picture of a problem and ask the AI model for a solution based on the image, the company’s operating manual, any relevant regulatory guidelines, and vast amounts of real-time data from the machinery.
That’s the vision, at least.
The reality is that efforts to deploy today’s foundation models in design and manufacturing are in their very early days. Use of AI so far has been limited to “narrow domains,” says Faez Ahmed, an MIT mechanical engineer specializing in machine learning—think scheduling maintenance on the basis of data from a particular piece of equipment. In contrast, generative AI models could, in theory, be broadly useful for everything from improving initial designs with real data to monitoring the steps of a production process to analyzing performance data on the factory floor.
In a paper released in March, a team of MIT economists and mechanical engineers (including Acemoglu and Ahmed) identified numerous opportunities for generative AI in design and manufacturing, before concluding that “current [generative AI] solutions cannot accomplish these goals due to several key deficiencies.” Chief among the shortcomings of ChatGPT and other AI models are their inability to supply reliable information, their lack of “relevant domain knowledge,” and their “unawareness of industry-standards requirements.” The models are also ill designed to handle the spatial problems on manufacturing floors and the various types of data created by production equipment, including old machinery.
The biggest difficulty is that existing generative AI models lack the appropriate data, says Ahmed. They are trained on data scraped from the internet, and “it’s a lot more about cats and dogs and multimedia content rather than how do you actually operate a lathe machine,” he says. “The reason these models perform relatively poorly on manufacturing tasks is that they’ve never seen manufacturing tasks.”
Gaining access to such data is tricky because much of it is proprietary. “Some people are really scared that a model will take my data and run away with it,” he says. A related problem is that manufacturing requires precision and, often, adherence to strict industry or government guidelines. “If the systems are not precise and not trustworthy, people are less likely to use them,” he says. “And it’s a chicken-and-egg problem: because the models are not precise; because there is no data.”
The MIT researchers called for a “next generation” of AI models that would be tailored to manufacturing. But there is a problem: Creating a manufacturing-relevant AI that takes advantage of the power of foundation models will require close collaboration between industry and AI companies, and that’s something still in its nascent stage.
The lack of progress so far, says Ranveer Chandra, managing director of research for industry at Microsoft Research, “is not because people are not interested, or they don’t see the business value.” The holdup is finding ways to secure the data and make sure it is in a useful form and provides relevant answers to specific manufacturing questions.
Microsoft is pursuing several strategies. One is asking the foundation model to base its answers on a company’s proprietary data—say, a company’s operations manual and production data. A far more difficult but appealing alternative is fine-tuning the underlying architecture of the model to better suit manufacturing. Yet another approach: so-called small language models, which also can be trained specifically on the data from a company. Since they are smaller than foundation models like GPT-4, they need less computational power and can be more targeted to specific manufacturing tasks.
“But this is all research at this point,” says Chandra. “Have we solved it? Not yet.”
A gold mine of new ideas
Using AI to boost scientific discovery and innovation could have the greatest overall productivity impact over the long term. Economists have long recognized new ideas as the source of long-term growth, and the hope is that new AI tools could turbocharge the search for them. While improving the efficiency of, say, a call center worker could mean a one-time jump in productivity in that business, using AI to improve the process of inventing new technologies and business practices—to create useful new ideas—could lead to an enduring increase in the rate of economic growth as it reshapes the innovation process and the way research is done.
There are already tantalizing clues to AI’s potential.
Most notably, Google DeepMind, which defines its mission as “solving some of the hardest scientific and engineering challenges of our time,” says more than 2 million users have accessed its deep-learning AI system to predict protein folding. Many drugs target a particular protein, and knowing the 3D structure of such proteins—something that traditionally takes painstaking lab analysis—could be an invaluable step in creating new medicines. In May, Google released AlphaFold 3, claiming it “predicts the structure and interactions of all of life’s molecules” to help identify how various biomolecules alter each other, providing an even more powerful guide for finding new drugs.
Creators of AI models, including DeepMind and Microsoft Research, are also working on other problems in biology, genomics, and materials science. The hope is that generative AI could help scientists glean key information from the vast data sets common in these fields, making it easier and faster to, say, discover new drugs and materials.
We badly need such a boost. A few years ago, a team of leading economists wrote a paper called “Are Ideas Getting Harder to Find?” and found that it takes more and more researchers and money to find the kinds of new ideas that are key to sustaining technology advances. The problem, in technical terms, is that research productivity—the output of ideas given the number of scientists—is falling rapidly. In other words—yes, ideas are getting harder to find. We’ve generally kept up by adding more researchers and investing more in R&D, but overall US research productivity itself is in a deep decline.
To uphold Moore’s Law, which predicts that the number of transistors on a chip will double roughly every two years, the semiconductor industry needs 18 times more researchers than it had in the early 1970s. Likewise, it takes far more scientists to come up with roughly the same number of new drugs than it did a few decades ago.
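The arithmetic behind that claim is simple. Here is a minimal sketch, assuming research productivity is measured as the growth rate of output divided by the number of researchers (a simplification of the definition above) and using only the two figures quoted here: a doubling every two years and an 18-fold increase in researchers. The baseline of 1,000 researchers is hypothetical.

```python
def annual_growth_from_doubling(doubling_years: float) -> float:
    """Constant annual growth rate implied by doubling every `doubling_years` years."""
    return 2.0 ** (1.0 / doubling_years) - 1.0


def research_productivity(idea_growth: float, researchers: float) -> float:
    """Growth in output per researcher (assumed simplification of the measure)."""
    return idea_growth / researchers


# Moore's Law pace: transistor counts double roughly every two years.
growth = annual_growth_from_doubling(2.0)

BASELINE_RESEARCHERS = 1000.0  # hypothetical early-1970s head count
early = research_productivity(growth, BASELINE_RESEARCHERS)
today = research_productivity(growth, 18 * BASELINE_RESEARCHERS)

decline_factor = early / today
print(f"Implied annual growth in transistor counts: {growth:.1%}")
print(f"Research productivity has fallen by a factor of {decline_factor:.0f}")
```

Because the pace of doubling is held constant while the head count rises 18-fold, productivity per researcher under this measure falls by the same factor of 18. The same output now simply costs far more effort.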
Could AI dream up safe and effective new drugs and find astonishing new materials for computation and clean energy?
John Van Reenen, a professor at the London School of Economics and one of the authors of the paper, knows it’s still too early to see any real change in the productivity data from AI, but he says, “The hope is that [it] can make some difference.” AlphaFold is “a poster child” for how AI can change science, he says, and “the question is whether this can go from anecdotes to something more systematic.”
The ambition is not only to supply various tools that will make the lives of scientists easier, like automated literature search, but for AI itself to come up with original and useful scientific ideas that would otherwise evade researchers. In that vision, AI dreams up new compounds that are more effective and safer than existing drugs, and astonishing materials that expand the possibilities of computation and clean energy. The goal is especially compelling because the universe of potential molecules is virtually unlimited. Navigating such a nearly infinite space and exploring the vast number of possibilities is what machine learning is especially good at.
But don’t hold your breath for AI’s Thomas Edison moment. Though the scientific popularity of AlphaFold has raised expectations for the potential of AI, it is still very early days in turning the research into actual products—whether new drugs or novel materials. In a recent analysis, a team of MIT scientists put it this way: “Generative AI has undoubtedly broadened and accelerated the early stages of chemical design. However, real-world success takes place further downstream, where the impact of AI has been limited so far.”
In fact, the process of turning the intriguing scientific advances in using AI into actual, useful stuff is still very much in its infancy.
It’s a material world
Perhaps nowhere is the excitement over AI’s potential to transform research greater than in the often neglected field of materials discovery. The world desperately needs better materials. We need them for cheaper and more powerful batteries and solar cells, and for new types of catalysts that would make cleaner industrial processes possible; we need practical high-temperature superconductors to revolutionize how we transport electricity.
So when DeepMind said it had used deep learning to discover some 2.2 million inorganic crystals—including some 380,000 predicted to be stable and promising candidates for actual synthesis—the report was greeted with great excitement, especially in the AI community. A materials revolution! It seemed like a gold mine of new stuff—“an order-of-magnitude expansion in stable materials known to humanity,” wrote the DeepMind researchers in Nature. The DeepMind database, called GNoME (an acronym for “graph networks for materials exploration”), is “equivalent to 800 years of knowledge,” according to the company’s media release.
But in the months after the paper, some researchers disputed the hype. Materials scientists at the University of California, Santa Barbara, published a paper in which they reported finding “scant evidence” that any of the structures in the DeepMind database fulfilled the “trifecta of novelty, credibility, and utility.”
For some tasked with finding new materials, the huge databases of possible inorganic crystals, many of which may not be stable enough to actually exist, seem like a distraction. “If you spam us with 400,000 new materials and we don’t even know which one of those are realistic, then we don’t know which one of those will be good for a battery or catalyst or whatever you want to make them. Then this information is not useful,” says Leslie Schoop, a chemist at Princeton who co-wrote a paper critiquing an effort to make some of the structures predicted by the DeepMind database in an autonomous lab using robots and machine learning.
To be clear, this doesn’t mean that AI won’t prove to be important in materials science and chemistry. Even critics say they are excited by the long-term possibilities. But the criticisms hint at just how early we are in using AI to tackle the daunting task of materials discovery and making it a reliable tool for finding new compounds that are better than existing ones.
It’s extremely expensive and time-consuming to make and test any possible new material. What industrial researchers really need are reliable clues pointing to materials that are predictably stable, can be synthesized, and likely have intriguing properties, including being cheap to make.
The GNoME database probably includes interesting compounds, say its DeepMind scientific creators. But they acknowledge it’s only a preliminary step in showing how AI could help in materials discovery. Much work remains to broaden its usefulness.
Ekin Dogus Cubuk, a Google research scientist and coauthor of the Nature paper, describes the work it reports as an advance in predicting a large number of possible inorganic crystals that are stable, based on quantum-mechanical calculations, at absolute zero, where atomic motion comes to a standstill. Such predictions could be useful for those running computational simulations of new materials—a very early stage of materials discovery.
But, he says, machine learning has not yet been used to predict crystals that are stable at room temperature. After that is achieved comes the goal of using AI to predict how structures can be synthesized in the lab, and eventually how to make them at larger scale. All that must be done before machine learning can really transform the lengthy and expensive process of coming up with new materials, he says.
For those hoping that AI models could boost economic productivity by transforming science, one lesson is clear: Be patient. Such scientific advances could well have an impact one day. But it will take time—likely measured in decades.
The Solow paradox
As senior vice president for research, technology, and society at Google, James Manyika is unsurprisingly enthusiastic about the huge potential for AI to transform the economy. But he is far from an unabashed cheerleader, mindful of the lessons gleaned from his years of studying how technologies affect productivity.
Before joining Google in 2022, Manyika spent several decades as a consultant, a researcher, and finally chairman of the McKinsey Global Institute, the economic research arm of the consulting giant. At McKinsey he became a leading authority on the link between technology and economic growth, and he counts Robert Solow—the MIT economist who won the 1987 Nobel Prize for explaining how technological advances are the main source of productivity growth—as an early mentor.
Among the lessons from Solow, who died late last year at the age of 99, is that even powerful technologies can take time to affect economic growth. In 1987, Solow quipped: “You can see the computer age everywhere but in the productivity statistics.” At the time, information technology was undergoing a revolution, most visible with the introduction of the personal computer. Yet productivity, as measured by economists, was sluggish. This became known as the Solow paradox. It wasn’t until the late 1990s, decades after the birth of the computer age, that productivity growth began to finally pick up.
History has taught Manyika to be circumspect in predicting how and when the overall economy will feel the impact of generative AI. “I don’t have a time frame,” he says. “The estimates [of productivity gains] are generally spectacularly large, but when it comes to a question of time frame, I say ‘It depends.’”
Specifically, he says it depends on what economists call “the pace of diffusion”—basically, how quickly users take up the technology both within sectors and across sectors. It also hinges on the ability of various users, especially businesses in the largest sectors of the economy, to “[reorganize] functions and tasks and processes to capitalize on the technology” and to make their operations and workers more productive. Without those pieces, we’ll be stuck in “Solow paradox land,” says Manyika.
“Tech can do whatever tech wants, and it doesn’t really matter from a labor productivity standpoint,” he says, since its workforce is relatively small. “We have to have changes happen in the largest sectors before we can start to see productivity gains at an economy level.”
Late last year, Manyika co-wrote a piece in Foreign Affairs called “The Coming AI Economic Revolution: Can Artificial Intelligence Reverse the Productivity Slowdown?” In it, the authors offered a decidedly optimistic though cautious answer.
“By the beginning of the next decade, the shift to AI could become a leading driver of global prosperity,” they wrote, because it has the potential to affect “just about every aspect of human and economic activity.” They added: “If these innovations can be harnessed, AI could reverse the long-term declines in productivity growth that many advanced economies now face.” But it’s a big if, they acknowledged, saying it “won’t happen on its own” and will require “positive policies that foster AI’s most productive uses.”
The call for policies is a recognition of the immense task ahead, and an acknowledgment that even giant AI companies like Google can’t do it alone. It will take widespread investments in infrastructure and additional innovations by governments and businesses.
Companies ranging from small startups to large corporations will need to take the foundation models, such as Google’s Gemini, and “tailor them for their own applications in their own environments in their own domains,” says Manyika. In a few cases, he says, Google has done some of the tailoring, “because it’s kind of interesting to us.”
For example, Google released Med-Gemini in May, using the multimodal abilities of its foundation model to help in a wide range of medical tasks, including making diagnostic decisions based on imaging, videos of surgeries, and information in electronic health records. Now, says Manyika, it’s up to health-care practitioners and researchers to “think how to apply this, because we’re not in the health-care business in that way.” But, he says, “it is giving them a running start.”
But therein lies the great challenge going forward if AI is to transform the economy.
Despite the fanfare around generative AI and the billions of dollars flowing to startups around the technology, the speed of its diffusion into the business world is not all that encouraging. According to a survey of thousands of businesses by the US Census Bureau, released in March, the proportion of firms using AI rose from about 3.7% in September 2023 to 5.4% this February, and it is expected to reach around 6.6% by the end of the year. Most of this uptake has come in sectors like finance and technology. Industries like construction and manufacturing are virtually untouched. The main reason for the lack of interest: what most companies see as the “inapplicability” of AI to their business.
For many companies, particularly small ones, it still takes a huge leap of faith to bet on AI and invest the money and time it takes to reorganize business functions around it. In addition to not seeing any value in the technology, lots of business leaders have ongoing questions over the reliability of the generative AI models—hallucinations are one thing in the chat room but quite something else on the manufacturing floor or in a hospital ER. They also have concerns over data privacy and the security of proprietary information. Without AI models more tailored to the needs of various businesses, it’s likely that many will stay on the sidelines.
Meanwhile, Silicon Valley and Big Tech are obsessed with intelligent agents and with videos created by generative AI; individual and corporate fortunes are being amassed on the promise of turbocharging smartphones and internet searches. As in the early 2010s, much of the rest of the economy is being left out. Those sectors are benefiting neither from the financial rewards of the technology nor from its ability to expand large sectors and make them more productive.
Maybe it’s too much to expect Big Tech to change, to suddenly care about using its massive power to benefit sectors such as manufacturing. After all, Big Tech does what it does.
And it won’t be easy for AI companies to rethink their huge foundation models for such real-world problems. They will need to engage with industry experts from a wide variety of sectors and respond to their needs. But the reality is that the big AI companies are the only organizations with the vast computational power to run today’s foundation models and the talent to invent the next generations of the technology.
So like it or not, in dominating the field, they have taken on the responsibility for its broad applicability. Whether they will shoulder that responsibility for all our benefit or (once again) ignore it for the siren song of wealth accumulation will eventually reveal itself—perhaps initially in those often nearly indecipherable quarterly numbers from the US Bureau of Labor Statistics website.