Why is DeepSeek causing global technology shockwaves?
Matt Sheehan: DeepSeek is a Chinese AI startup that recently released a series of very impressive generative AI models. One of those models, DeepSeek R1, is a “reasoning model” that takes its time to think through an extended chain of logic before it gives an answer. This type of reasoning is a relatively new paradigm that was pioneered by OpenAI last year, and it is viewed by many as the most promising way forward for AI research. In terms of performance, DeepSeek’s new model is roughly on par with OpenAI’s o1 model from last September.
The shock here comes from how DeepSeek did it: quickly, cheaply, and openly. DeepSeek had finished an initial version of R1 just a couple of months after OpenAI’s release, far faster than Chinese companies were able to catch up to U.S. models in previous years. Perhaps most striking was that DeepSeek was able to achieve this performance using far less computing power—a key input for training a model—than U.S. companies. That extraordinary efficiency is likely a knock-on effect of U.S. export controls on chips: Chinese companies have been forced to get very creative with their limited computing resources. And finally, DeepSeek released its model in a relatively open-source way, allowing anyone with a laptop and an internet connection to download it for free. That has thrown into doubt lots of assumptions about business models for AI companies and led to the turmoil in U.S. stock markets.
Sam Winter-Levy: Just to give you a sense of DeepSeek’s efficiency, the company claims it trained its model for less than $6 million, using only about 2,000 chips. That’s an order of magnitude less money than what Meta, for example, spent on training its latest system, which used more than 16,000 chips. Now, DeepSeek’s cost estimate almost certainly captures only the marginal cost: It ignores the company’s expenditures on building data centers, buying the chips in the first place, and hiring a large technical team. But regardless, it’s clear that DeepSeek managed to train a highly capable model more efficiently than its U.S. competitors.
Investors seem to be especially concerned about the prospects for leading chip companies, such as Nvidia. Why?
Matt Sheehan: DeepSeek has shown that it takes much less compute to train a leading AI model than was believed beforehand. If you assume that demand for AI will remain constant, then this lower need for compute would translate into less revenue for chip companies than previously projected.
But it could end up having the opposite effect. If DeepSeek makes accessing these AI models much more affordable, then that could end up increasing total demand for AI services, leading to much more revenue for chip companies. The long-term impacts on compute demand remain deeply uncertain, but the valuation of companies like Nvidia has been growing at such an extraordinary rate for the past few years that a comedown isn’t too surprising.
Sam Winter-Levy: That’s right. The tech giants have been making extraordinary capital expenditures over the past couple of years. Just last week, for example, OpenAI, SoftBank, and Oracle announced a joint venture called Stargate to build at least $100 billion in computing infrastructure, and perhaps as much as $500 billion over four years. At some point they will need to generate the revenue to justify these levels of investment. DeepSeek’s efficiency and its open availability suggested that perhaps the position of the leading U.S. tech giants was less secure than the markets had thought, that Nvidia’s prospects as the provider of vast quantities of chips for the world were somewhat less rosy than anticipated, and that we might be witnessing an overbuilding of AI infrastructure.
As Matt said, in the long run lower costs could drive greater usage, which means we’ll still require vast quantities of computing power and data centers. But it’s not historically unusual for a technology revolution to be accompanied by a lot of turbulence in the stock market and by the incineration of capital, even as cost reductions unleash new waves of innovation. That explains some of what we’ve seen over the last few days with the reaction to DeepSeek.
In its final months, the Biden administration rolled out a series of increasingly stringent export controls on AI chips, particularly directed at China. Does this mean that export controls don’t matter anymore?
Sam Winter-Levy: Almost certainly not. Although DeepSeek has shown that you can use smaller numbers of chips than expected to train an impressive model, it would still benefit from having access to more computing power. The company’s CEO has explicitly said that access to computing power is its primary obstacle. The more chips you have, the more experiments you can run, the more data you can generate, and the more widely you can deploy your most capable models. DeepSeek has shown that with this new class of reasoning model, you can achieve an impressive performance with a small amount of compute. But you can almost certainly achieve vastly more with a large amount of compute! So access to chips will remain a key driver of success in the AI race moving forward.
It’s worth emphasizing that DeepSeek acquired most of the chips it used to train its model back when selling them to China was still legal. Although the export controls were first introduced in 2022, they only began to have a real effect in October 2023, and the latest generation of Nvidia chips has only recently begun to ship to data centers. As these newer, export-controlled chips are increasingly used by U.S. companies, we could see a gap reopen in the United States’ favor. And while DeepSeek’s achievement does cast doubt on the most optimistic theory of export controls—that they could prevent China from training any highly capable frontier systems—it does nothing to undermine the more realistic theory that export controls can slow China’s attempt to build a robust AI ecosystem and roll out powerful AI systems throughout its economy and military. After all, the amount of computing power it takes to build one impressive model and the amount of computing power it takes to be the dominant AI model provider to billions of people worldwide are very different amounts. So access to cutting-edge chips remains crucial.
Of course, this all depends on the U.S. government continuing to tighten loopholes in the export control regime to prevent Chinese chip smuggling, which is one reason why, in its last weeks in office, the Biden administration introduced its sweeping new diffusion framework to govern the sale of chips worldwide.
How might this news reshape AI development and regulation in China?
Matt Sheehan: The DeepSeek models have really boosted the confidence of the Chinese government. But that confidence can be a double-edged sword when it comes to AI policy, one that could have unintended side effects for the industry. How secure the Chinese government feels in the country’s own AI capabilities has a major impact on policy, and the past four years have been a roller-coaster ride for those confidence levels.
Before the release of ChatGPT in late 2022, the Chinese government believed it had largely caught up with, or maybe even jumped ahead of, the United States in AI. Perhaps counterintuitively, that confidence helped contribute to the Chinese tech crackdown of 2020-2022, because the government felt it could afford to be a bit more heavy-handed with those companies without fear of falling behind the United States.
The debut of ChatGPT shook that confidence, and it led the Chinese government to ease up substantially on how strictly it regulated generative AI. China’s regulations on AI are still far more burdensome than anything in the United States, but there was a relative softening compared to the worst days of the tech crackdown. DeepSeek and several other leading Chinese frontier AI companies emerged during this period.
Now DeepSeek’s success has endeared it to the government, and that success will probably give the company access to far more resources. But if we look a little further out, as the Chinese government regains confidence in its AI competitiveness, it might once again decide that it’s time for more direct and heavy-handed control over the industry.
Could this announcement move the United States and China closer to an AI arms race?
Matt Sheehan: In many ways the two are already fully locked in a “race” to capitalize on AI, but the exact shape of that race and the end goals continually change. Are we racing to see who can build the single biggest and most powerful model, perhaps even artificial general intelligence (AGI)? Or is the real race to see who can build the really useful and cost-effective models that will be used by people and companies around the world?
Lots of the leading U.S. AI companies—especially frontier AI startups such as OpenAI and Anthropic—are very focused on winning the race to build AGI. Chinese companies talk about this as well, but for a variety of reasons they tend to focus more on the second race: building models that can make money today. DeepSeek’s founder is himself committed to pursuing AGI, but the release of such a cost-efficient model has shifted a lot more attention onto that second race. It remains an open technical and political question which of these races will end up mattering the most.
Sam Winter-Levy: I think this will only intensify the race dynamics. Both countries are likely to ramp up their efforts to innovate—China because it now knows it can compete, the United States in response to what some advisers to President Donald Trump have called a “Sputnik moment,” which will probably further intensify the administration’s appetite for deregulation, capital expenditures, and energy buildout. But even as this race intensifies, neither state is likely to achieve a lasting monopoly on extremely powerful AI systems. We should probably start thinking about how to peacefully navigate a world in which multiple major powers have access to highly capable and potentially dangerous AI systems.
What will you be looking for in the coming weeks in response to this news?
Matt Sheehan: On the technology side, I’ll be watching to see if more companies—both U.S. and Chinese—are able to roughly replicate DeepSeek’s results. Historically this has been the pattern: one company or lab makes a major AI breakthrough that shows the way forward, and then lots of other labs follow quickly on their heels. DeepSeek is the first company to replicate OpenAI’s o1, and it did it at a much lower cost, but it’s very unlikely it will be the last one.
I’ll also be watching for the release, likely in a limited form, of OpenAI’s o3 model. This o3 model is the immediate successor to OpenAI’s o1 (the model DeepSeek matched in performance), and early reports are that o3 dramatically outperforms o1. If that proves true, then the clock is ticking to see if DeepSeek can match that performance, or if issues with access to compute or algorithmic innovation get in the way.
Sam Winter-Levy: I’ll be watching the implications for the export control debate. DeepSeek’s success has turbocharged the narrative that export controls don’t matter or that they may have even backfired, and there will probably be renewed calls from industry to relax them—or at the very least to scrap the Biden administration’s diffusion framework, which sought to use the United States’ AI lead to set the global terms for the market. As I said earlier, I think that view of export controls misses some of the logic behind them and their likely future impact, and the diffusion framework relied in large part on U.S. advantages in manufacturing chips, not just in developing world-beating models—an advantage that remains robust, for now.
I don’t think the Trump administration will respond to a Chinese technological breakthrough by allowing the Chinese to buy more advanced American chips, but I’ll be looking to see if these narratives lead to a relaxation of some of these policies. The recurring tension in debates over export controls is between national security hawks who want to curtail technology flows and U.S. corporations that want to retain access to foreign business opportunities, and which of those constituencies will gain more influence in the new administration is anyone’s guess.