
China’s Views on AI Safety Are Changing—Quickly

AI safety is moving up Beijing’s priority list, but it remains tied up in geopolitical competition and the drive for technological advancement.

Published on August 27, 2024

Over the past two years, China’s artificial intelligence (AI) ecosystem has undergone a significant shift in how it views and discusses AI safety. For many years, some of the leading AI scientists in Western countries have been warning that future AI systems could become powerful enough to pose catastrophic risks to humanity. Concern over these risks—often grouped under the umbrella term “AI safety”—has sparked new fields of technical research and led to the creation of governmental AI safety institutes in the United States, the United Kingdom, and elsewhere. But for most of the past five years, it was unclear whether these concerns about extreme risks were shared by Chinese scientists or policymakers.

Today, there is mounting evidence that China does indeed share these concerns. A growing number of research papers, public statements, and government documents suggest that China is treating AI safety as an increasingly urgent concern, one worthy of significant technical investment and potential regulatory interventions. Momentum around AI safety first began to build within China’s elite technical community, and it now appears to be gaining some traction in the country’s top policy circles. In a potentially significant move, the Chinese Communist Party (CCP) released a major policy document in July 2024 that included a call to create “oversight systems to ensure the safety of artificial intelligence.”

There remain major open questions about the specific contours of China’s concerns over AI safety and what it intends to do about them. But the growing political and technical salience of these issues is significant for AI safety and governance globally. China is the key competitor for the United States in advanced AI, and that competition is a core dynamic shaping AI development globally. China’s leaders are acutely concerned with falling further behind the United States and are pushing hard to catch up in advanced AI. How China approaches building those frontier AI systems—the risks it sees and the safeguards it builds in—will influence the safety of systems built in China and around the world.

Contested Science and Translation Troubles

Before examining the evidence of this shift in China, it is important to note that, globally, AI safety remains a deeply contested concept on both a technological and political level. AI scientists and engineers are divided over the likelihood that powerful AI systems could one day pose a threat to humanity and, if so, on what timeline. And in AI policy circles, some view the focus on speculative catastrophic risks as a distraction from present-day AI harms, from biased sentencing algorithms to AI-driven targeting for weapons. This piece will not attempt to adjudicate these issues, focusing instead on developments in how Chinese actors approach the question of catastrophic risks from AI.

Parsing Chinese discussions of AI safety is further complicated by some unfortunate linguistic ambiguity. In Chinese, the word for safety (anquan) can also mean security, depending on the context. This often blurs the line between the types of extreme risks that AI safety connotes in global discourse and a much broader range of security issues in the Chinese context, from the cybersecurity of AI models to content security for sensitive political topics. For this reason, some analysts choose to use safety/security when translating certain ambiguous uses of anquan into English. This piece will follow this practice for ambiguous uses of the term and will employ either safety or security when the meaning appears to be clear from context.

Despite these technological uncertainties and linguistic pitfalls, there is a growing body of evidence to suggest that Chinese scientists and public officials are undergoing a significant shift in how they view AI safety.

Early Signs of Concern

Beginning around 2020, concerns about frontier AI safety began to surface within China’s elite scientific and scholarly community. Some of the country’s most prominent computer scientists and influential policy advisers began publicly discussing safety risks and conducting technical research into mitigations.

One notable early example came from Wen Gao, the director of a major Chinese AI research lab and a dean at Peking University. In 2021, Gao coauthored a research paper about potential risks from artificial general intelligence (AGI) and the technical countermeasures needed to control them. The paper highlighted the potential ability of an AGI system to recursively self-improve, leading to an “intelligence explosion” in which the system far surpasses human cognitive abilities. Gao and his coauthors argued that in such a situation the “default result will inevitably be catastrophic” unless sufficient countermeasures are taken.

These arguments are not especially novel—similar claims have been regularly cited and hotly debated in the global AI research community. But Gao’s paper was notable due to his influence in both technical and political circles in China. Gao serves on several high-level science and technology advisory bodies for the Chinese government, and he is the only scientist who has led the Politburo of the CCP in a “collective study session” on AI. Gao’s paper and his subsequent lectures on the topic were an early sign that some of the most influential scientists in China were thinking about catastrophic risks from AI.

But it initially remained unclear whether those concerns were making their way from scientific research into government deliberations on AI. For years, senior CCP leaders have used the phrase “ensure AI is safe/secure, reliable and controllable,” but the context suggested that the focus was on national security and sovereign control over AI rather than the technical safety of the systems themselves.

The first hint that specific concerns over frontier AI safety were entering policy discourse came in September 2021, when the official AI governance expert committee for the Ministry of Science and Technology (MOST) issued a document titled “Ethical Norms for New Generation Artificial Intelligence.” One of the six ethical norms in the document included a call to “ensure that AI is always under human control,” a central concern in AI safety circles. The document was notable for this reference, but it was a form of soft law issued by a single ministry and didn’t necessarily represent the widely held views of the CCP leadership.

Over the following year, China rolled out its first pair of AI-focused regulations, targeting recommendation algorithms and deepfakes. These were issued by the Cyberspace Administration of China (CAC)—the CCP’s main internet regulator—and they focused on the role of AI in creating and disseminating content online. These regulations created a handful of regulatory tools for governing AI, including requirements that providers complete “safety/security evaluations” on their algorithms. But the goal of those evaluations was to prevent the spread of content that was politically sensitive or infringed on other economic or social rights of Chinese citizens. For the time being, concerns about extreme risks remained on the margins of Chinese technical discussions and policy debates.

ChatGPT Changes the Conversation

The debut of ChatGPT in late 2022 brought AI safety concerns into the limelight. During the first half of 2023, Chinese technical and policy conversations around AI safety ramped up quickly. In June, the Beijing Academy of Artificial Intelligence hosted a full-day forum on frontier AI safety, bringing some of China’s most prominent computer scientists into conversation with their peers from Western countries. During this period, several senior figures in China’s AI community spoke publicly about AI safety concerns. These included the dean of the Beijing Academy of Artificial Intelligence, the former president of Baidu, and, perhaps most importantly, Tsinghua University’s Andrew Yao.

Yao is a giant in the field of computer science. Born in China, Yao spent thirty years at top U.S. universities, where his work won him a Turing Award, often called the Nobel Prize for computer science. In 2004, he returned to China as a dean at Tsinghua University, where he created the so-called Yao Class, an extremely selective program whose graduates have gone on to found leading AI startups and conduct research at top universities. Yao is perhaps the most respected computer scientist in China, and his word carries significant weight in academic and policy circles.

Beginning around 2020, Yao referenced issues related to AI safety in public talks. But the frequency and urgency of his remarks on the subject picked up in 2023. Some of Yao’s most high-profile public statements were made jointly with his peers in the international AI community. In October 2023, Yao joined fellow Turing Award winners and AI pioneers Geoffrey Hinton and Yoshua Bengio in coauthoring the paper “Managing extreme AI risks amid rapid progress.” Joining Yao on the paper was Lan Xue, one of China’s most influential scholars of technology policy and the chair of the MOST advisory body that produced the 2021 document on AI ethical principles. The paper bluntly warned that “unchecked AI advancement could culminate in a large-scale loss of life and the biosphere, and the marginalization or extinction of humanity.” It outlined key areas of technical research needed to ensure safety, and called on major tech companies to dedicate one-third of their AI research and development budgets to safety research. Shortly after publication of the paper, Yao and three other prominent Chinese computer scientists joined international peers at the first meeting of the International Dialogues on AI Safety (IDAIS) and signed a joint statement on strategies for mitigating frontier AI risks.

In China, senior scientists and academics often have a stronger voice in policy discussions than their peers in the United States. Whereas AI policy debates in the United States are often dominated by executives from technology companies, Chinese business leaders show neither the same eagerness nor the same ability to shape public debate. Instead, academics tend to populate governmental advisory bodies and are consulted on policy development. With many of the country’s elite scientists warning of these risks, the CCP and the Chinese government began issuing clearer and stronger statements on AI safety.

In October 2023, Chinese President and CCP General Secretary Xi Jinping introduced the Global AI Governance Initiative, a short document laying out China’s core principles for the international governance of AI. It included a call to “ensure that AI always remains under human control.” That echoed the language from the earlier MOST document on ethical principles, enshrining it as a principle endorsed by top CCP leadership. A few weeks later, China sent then MOST Vice Minister Wu Zhaohui—a former AI researcher at a top Chinese university—to attend the United Kingdom’s AI Safety Summit. There, China joined twenty-eight other parties—including the United States and the European Union—in signing the summit’s Bletchley Declaration, which focused on risks from frontier AI systems and the need for international cooperation to address them.

The decisions to attend the summit and to sign the declaration were significant markers but still not definitive evidence that the highest levels of CCP leadership were placing real emphasis on AI safety. The Bletchley Declaration didn’t place any new requirements on its signatories, and China had plenty of other reasons to attend. Chinese officials have repeatedly expressed concern that China could be cut out of international governance regimes. Whether or not Chinese leaders fully shared the concerns articulated in the declaration, signing it was a way to ensure that Beijing didn’t lose its seat at the proverbial table. It remained to be seen whether these actions reflected genuine concern or simply geopolitical maneuvering.

Over the course of 2024, discussions of AI safety continued to gain traction across China’s technical and policy circles. In March, the second IDAIS dialogue between senior Chinese and Western scientists was held in Beijing, producing a joint statement on recommended red lines that frontier AI development should not cross. Signatories included Yao and other prominent scientists but also representatives from industry, such as Peng Zhang, the CEO of the top-tier Chinese large-model startup Zhipu AI. Two months later, Zhipu became the first Chinese company to sign the Frontier AI Safety Commitments at the AI Seoul Summit in South Korea, joining many leading international AI companies in pledging to implement safeguards on their large models. That summit also saw the release of the International Scientific Report on the Safety of Advanced AI, a report commissioned by the UK government and led by Yoshua Bengio, on which Yao and another leading Chinese scientist, Ya-Qin Zhang, served as senior advisers.

Alongside these high-level statements of concern from senior figures, Chinese scientists were also ramping up technical research into frontier AI safety problems. These papers addressed many core AI safety questions around alignment and robustness of frontier systems and often built on technical safety work done at Western universities. Much of this work emerged from research groups at top-tier Chinese universities, with relatively little safety work published by China’s leading AI companies—a notable contrast with their U.S. counterparts.

This focus on AI safety found its largest platform in July 2024 at the World AI Conference (WAIC) in Shanghai, China’s most high-profile annual AI gathering. In public talks and closed-door discussions, senior figures from China’s AI ecosystem discussed the technical and regulatory underpinnings of AI safety. Assessments of the problem and recommendations for action were not uniform. While Wen Gao described AGI as a distant possibility, Andrew Yao and Bowen Zhou, the latter making his public debut as director and chief scientist of the state-affiliated Shanghai AI Lab, argued for an urgent rebalancing of AI research toward safety.

Shanghai AI Lab has emerged as one of the country’s most prolific producers of technical research on AI safety, and it has also begun producing accompanying policy papers. At WAIC, the lab released a report on “AI Safety as Global Public Goods,” coauthored with a group of influential policy advisers. The report called for governments to treat the outputs of AI safety research—risk assessment tools and resource libraries—as public goods, to be publicly funded and widely disseminated through international networks of safety-focused institutions. Taken together, these discussions showcased the increasingly rich and mature debate about AI safety that had gathered momentum over the preceding two years.

A Call for Oversight

A few weeks later, the CCP leadership made a brief, but potentially consequential, statement on AI safety in a key policy document. The document was the decision of the Third Plenum, a once-every-five-years gathering of top CCP leaders to produce a blueprint for economic and social policy in China. In a section on threats to public safety and security, the CCP leadership called for the country to “institute oversight systems to ensure the safety of artificial intelligence,” according to the official English translation released by the CCP. A more direct translation of the original Chinese text might render the clause as: “Establish an AI safety supervision and regulation system.”

The call for overseeing (or regulating) AI safety in such a major policy document was significant, but it raised as many questions as it answered. What kinds of threats does the CCP have in mind when it refers to AI safety? And what form of oversight is it envisioning for these systems? We can gain some insight from the document itself and from official commentaries on it.

The first clue comes from where this clause is located within the Third Plenum document. It falls under a large section covering national security risks, in a subsection on “public security governance.” The subsection calls for improving China’s emergency response systems, including “disaster prevention, mitigation, and relief.” AI safety is listed after other major threats to public health and safety, including food and drug safety and “monitoring, early warning, and risk prevention and control for biosafety and biosecurity.” The call for an AI safety oversight system occurs immediately after a call to “strengthen the cybersecurity system,” though that provision remains vague. Given this context, it appears the AI safety risks the CCP is referring to are large-scale threats to public safety, akin to natural and industrial disasters or public health threats. We can say with confidence that AI safety here is not referring to CCP concerns about the content created by generative AI; those are mentioned briefly in a separate section.

In the weeks following the release of the document, the CCP published documents and articles further elaborating on the decision’s meaning. These gave mixed signals on the scope and meaning of AI safety. An official explainer accompanying the decision referred broadly to AI’s risks to employment, privacy, and the “norms of international relations,” without mentioning the types of frontier AI safety risks common in Western discourse. A follow-up op-ed in the People’s Daily, the official newspaper of the CCP Central Committee, pointed more directly at those risks. It stressed the potential safety risks of large models, calling for research into “frontier safety technology” and the development of advanced technologies for the “safety and controllability of general-purpose large models” (author’s translation).

Taken together, the plenum decision and follow-up documents present a mixed picture of how the CCP is thinking about AI safety. They suggest that the CCP is viewing AI safety risks in the context of large-scale threats to public safety, and that it is planning to take action to mitigate those risks. But exactly which risks the CCP has in mind, and the technical and regulatory interventions needed to mitigate them, remain unclear for the time being. Getting that clarity will require seeing how the Chinese government acts on these high-level calls for AI safety mechanisms.

One indicator to watch is whether China further empowers, or newly creates, institutions focused on frontier AI safety. There are already a handful of prominent labs—some state-affiliated—that are focused on technical evaluations and safety work. The evaluations from these labs could be incorporated into regulations requiring model developers to test for certain risks. An even stronger signal would be the creation of an institution, or set of institutions, that would play a role analogous to the AI Safety Institutes (AISIs) founded by the United Kingdom and the United States last year. The concept of an AISI is both recent and somewhat malleable, but the core mission of both the U.S. and UK institutes is to advance the testing and evaluation of frontier AI systems for safety risks. These institutions and their peers in other countries are now in the process of forming an international network to share information on testing methodologies and emerging risks.

The Chinese government has made no official statement about creating such an institute. Throughout 2024, rumors have circulated that key figures and institutions in China’s AI ecosystem are advocating for one and that the government is seriously considering the proposal. But significant hurdles remain. China is unlikely to simply copy the U.S. and UK institutional structures. Several parts of the Chinese state are competing for leadership on AI governance, and the creation of a new entity requires significant consensus building and political wrangling. If China were to create such an institution or coordinating body for AI safety, the decisions about its mission, leadership, and bureaucratic powers would shed light on the government’s perceptions and priorities in this space.

Safety Engagement Amid Geopolitical Competition

Despite the growing salience of safety concerns, China’s leaders remain just as worried, if not more so, about falling further behind the United States in advanced AI. When discussing AI policy, scholars and policy advisers often invoke a long-standing CCP turn of phrase: “Failing to develop is the greatest threat to security.” This ensures that catching up in AI capabilities remains a top CCP priority. But the shifting attitudes toward frontier AI risks mean that work on safety will likely rise in tandem.

Regardless of what China does next, international engagement on AI safety will still face enormous hurdles. Both China and the United States have good reason to be deeply suspicious of each other’s intentions in this space. And if extraordinarily powerful—and potentially dangerous—AI systems are on the technological horizon, the competitive pressure to be the first country to build them will be immense. These dynamics will make any binding international agreements on governing frontier AI a long shot.

But even if major agreements remain out of reach, there could still be room for narrow technical engagement around the evaluation and mitigation of certain transnational safety risks. For some powerful technologies, there are safety technologies that a state would want even its geopolitical adversary to have. This logic underpinned Cold War–era agreements in which the United States shared security technologies for preventing accidental nuclear detonations with the Soviet Union. No such technology exists for AI safety today; even if it were to be invented, engagement on it would face enormous technical, strategic, and political obstacles. But a prerequisite to any such engagement is mutual recognition of a common threat, and that appears closer today than it did just a year ago.

Carnegie does not take institutional positions on public policy issues; the views represented herein are those of the author(s) and do not necessarily reflect the views of Carnegie, its staff, or its trustees.