AI is a rapidly developing industry where pragmatism and dynamism are key. An approach prioritizing early release and iteration may be the best hope to reduce risk at a satisfactory pace.
Artificial intelligence (AI) capabilities are advancing rapidly, and there are serious concerns that AI could reach a point where it poses risks to international security.1 At the same time, AI risk management is “still in its infancy.”2 This creates a dilemma for policymakers. Key AI risks are poorly understood and speculative, and premature regulation—or even premature pressure to follow voluntary safety standards—could be ill-conceived and obstruct progress. But moving too slowly could mean tolerating high levels of risk.
A partial solution to this dilemma is to invest heavily in research on the risks of AI and how to mitigate them, with the goal of achieving a mature understanding of these topics as quickly as possible. However, given the challenges of such research, reaching maturity could easily take decades.
For the best hope of moving faster on risk management, research could be complemented by another approach to developing risk management practices: early release and iteration. This approach can be seen in AI companies’ if-then commitments,3 which are often relatively vague, lack extensive justification, and are explicitly marked as early, exploratory, or preliminary.4 Commitments like these are a sort of minimum viable product. Rather than polished commitments grounded in extensive and unassailable research, they are initial attempts at risk management that a company can try, notice problems with, iterate on, and continually improve as more information and research come in.
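To make the shape of such a commitment concrete, here is a minimal illustrative sketch, expressed as a small data structure: each entry pairs a capability trigger ("if") with the mitigations a company pledges to adopt ("then"). The thresholds, evaluation names, and mitigations below are invented for illustration and are not drawn from any company's actual framework.

```python
# Hypothetical sketch of an if-then commitment encoded as data.
# All triggers, thresholds, and mitigations are invented for illustration.

IF_THEN_COMMITMENTS = [
    {
        "if": "model exceeds a (hypothetical) threshold on a bioweapons-uplift evaluation",
        "then": [
            "pause broader deployment pending internal review",
            "require hardened security for model weights",
        ],
    },
    {
        "if": "model can autonomously complete a (hypothetical) cyberattack benchmark",
        "then": [
            "restrict access to vetted customers",
            "notify an internal safety board before further scaling",
        ],
    },
]

def required_mitigations(triggered_conditions: list[str]) -> list[str]:
    """Return the mitigations owed, given which capability triggers were observed."""
    owed = []
    for commitment in IF_THEN_COMMITMENTS:
        if commitment["if"] in triggered_conditions:
            owed.extend(commitment["then"])
    return owed

# Example: an evaluation run flags the (hypothetical) cyberattack trigger.
print(required_mitigations(
    ["model can autonomously complete a (hypothetical) cyberattack benchmark"]
))
```

The point of the sketch is only that the structure is explicit and revisable: as understanding improves, a company can tighten a threshold, add a trigger, or swap in a better mitigation without starting from scratch.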
This early-release-and-iteration approach is unlike how risk management tends to look in other, more mature industries. Rather, it is more similar to how AI companies develop and deploy their products. For a fast-moving industry where pragmatism and dynamism are key, this approach may be the best hope of developing workable risk reduction practices fast enough to reduce the bulk of the risk.
For this approach to work, it will be important for its practitioners not to confuse it with traditional risk management in mature industries, nor with intensive research isolated from practice. Risk management practices that come from an early-release-and-iteration approach will frequently be under-explained and under-justified, and they will later be revised to accommodate new developments or improved understanding. Scholars and other critics will be tempted to focus their critiques on the lack of rigor, but it might be more productive to focus instead on other matters, such as how frequently companies revise their frameworks and whether they list and eventually resolve key open questions.
Policymakers, rather than choosing between imposing detailed regulations and waiting for risk management to mature, can aim to accommodate and encourage the fast development of risk management practices and the continuous revision of them.
In some industries, it is common for operators to perform regular, extensive risk assessments. One example is nuclear power; the U.S. Nuclear Regulatory Commission uses probabilistic risk assessment to put numbers on potential risks.5 Risk assessment for nuclear plants focuses on a specific, limited set of risks: those that could cause damage to the nuclear reactor core, resulting in the release of radioactivity.6
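As an illustration of how probabilistic risk assessment puts numbers on risk, the toy calculation below multiplies a hypothetical initiating-event frequency by the conditional probabilities that successive safety barriers fail, yielding an estimated core damage frequency. All figures are invented for the sake of the example; real assessments use detailed event trees and fault trees with plant-specific data, not numbers like these.

```python
# Illustrative-only sketch of the arithmetic behind a probabilistic risk
# assessment (PRA). All numbers are made up for demonstration.

initiating_event_frequency = 1e-2   # hypothetical: initiating events per reactor-year
barrier_failure_probabilities = [
    1e-2,  # hypothetical: emergency cooling system fails on demand
    1e-1,  # hypothetical: operators fail to recover in time
]

# Core damage requires the initiating event AND the failure of each barrier.
core_damage_frequency = initiating_event_frequency
for p in barrier_failure_probabilities:
    core_damage_frequency *= p

print(f"Estimated core damage frequency: {core_damage_frequency:.1e} per reactor-year")
# -> 1.0e-05 per reactor-year in this toy example
```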
Risk management in other industries tends to have a similar quality. For example, approval from the U.S. Food and Drug Administration generally requires empirical studies of a drug’s effects on predefined indicators of health, including positive efficacy and negative side effects.7
By contrast, AI risk, as it is understood today, presents both a broader and a vaguer surface area of potential risks. AI is a technology that could potentially automate anything a human mind can do, and it is advancing rapidly. AI has been the subject of a vast set of concerns, including but far from limited to the manipulation of public opinion, automation of cyber operations, invasions of privacy, proliferation of the capacity to produce and deploy biological and chemical weapons, labor market impacts because of AI’s economic competition with much of the population, amplification of bias, and “loss of control,” which refers to the possibility that AI agents could autonomously work to disempower humans.8 Discussions of these risks tend to emphasize that some of them are speculative, poorly understood, and/or the subject of vigorous disagreement among experts.9
For many of these risks, attempts at risk management face hard questions. Consider one example: the risk that AI could assist in chemical and biological weapons production.10 To assess and manage this risk, one would ideally like well-grounded answers to questions such as: What aspects of weapons production (and/or acquisition) can AI systems enhance? For what types of weapons, and for what types of actors? How much could AI systems help each type of actor of concern with each type of weapon of concern? How can one know which AI systems are capable of such enhancement? What technological measures can be used to ensure that actors of concern can neither elicit assistance with weapons production from AI nor steal AI model weights and fine-tune them for their own purposes?
It is especially hard to get reasonable answers to these questions given that the concern is about hypothetical future AI systems rather than present ones. There are no empirical examples of such AI systems to study, no case studies for such AI-assisted incidents, no statistics that can be used to straightforwardly estimate frequency, and no relevant high-assurance AI safety programs that can be studied to produce standards.
The fact that we do not have such things does not mean that concerns about the risks are unfounded. AI systems have recently been approaching human-expert-level performance on many fronts at once,11 and if they were to reach parity with top experts on chemistry and biology, they could quickly and dramatically expand the set of people able to produce weapons of mass destruction.12 There has been significant concern about these risks from policymakers.13
But getting empirically grounded, thoroughly quantified answers to the questions above may prove intractable until after AI systems that clearly pose the risks in question exist, at which point the risks may be significant.
More generally, understanding and achieving some level of consensus on the full scope of risks, and developing solid and widely used risk management practices around this understanding, could take decades. This would be in line with the history of risk management in other industries.14
Some have advocated that AI development should be delayed until (and unless) risk management becomes mature enough to provide high assurance against risk.15 Others have pointed to the immature state of risk management as a reason to delay regulation while AI development moves forward without restriction.16 Either approach might sound reasonable at first blush, but both look less appealing (and less realistic) once one keeps in mind how long the road to mature risk management practices could be.
The companies developing cutting-edge AI systems are not delaying production or release of products while they work to assemble rigorous, comprehensive analysis about their systems’ capabilities, internal workings, and revenue potential. They are, rather, building and releasing AI products with ambition and urgency.
Indeed, the culture of tech companies in general tends to prioritize an ethos of rapidly releasing products and iterating on them, rather than aiming to perfect them—an approach that means products are often limited at a given point in time but that results in fast feedback and improvement.17
The idea of prioritizing rapid iteration over up-front analysis is critical for some of the key players in AI. In addition to informing approaches to products, it has also featured prominently in some of the leading AI companies’ statements of their philosophy for navigating the risks of AI.18
Can this ethos be applied to the development of risk management, as well as to the development of AI itself?
To a significant degree, this is exactly what has been happening with if-then commitments released by major AI companies over the past year or so, although the pace of iteration could be faster, and the number of companies participating could be greater.19
For example, in early 2024, Google DeepMind released its “Frontier Safety Framework,”20 which lists AI capabilities it intends to test for and enhanced risk mitigations that could be required depending on the results of testing. In its announcement, it explicitly highlighted that the framework is preliminary and a starting point for iteration:
The Framework is exploratory and we expect it to evolve significantly as we learn from its implementation, deepen our understanding of AI risks and evaluations, and collaborate with industry, academia, and government. Even though these risks are beyond the reach of present-day models, we hope that implementing and improving the Framework will help us prepare to address them. We aim to have this initial framework fully implemented by early 2025.
The framework itself contains significant ambiguities and areas that will need further refinement over time. The “critical capability levels” it aims to test for are described at a high level and are based on what is explicitly called “preliminary analysis.” For example, it tests for “[AI systems] capable of fully automating opportunistic cyberattacks on organizations with a limited security posture.” A “future work” section of the document fully acknowledges its preliminary nature and lists a number of hopes for future versions of the framework.
Other if-then commitments have similar properties. OpenAI’s “Preparedness Framework” is marked “Beta” and described as a “living document.”21 When discussing its capabilities of concern, it states:
As mentioned, the empirical study of catastrophic risk from frontier AI models is nascent. Our current estimates of levels and thresholds for ‘medium’ through ‘critical’ risk are therefore speculative and will keep being refined as informed by future research.
Anthropic’s initial announcement of its “Responsible Scaling Policy” stated, “We want to emphasize that these commitments are our current best guess, and an early iteration that we will build on. The fast pace and many uncertainties of AI as a field imply that, unlike the relatively stable BSL system, rapid iteration and course correction will almost certainly be necessary.”22 It has since put out a revised version of its policy, noting many changes that were made to achieve more flexibility after getting experience with implementation.23
One could complain—and some have—that these if-then commitments are overly vague and lack many helpful features of risk management in more mature industries.24 But as of today, the alternative to preliminary, exploratory commitments is not rigorous, reliable commitments—it is more likely to be essentially holding off on risk management until there is a lot more clarity on the risks.
These companies are taking the same approach to risk management that they take to AI systems themselves: build something, try it out, and improve it over time. But there is room for them to do more, and to iterate faster. Early if-then commitments alluded to the need for further work and called out multiple areas for improvement, including aspirations to add oversight from independent third parties.25 But there have been few public updates or revisions to these policies as of today.26 And many more companies have not yet released if-then commitments at all.27 Calls for such if-then commitments to meet some absolute standard of rigor may be less productive than calls for consistent, publicly visible progress and iteration.
A company can put out a voluntary if-then commitment, then publish any number of revisions and refinements as it learns from feedback and implementation. It is much harder for a government to take an approach of putting out regulations early and revising them over time. Every change to legislation presents a new political battle and new set of compromises and complexities, perhaps with a changed balance of power among coalitions since the last time a relevant bill was passed. Assigning an agency to make and revise regulations is itself a hard-to-reverse action, giving a particular set of people discretion and powers that could require a political battle to remove.
Still, it is worth considering how policymakers can balance urgency and uncertainty when it comes to AI regulation. Some options include:
None of these approaches is foolproof, but done well, they could help push forward the development of both private risk management practices and the state capacity to eventually enforce them, while avoiding getting stuck with requirements based on immature ideas about risk management.
ChatGPT may have set the record for the fastest-growing user base of all time.28 Indeed, a defining feature of today’s progress in AI is how fast it has been—a source of both excitement and concern about the technology.
If AI continues to progress with unprecedented speed, AI risk management ideally will too. Making that happen could require a messy, iterative process, with if-then commitments and/or unpolished regulations that are not initially grounded in thorough, rigorous research (and require many revisions). Implementing imperfect risk management practices could indeed be the fastest way to gather data and get to the point where thorough, rigorous research is possible.
Concerns that “the science surrounding AI safety is still in its infancy” are valid.29 But if these concerns lead to holding off on any risk management practices until the underlying science is settled, they could mean that the science remains in its infancy too long. Pushing forward the maturation of AI risk management should be treated as an urgent priority—on par with pushing forward the development of AI itself.
The author is married to the president of Anthropic, an AI company, and has financial exposure to both Anthropic and OpenAI via his spouse.
This piece has benefited from a large number of discussions over the years, particularly with people from METR, the UK AI Safety Institute, Open Philanthropy, Google DeepMind, OpenAI, and Anthropic. For this piece in particular, I’d like to thank Chris Painter and Luca Righetti for comments on a draft.
Carnegie does not take institutional positions on public policy issues; the views represented herein are those of the author(s) and do not necessarily reflect the views of Carnegie, its staff, or its trustees.