The UK AI Safety Summit Opened a New Chapter in AI Diplomacy

Driven to action by the rapid advancements in AI, summit delegates began to map the long road to balancing risk management with innovation in machine learning.

Published on November 9, 2023

No, the British didn’t come close to solving every policy problem involving artificial intelligence (AI) during the UK AI Safety Summit last week. But as delegates from all over the world gathered outside London to discuss the policy implications of major advances in machine learning and AI, UK officials engineered a major diplomatic breakthrough, setting the world on a path to reduce the risks and secure greater benefits of this fast-evolving technology.

Hosted by Prime Minister Rishi Sunak, the summit beat the odds on several fronts. UK leaders gathered senior government officials, executives of major AI companies, and civil society leaders in a first-of-its-kind meeting to lay the foundations for an international AI safety regime. The result was a joint commitment by twenty-eight governments and leading AI companies to subject advanced AI models to a battery of safety tests before release, as well as the announcement of a new UK-based AI Safety Institute and a major push to support regular, scientist-led assessments of AI capabilities and safety risks.

The discussion also began to map the long and winding road ahead. Neither technical breakthroughs nor summit agreements will be enough to achieve a sensible balance between risk management and innovation. Crafty diplomacy and pragmatic design of institutional arrangements (such as the international aviation safety process) are also necessary to take on global challenges. Getting either of these in sufficient quantity is a daunting prospect, particularly when both are in short supply and major crises in Ukraine and the Middle East are raging.

Despite the hallway conversations about these geopolitical problems, summit delegates were driven to action by a shared recognition that the most advanced AI systems are improving at startling speeds. The amount of computing power used in training AI systems has expanded over the past decade by a factor of 55 million. The next generation of so-called frontier models, using perhaps ten times as much compute for training as OpenAI’s GPT-4, could pose new risks for society unless suitable safeguards and policy responses are put in place quickly. (These models could be available as early as next year.) Even the current generation of AI systems—with guardrails that can all too often be thwarted—appears capable of assisting malicious actors in producing disinformation and designing dangerous code more effectively. A well-regarded private sector delegate with knowledge of what is happening at the forefront of AI development suggested that between 2025 and 2030, emerging systems could pose a risk of rogue behaviors that may be difficult for humans to control.

Given these risks, the summit’s progress was nothing less than a major diplomatic achievement. The UK enticed not only the EU and the United States but also China and major developing countries—including Brazil, India, and Indonesia—to sign the joint commitment on predeployment testing. The UK and the United States each announced the creation of an AI Safety Institute, the first two in an envisioned global network of centers. Even more importantly, the summit generated support for an international panel of scientists assembled under AI luminary Yoshua Bengio that will produce a report on AI safety. This panel can be the first step toward a permanent organization dedicated to equipping the international community with scientific assessments of current and projected capabilities of advanced AI models.

The summit also spurred other jurisdictions toward faster and potentially more comprehensive action. In the days before the summit, the White House issued a thorough executive order that included a requirement that certain companies disclose training runs (as recommended in a recent Carnegie piece), as well as testing information, to the government for advanced AI models that could threaten national security. The Frontier Model Forum—created by Anthropic, Google, Microsoft, and OpenAI to share AI safety information and enhance best practices—named its first director. The G7—working under the auspices of the Japan-led Hiroshima Process—released a draft code of conduct to guide the behavior of organizations developing and deploying advanced AI systems. The United Nations appointed an international panel of experts to advise the secretary-general on AI governance.

As policymakers now discuss how best to weave together these efforts, the relationships forged and trust built between actors leading up to and during the UK summit are arguably just as consequential as the commitments unveiled. Ministers for digital policy—many of them the first in their countries to occupy such positions—mingled with diplomats, entrepreneurs, and private sector leaders such as Elon Musk, as well as research talent and representatives from civil society. Many were meeting for the first time. South Korea and France deserve credit for agreeing to host the next two summits, which will be critical to strengthening these emerging ties and spurring further progress on discrete policy questions. Such questions will include how to gauge increases in AI model capabilities, as well as institutional design problems affecting the world’s capacity to spread access to frontier-level AI technology without increasing risks of misuse.

The delegates’ debates over these questions also revealed much about the novel rhythms and complexities of twenty-first-century tech diplomacy—including the essential role institutions such as the Carnegie Endowment can play to broker diplomatic breakthroughs where connective tissue is otherwise lacking. Behind the scenes, Carnegie staff worked with the UK to support elements of the summit and discern critical issues. We were on the forefront of envisioning and making the case for an international panel of experts to validate technical knowledge, create greater scientific consensus, and engage countries from every part of the world. We helped sketch out the possibility of an AI institute and advised on how to maximize its chances of success. And we made the case for an international commitment that sophisticated AI models be tested before they are released.

Plenty of technical and standard-setting work remains to secure a pathway for humanity to maximize the benefits of frontier AI technology. Challenges include creating the “tripwires” that would subject certain models to heightened scrutiny and constraints, as well as developing AI safety research that more thoroughly incorporates the complexities of human interactions with AI systems. Another task is making sense of how frontier AI technology will behave when it is eventually incorporated into billions of automated problem-solving software “agents” interacting with one another as they work to fulfill human requests.

Advancing a robust agenda to deal with these issues requires a mix of nuance, coalition-building, and institutional design. Despite relative consensus among delegates on a range of issues, such as the need for careful attention to the proliferation risks of lethal autonomous weapons, the AI safety community encompasses divergent views on matters such as how to handle forthcoming sophisticated, open-source models that can conceivably raise disinformation or national security challenges. While most of the community recognized serious risks from completely open-source models, a few puritans steadfastly preached open-source orthodoxy. More accessible models come with a greater chance of misuse but could also help to prevent the concentration of economic power in a handful of companies.

Delegates also disagreed about how expansive in scope the policy agenda should be. Some urged participants not to lose sight of observable challenges like risks of bias, disinformation, and the potential for labor market disruptions in the pursuit of managing catastrophic risks that may appear more abstract. Others focused attention on ensuring that the public in both developing and wealthier countries benefits fully from the promise of AI and technology transfer, on avoiding discrimination, and on exploring ways that AI can benefit participatory governance and development. Few participants denied the importance of these issues, but debates about how to address them in both international and domestic policymaking settings were abundant.

The most daunting questions took center stage during the final, closed session with Sunak, U.S. Vice President Kamala Harris, European Commission President Ursula von der Leyen, Italian Prime Minister Giorgia Meloni, CEOs of frontier labs and major tech companies, and select civil society groups, including Carnegie. These dilemmas included how to best define thresholds of capability or model complexity that make AI systems dangerous; how to best engage the full range of countries around the world, including China, in productive AI policy discussions; how to incorporate human values into AI systems when people and cultures disagree so vigorously about their ideals; and how to “trust but verify” that reasonable behavior ensues from countries agreeing to collaborate in enhancing AI safety. Looming in the background was a broader query: how frontier AI technology may, like the internet once did, upend assumptions about the coalitions and ideas that will drive political, economic, and social change in the next decade.

These are challenges that Carnegie will continue exploring in its own AI-focused endeavors: how to balance the merits of freely shared, open-source AI models with effective policies limiting proliferation risks; how to leverage existing laws subjecting AI systems to civil liability without unnecessarily stifling innovation; and how democracy can benefit from AI while diminishing risks of misinformation. Also on the agenda is how to engage governments from the developing world—representing billions of people seeking to join the global middle class, whose livelihoods will likely depend on their relationship to these models—in the sometimes cacophonous conversation about the potential of AI systems to upend assumptions and deliver new possibilities.

The summit itself delivered new possibilities in light of the new institutes announced, the testing agreement, and the scientific report-writing process. But there is a subtle irony in the choice of Bletchley Park as the location—a setting associated with an enormous technical breakthrough that served the cause of peace. At Bletchley Park, Alan Turing and his colleagues used early computing power to crack the Nazis’ Enigma code, by some accounts shortening World War II by months or years. In the years following the war, he turned to exploring what it meant for machines to be “intelligent.” The world in which he explored those questions faced difficult challenges of institutional design and diplomacy. Policymakers scrambled to keep the peace by creating institutions such as the United Nations and NATO and to grow prosperity—however imperfectly—through the Bretton Woods system and the creation of specialized agencies like the International Civil Aviation Organization.

The leaders attending the UK summit now face similar questions in a new era of geopolitical changes and technological breakthroughs. As these leaders sketch the next few chapters of global AI policy, they would do well to remember that the well-being of a planet brimming with ever-more-powerful AI systems depends as much as ever on perceptive questions, savvy diplomacy, and deftly crafted institutions.