
Apart from the well-rehearsed scenario of a freeze on plutonium production at North Korea’s Yongbyon facility, any prospective agreement that freezes or otherwise proscribes activity in other parts of North Korea’s nuclear weapons complex would involve far more complicated issues of monitoring and verification. Especially if an agreement were to get into issues on missile forces or weapons-related R&D—both of which would be necessary to ultimately reach denuclearization—novel and innovative approaches would be needed for monitoring, assessing, and ultimately verifying North Korea’s compliance.

Probabilistic verification offers a compelling alternative. It is a framework for verifying complex nuclear agreements in conditions where access and confidence are limited—as is likely to be the case in any near-term scenario with North Korea. If an agreement with Pyongyang is pursued, policymakers and negotiators should keep the virtues of probabilistic verification in mind as they consider how to approach verification and monitoring.

The Traditional Approach to Verification

Verification is the process of deciding whether to believe that a party is meeting its commitments under an agreement. This process is often informed by monitoring systems, including sensors that detect anomalous or noncompliant behavior. For example, a sensor that measures the enrichment levels of uranium hexafluoride gas could serve as part of a monitoring system for verifying an agreement designed to prevent a country from producing highly enriched uranium.

Thomas MacDonald
Thomas MacDonald is a fellow in the Nuclear Policy Program at the Carnegie Endowment for International Peace.

Monitoring and verification have historically been approached by focusing narrowly on monitoring a small handful of activities where a breach of the agreement could be easily detected with high confidence. For example, under the Joint Comprehensive Plan of Action with Iran, uranium enrichment at the country’s Natanz facility was monitored with an online enrichment monitoring system that could detect if Iran were enriching above acceptable limits in real time. This is an effective approach to verification that has been successfully applied to arms control agreements over the years. However, it suffers from key problems in the context of North Korea, owing to the breadth and complexity of the country’s nuclear program, limited outside knowledge of the program’s elements, potentially limited onsite access for inspectors, and the lack of trust between the negotiating parties.

This narrow approach can be applied only when the activities being limited can be monitored with high confidence. Limits on deployed U.S. and Russian nuclear warheads and delivery systems under the 2010 New Strategic Arms Reduction Treaty (New START), for example, are straightforward to monitor with a combination of overhead imagery and onsite inspections.

Some activities, however, do not lend themselves to easy monitoring. Difficulties may arise when limits are placed on small objects that are easy to hide (such as nondeployed nuclear warheads), where there is ambiguity about an object’s purpose (as with dual-use technologies), or because the activity is inherently low in visibility (in cases of certain weaponization activities, such as hydrodynamic calculations). Achieving high confidence in monitoring under these conditions, if even possible, may require a level of intrusiveness that may not be politically acceptable.

This narrow approach can also conflate verification and monitoring. While it may be attractive to interpret a detection of a breach by a monitoring system as definitive proof of a breach in the relevant agreement, compliance assessments are ultimately political judgment calls. All monitoring systems are imperfect—and missed detections and false alarms are inevitable—so no individual piece of data is above reproach. False alarms can be politically damaging to an agreement as they undermine trust on both sides. Taking a reductive view of monitoring and verification—namely, that something is only verifiable if it can be monitored with high confidence—ignores the fact that human judgment is a necessary component in any compliance assessment.

These challenges are particularly acute in the case of North Korea. Depending on the precise scope of an agreement with Pyongyang, there may be a wide array of items and activities that would need to be monitored (including fissile material, warheads, and ballistic missiles), each of which would pose a different degree of monitoring difficulty. North Korean concerns over security and over the intrusiveness of verification and monitoring efforts will only complicate matters further. North Korean officials will likely object to the large numbers of inspectors on the ground or the highly intrusive monitoring methods that may be required to achieve monitoring with a high degree of confidence.

The Alternative of Probabilistic Verification

Probabilistic verification embraces the fact that well-informed expert judgment is an integral part of a verification regime. Rather than only considering activities that can be monitored with a high degree of confidence and making compliance decisions only based on those monitoring systems, probabilistic verification seeks to assess compliance with the whole of an agreement by considering and assimilating all sources of information, even those that may be of a low or intermediate degree of confidence. Assessing all available information builds context for compliance decisions, creating the flexibility to verify complex agreements.

The narrow approach to verification has a strict requirement for high confidence monitoring of all activities that are covered by an agreement. If an activity cannot be individually monitored with sufficient confidence, it may be excluded from an agreement or the agreement may be contorted to fit the capabilities of the requisite monitoring systems. For example, the 1987 Intermediate-Range Nuclear Forces Treaty was negotiated to blunt the proliferation of intermediate-range, ground-launched nuclear missiles. However, monitoring systems were not capable of distinguishing nuclear and non-nuclear weapons of the same range capability without a politically unacceptable level of intrusiveness. As such, the agreement banned both nuclear and non-nuclear intermediate-range weapons and ultimately fell apart some years later after Russia developed and fielded an intermediate-range, ground-launched cruise missile. Building agreements around verification capabilities can undermine the effectiveness and durability of the agreements themselves.

Instead, agreements should be designed first and foremost with political objectives in mind. If it is valuable to proscribe an activity in an agreement, it should be proscribed even if a country’s compliance on that individual activity cannot be monitored with high confidence. Verification should be assessed at the broader level of the entire agreement. So long as limits are set in such a way that gaining a meaningful advantage by cheating requires evading multiple monitoring systems, then one’s overall confidence that cheating would be detected will exceed one’s confidence in any individual monitoring system.

For example, if, as part of a denuclearization agreement, limits or prohibitions were placed on North Korean warheads, ballistic missiles, and transporter erector launchers (TELs), then reconstituting or increasing the country’s number of nuclear-armed TEL-transported missiles would require cheating on each of these limits. Gaining a meaningful advantage by cheating without being detected would require evading the monitoring of all three activities in concert, an inherently more difficult prospect than evading detection on any single activity (see figure 1).

Assume hypothetically, for instance, that North Korea perceived its chance of cheating without getting caught to be 50 percent on each activity, a coin flip. If that were the case, and if evading one monitoring system did not make evading another any easier, then Pyongyang’s perceived chance of getting away with cheating on all three activities would be the same as that of winning three consecutive coin flips (0.5 × 0.5 × 0.5, or only 12.5 percent). Even if certain activities can only be monitored with low confidence, such an approach still helps increase the overall probability of detecting cheating. Further, by increasing both the probability of detection and the complexity of cheating, this approach can deter cheating more strongly in the first place.
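The coin-flip arithmetic can be sketched in a few lines of Python. This is only an illustration, not a model proposed in the article: it assumes that the chance of evading each monitoring system is independent of the others, and the function names and the layer labels are hypothetical choices made for this example.

```python
from math import prod


def evasion_probability(per_layer_evasion):
    """Chance of evading every monitoring layer at once.

    Assumption (not from the article): the layers are independent,
    so the individual evasion chances simply multiply.
    """
    return prod(per_layer_evasion)


def detection_probability(per_layer_evasion):
    """Chance that at least one layer detects the cheating."""
    return 1.0 - evasion_probability(per_layer_evasion)


# Hypothetical 50 percent evasion chance on each of the three limits.
layers = {"warheads": 0.5, "ballistic missiles": 0.5, "TELs": 0.5}

p_evade = evasion_probability(layers.values())
p_detect = detection_probability(layers.values())

print(f"Evade all three layers: {p_evade:.1%}")        # 12.5%
print(f"Detected by at least one layer: {p_detect:.1%}")  # 87.5%
```

If the layers were correlated (for example, if one concealment method defeated two monitoring systems at once), the combined evasion chance would be higher than this product suggests, so the figure is best read as an optimistic bound under the stated assumption.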

Probabilistic verification is a layered, or defense-in-depth, approach to verification. A real-world analogy of such a concept in practice can be found in airport security. Airport security is ensured by layering multiple security measures. The most obvious layers of security at an airport are at the security checkpoint. A passenger’s identity is confirmed with photo identification, their bags are x-rayed, and the passenger passes through a metal detector or millimeter-wave scanner. This layer can be rather porous or easily dismissed as security theater. Millimeter-wave scanners have been known to struggle to detect handguns hidden in a particular way, for example.

However, there are multiple other layers of security, many of which may not be outwardly obvious. When tickets are purchased, they are linked to a credit card and identity that can be checked against no-fly lists. Within the terminal itself, behavior can be monitored either by patrolling police or closed-circuit television cameras. Additionally, there may be drug- and explosive-sniffing dogs present. Each of these layers may be prone to failure by itself, but collectively they can make malfeasance difficult enough that travelers and airport operators alike feel they have a sufficient level of security. This system can also create enough doubt in the minds of would-be perpetrators to have a significant deterrent effect, which strengthens the robustness of the system as a whole.

Probabilistic verification recognizes that monitoring and verification are distinct. While monitoring systems provide information, verification requires a compliance assessment. In such an assessment, all information available from monitoring systems, along with contextual information, is combined with expert analysis and judgment to reach a decision on whether a given agreement as a whole is being complied with. This process would mirror conventional intelligence assessments, for which there is much experience and expertise to draw on. Such assessments are implicitly part of any verification process, but probabilistic verification makes this element explicit to better clarify the assumptions underpinning monitoring and verification.


Probabilistic verification provides a framework for approaching the verification and monitoring of complex and difficult agreements. The traditional approach to verification would be hindered in the case of North Korea due to both the variety of items that would need to be included and the potential for limited confidence in the requisite monitoring systems. By contrast, probabilistic verification can handle—and make use of—monitoring systems that can only achieve low levels of confidence. This does not mean that high confidence monitoring should not be sought wherever possible or that an overall low degree of confidence in the ability to detect a breach of the agreement should be accepted. It only means that probabilistic verification has the flexibility to allow political goals—not monitoring systems—to shape a future agreement with North Korea.