
New Paradigms in Trust and Safety: Navigating Defederation on Decentralized Social Media Platforms

Defederation on decentralized social media offers new possibilities for online governance. Experts consider how it can be used responsibly, in a way that balances speech and safety.

by Samantha Lai, Yoel Roth, Renée DiResta, Kate Klonick, Mallory Knodel, Evan Prodromou, and Aaron Rodericks
Published on March 25, 2025

A Tale of Two Cities

Imagine two cities on opposite sides of a river, connected by a bridge but governed by different laws, values, and norms. Now imagine that the government of either city can, at any moment, unilaterally blow up the bridge and cut off all engagement between the two cities' residents. How should a government acting in the best interests of its citizens wield this immense power? Are there circumstances under which blowing up the bridge would be acceptable, such as a perceived threat of violence from the opposing city? How should the government weigh the risks and benefits of doing so?

This tale of two cities represents a physical approximation of a governance challenge inherent in a new breed of decentralized social technologies. On decentralized social media, the usual ways of imagining internet governance are entirely broken down and reconfigured. While centralized social media platforms are typically moderated by a single central authority (almost invariably the same company responsible for developing and operating the service), decentralized social media is not owned or run by any single organization. Instead, it consists of an ensemble of over 18,000 federated servers that are independently hosted and operated but can still communicate with one another, as long as they use the same communication protocol.1 

The essence of federation is interoperability: the ability of a server operated by one entity to seamlessly communicate with a server operated by another, allowing end users to view and interact with remotely hosted content. However, this does not mean that every instance of a federated platform can interoperate with every other: those that engage in harmful behavior can be cut off. On decentralized social media, federation comes with the option of defederation, a moderation function that allows a server to cut off communication with another server entirely. Defederation functions as a two-way block: if Server A defederates from Server B, no user on Server B can see or interact with content on Server A, and vice versa. Typically, it is used to manage larger-scale online harms, like spam and coordinated manipulation campaigns. Smaller-scale harms can be addressed through measures against specific actors (through blocks, for example) or content (through labels, visibility restrictions, or deletions).
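To make the mechanics concrete, the following is a minimal sketch of defederation modeled as a symmetric block between servers. It is an illustrative model only, not the implementation used by any real fediverse software, and the server names are hypothetical.

```python
# Illustrative model: defederation as a two-way block between servers.
class FederationGraph:
    def __init__(self):
        # Unordered pairs of servers that have defederated from each other.
        self.blocks: set[frozenset[str]] = set()

    def defederate(self, server_a: str, server_b: str) -> None:
        """Server A cuts off Server B; the block applies in both directions."""
        self.blocks.add(frozenset((server_a, server_b)))

    def can_deliver(self, origin: str, destination: str) -> bool:
        """Content federates only if neither side has defederated from the other."""
        return frozenset((origin, destination)) not in self.blocks


graph = FederationGraph()
graph.defederate("spam.example", "mastodon.example")
assert not graph.can_deliver("spam.example", "mastodon.example")   # blocked
assert not graph.can_deliver("mastodon.example", "spam.example")   # and vice versa
assert graph.can_deliver("mastodon.example", "hachyderm.example")  # unaffected
```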

Defederation functions both as a decision individual servers can make for themselves and as a form of collective action. In July 2019, the social network Gab joined Mastodon, the biggest decentralized social media platform at the time. A majority of existing Mastodon servers quickly defederated from Gab, citing opposition to the platform’s philosophy, “which uses the pretense of free speech absolutism as an excuse to platform racist and otherwise dehumanizing content,” as described by Mastodon’s founder Eugen Rochko.2 While some Gab users expressed initial excitement about spreading their content across Mastodon, mass defederation left them unable to communicate with users on most servers.3 Thoroughly isolated, Gab defederated from Mastodon months later. Defederation was used by members of the Mastodon network to attempt to protect the service from a perceived threat—in this case, successfully.

More recently, following Meta’s announcement that it intended to integrate the Threads microblogging product with the fediverse (a portmanteau of the words federation and universe) through ActivityPub, over 800 servers joined the Fedipact, preemptively promising to defederate from Threads once it joined the fediverse.4 With this pact, signatories sought to express their disapproval of Meta’s presence in the fediverse and to declare that Meta’s approaches to moderation and privacy did not align with their own.

The past two years have seen an influx of large actors joining the decentralized social media arena. Bluesky, which first became publicly available in February 2024, now has more than 33 million users on its platform.5 And with Meta’s integration of Threads into the ActivityPub protocol, the 300 million users of Threads can now opt in to sharing their content with the roughly 14 million users on the fediverse.6 As decentralized social media platforms grow in scale, so too do the risks of online harms and the challenges of moderation. How should defederation decisions be made to balance safety, community accountability, and free access to speech?

In June 2024, the Carnegie Endowment for International Peace conducted a workshop with eighteen experts to explore governance challenges to defederation. Through this workshop, experts analyzed the possibilities afforded by decentralized social media and how defederation as a mechanism shifts existing frameworks of governance. In addition, they laid out practical considerations behind how these decisions are made, analyzed the complex challenges administrators and moderators currently face, and provided recommendations on what commercial servers such as Meta could do to support defederation decisions within the ecosystem.

Creating a More Open Internet

The internet was built on principles of openness, promising a digital environment where information would flow freely and users would have the autonomy to communicate without boundaries.7 In practice, things have played out differently: the emergence of Web 2.0 platforms left the internet, as most people experience it, walled off within distinct platforms. Over the years, concerns around issues such as data protection, algorithmic bias, the concentration of economic power, and the lack of transparency on centralized social media platforms have sparked discussions over how existing models of internet governance could be improved upon.8 This has, in turn, created a new opportunity—a recognition that a return to the internet’s original openness could offer significant public benefits.

A key rallying cry for proponents of a more open internet has been, in the words of Mike Masnick, to build protocols rather than platforms.9 Advocates argue against having individual proprietary platforms walled off from each other. Instead, platforms could be built or connected through open communication standards and instructions that others can use to build compatible, interoperable systems. This approach is not dissimilar to the open nature of the early internet. Email, for example, allowed different servers to seamlessly exchange messages through the Simple Mail Transfer Protocol, forming a network that transcends an individual company or service. 

Decentralized social media is home to an ensemble of federated servers that provide a range of functions, including Hachyderm.io, which runs on the Mastodon microblogging software; Flipboard, a social news service; WeDistribute.org, a WordPress blog; and podcasts.cosocial.ca, a social audio service that runs on the Castopod software. Servers are independently hosted and operated but can still communicate with one another, so long as they use the same communication protocol.10 The fediverse refers to services that use the W3C’s ActivityPub standard, including software such as Mastodon, Threads, and PeerTube, among others. Other communication protocols include AT Protocol (on which Bluesky operates), Nostr, and Farcaster.11
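What servers actually exchange over a protocol like ActivityPub are structured activities built on the ActivityStreams vocabulary. The snippet below shows a simplified example of such an activity expressed as a Python dictionary; the identifiers are hypothetical, and real activities carry additional fields such as IDs, timestamps, and signatures.

```python
# Simplified, hypothetical example of an ActivityStreams "Create" activity
# wrapping a short "Note" post, the kind of payload federated servers exchange.
create_note = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Create",
    "actor": "https://hachyderm.example/users/alice",
    "object": {
        "type": "Note",
        "content": "Hello, fediverse!",
        "to": ["https://www.w3.org/ns/activitystreams#Public"],
    },
}
```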

Decentralized social media provides a platform to experiment with more democratic forms of online governance. In Governable Spaces, Nathan Schneider describes the implicitly feudalistic nature of social media platforms, where content moderation occurs on a top-down basis from a centralized source. Decentralized social media, conversely, creates the potential for metagovernance, making it easier for people to create and curate their own online communities and to “root out feudal practices and remake them as commons.”12

Proponents of decentralized social media point to how these technologies increase users’ choice and control over their own experience. Individual users have the freedom to choose across a range of approaches to governance and moderation. On Mastodon, for example, the network includes around 9,000 servers, virtually all of which are run by volunteer administrators and moderators who develop and enforce their own rules.13 These rules may be more or less permissive, based on the preferences of the community. Users can select from among these options when setting up an account—inheriting that server’s governance choices as a result. On Bluesky, users can further customize the content and accounts they see with composable moderation, which allows users to create their own moderation labels or subscribe to moderation labels created by other users.14
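A conceptual sketch of composable moderation follows: labels attached to a post matter only if they come from labelers the user subscribes to, and the user's own preferences decide what happens. This models the idea only; it is not Bluesky's actual API or data schema, and all names are invented for illustration.

```python
# Conceptual model of composable moderation, not a real platform API.
from typing import Literal

Action = Literal["show", "warn", "hide"]

def resolve_action(
    post_labels: set[str],               # labels applied to a post, e.g. {"spam"}
    subscriptions: dict[str, set[str]],  # labeler -> label values that labeler emits
    preferences: dict[str, Action],      # label value -> what this user wants done
) -> Action:
    """Return the strictest action any subscribed label triggers for this user."""
    severity = {"show": 0, "warn": 1, "hide": 2}
    subscribed = set().union(*subscriptions.values()) if subscriptions else set()
    action: Action = "show"
    for label in post_labels & subscribed:
        candidate = preferences.get(label, "show")
        if severity[candidate] > severity[action]:
            action = candidate
    return action

# One user hides posts labeled "spam" by a labeler they trust; another user with
# different preferences or subscriptions would see the same post differently.
print(resolve_action({"spam"}, {"did:example:labeler": {"spam", "gore"}}, {"spam": "hide"}))
```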

User choice also comes in the form of account portability: people are not locked in to maintaining a presence on one specific server, and in some cases, they can bring their posts or followers with them if they move to another server. For example, on ActivityPub, someone who is unhappy with a particular server because the moderation decisions are too permissive or too restrictive can move their social graphs to another server. With the new LOLA standard, they may also be able to bring their content with them.15
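In ActivityPub-based software such as Mastodon, a migration is signaled with a "Move" activity so that followers can be carried over to the new account. The example below shows roughly that shape, with simplified, hypothetical identifiers; real activities include additional metadata, and content portability via LOLA is still emerging.

```python
# Roughly the shape of an ActivityPub "Move" activity used for account migration.
# Hypothetical identifiers; simplified for illustration.
move_activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Move",
    "actor": "https://old.example/users/alice",    # the account that is moving
    "object": "https://old.example/users/alice",   # the old account being moved
    "target": "https://new.example/users/alice",   # the new account on another server
}
```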

Trust and Safety in Decentralized Systems

Although users are more empowered to choose from among a wider variety of options, content moderation operates under a different set of constraints in decentralized spaces. Trust and safety in these spaces can be challenging given platform design and resourcing limitations. Past research has shown that decentralized social media platforms face significant constraints in addressing online harms such as influence operations, spam, and child sexual abuse material.16 Many servers have basic trust and safety capabilities, such as user blocks, post visibility restrictions, and takedowns. However, few tools are available for volunteer-run servers to address harmful content in bulk; moderators must screen and remove content post by post. Enforcing an instance’s policies can therefore place a significant burden on the time and resources of moderators and administrators, who may be easily overwhelmed by large-scale spam and harassment campaigns.

Behavioral threats, such as spam and coordinated manipulation campaigns, are hard to detect because of the architecture of the decentralized space. Harassment, which is by nature already hard to define, becomes even more difficult to detect and consistently enforce against. Information that is available to centralized social media platforms—like email addresses, phone numbers, IP addresses, and click paths—is unavailable to decentralized networks. Administrators and moderators also lack visibility over the larger network and only have access to user and activity logs specific to their own instances. As a result, threats spread out across multiple instances become harder to spot. The porous nature of decentralized social media makes these security risks a collective concern, regardless of server size.

The Decision to Defederate

Defederation is a novel method for conducting moderation at scale. The governance principles behind its use are unique in two ways. First, defederation gives administrators and moderators the ability to precisely define the boundaries of networks their users have access to. This is different from centralized social networks, all of which have predetermined boundaries based on who uses a given service. A moderator of a Facebook group, for example, only has control over what a user posts within the group. On decentralized services, however, a moderator becomes responsible for determining what content their larger community does and does not receive, expanding both the power and responsibility they have over setting these boundaries.

Some servers have found interesting ways to experiment with boundary-setting. An example of this is allowlist federation, which inverts the paradigm of denylist defederation common in the fediverse. Instead of deciding who they won’t federate with and allowing all others, administrators and moderators develop lists of servers they would federate with and deny all others. These servers federate with a much smaller number of other servers, but may be well-suited for users who value privacy and closed communities.
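The difference between the two policies comes down to the default for unknown servers. The sketch below contrasts them in minimal form; it is illustrative only, and real server software implements this with richer severity levels and per-domain settings.

```python
# Illustrative contrast between denylist and allowlist federation policies.

def denylist_allows(peer: str, denylist: set[str]) -> bool:
    # Default-open: federate with everyone except servers explicitly blocked.
    return peer not in denylist

def allowlist_allows(peer: str, allowlist: set[str]) -> bool:
    # Default-closed: federate only with servers explicitly approved.
    return peer in allowlist

print(denylist_allows("new-server.example", {"spam.example"}))      # True: unknown servers get in
print(allowlist_allows("new-server.example", {"friends.example"}))  # False: unknown servers stay out
```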

Another unique aspect of defederation is that it reframes existing models of accountability, making servers responsible for the overall conduct of their members.17 On centralized platforms, users are accountable only for their own actions, and moderation of those actions typically does not impact others in the network. On decentralized services, meanwhile, users are incentivized not just to mind their own conduct, but the conduct of those within the community they affiliate with.

When is defederation commonly used? Participants from the workshop in June 2024 agreed that defederation is best for handling spam originating from plainly malicious servers. It is also useful for small, volunteer-run servers to reduce the burden of moderation, as one-off measures such as blocks or visibility restrictions require significant time and resource investment. Defederation has also been used as a way to escalate personal disagreements, such as when one administrator argues that another is not trustworthy enough to make good moderation decisions because of their own personal faults.

Balancing Speech and Safety in Defederation

Now, let’s go back to the story of the bridge connecting two cities. This bridge may be valuable for people in these cities who regularly travel between them to work or trade or visit their families. The decision for City A to blow up the bridge might seem easy if everyone in City B regularly launched hostile attacks against City A. But what if half the residents of City B are hostile toward City A, while the other half are peaceful, law-abiding citizens? What if the government of City B is keen to maintain diplomatic relations with City A, but lacks the ability to curb the bad conduct of a portion of its citizenry?

Like many other moderation decisions, defederation introduces trade-offs between speech and safety. Defederating from another server lets administrators and moderators protect the safety of their users at potentially greater scale and effectiveness than moderating individual accounts and posts would allow. But it also means that users on defederated servers lose access to entire networks of users, some of whom may not be responsible for, or even aware of, the misconduct that led to defederation. This issue becomes particularly pertinent as commercial servers enter the space. When a small server defederates from another small server, communication stops between the two with relatively few consequences; the networks are small, and so is the impact. When a large server such as Threads or Bluesky enters the federated ecosystem, however, defederation decisions become far more consequential, as actions taken by a large server could affect access for millions of accounts on both sides. At the most extreme, defederation from a large server could threaten the viability of smaller servers: for those that want and benefit from federation with a large server, it could cut their users off from a majority of the accounts they interact with.

How should a larger server decide whether to defederate from a smaller one? Take, for example, a scenario where Server A, with 20 million users, encounters harmful content from Server B, with 5,000 users. Administrators and moderators on Server B do not outright endorse this harmful content, and it remains unclear whether the content exists because they want it to or because they lack the moderation capacity to address it. In a scenario like this, if it wishes to support a multipolar, federated network, Server A could make the trade-off of spending more time and resources to moderate the harmful content rather than defederating from Server B outright. Given that commercial servers have relatively more moderation capacity than volunteer-run servers, they should be sparing in their use of defederation as a strategy to protect moderation bandwidth, as this could cause disproportionate harm to users on volunteer-run servers.

Even when defederation is employed as a moderation strategy, participants highlighted the practical challenges associated with scaling its use—principally, the lack of information-sharing across servers. Administrators and moderators, particularly those who volunteer their own time and have limited bandwidth, typically must make decisions ad hoc and based on personal judgment, with little time to revisit those decisions after the fact. They often lack the resources to take more nuanced action. In an ideal world, for example, administrators could tell whether a server was the originator or the target of a spam wave, and defederate from the originator but not the target. In practice, however, some targeted servers may be erroneously defederated from and unable to rejoin the network even after the problem has been resolved.

Other than Threads, most servers do not have a process for other servers to appeal if they feel that they have been wrongly defederated from.18 The lack of appeal mechanisms or external auditing means that defederated servers have few means for recourse, or opportunities to refederate with larger networks if they make improvements in their moderation practices. In addition, once a server has been placed on a defederation list, there are few formal means for them to question that decision. Appeals are a major reporting requirement under the European Union’s Digital Services Act, yet these kinds of network-level decisions are not well documented on decentralized networks.19

External resources that guide moderators’ decisions to defederate remain limited. On Mastodon, users share information in posts using the “#fediblock” hashtag and compile composite directories of denylists.20 However, the hashtag is also at times misused for personal disputes, or by harassers to cast doubt on those who criticize them. At times, servers have used “#fediblock” to plead an appeals case in the court of public opinion. Given the many motivations behind the use of this hashtag, it can be difficult for a third party to audit why a server ended up on a list. Additionally, these lists typically are not automatically ingestible, placing the burden on moderators to defederate server by server. Threads provides a publicly accessible list of blocked servers, which includes information on whether a server was blocked because it violated Threads’ privacy policy or community guidelines, or because it did not have a publicly accessible feed.21 However, these categories remain general and provide little insight to someone who may agree with only specific parts of Meta’s policies.

Strengthening the Future of User Choice

The fundamental premise of decentralization is that no single entity is responsible for trust and safety on behalf of the entire network. However, large instances such as Threads can help improve collective security by filling existing information gaps, providing publicly accessible, easily ingestible information about servers they have defederated from and the reasoning behind those decisions.

This does not have to mean the recreation of the centralized order in the decentralized space. Instead, defederation lists can be a resource for other administrators and moderators to adopt as they see fit, so they can reserve their bandwidth for more nuanced moderation decisions unique to the communities they host. This way, commercial servers can meaningfully contribute to strengthening the safety of communities in decentralized social media.
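The sketch below shows what "easily ingestible" could look like in practice: a large server publishes a machine-readable list of defederated domains with a stated reason, and a smaller server filters it against its own policies before adopting any entries. The URL, file format, and column names are hypothetical; Threads' actual published list may differ.

```python
# Sketch of ingesting a hypothetical published defederation list (CSV with
# "domain" and "reason" columns) and keeping only entries that match local policy.
import csv
import io
import urllib.request

LIST_URL = "https://large-server.example/moderated_servers.csv"  # hypothetical endpoint

def fetch_blocklist(accepted_reasons: set[str]) -> list[str]:
    with urllib.request.urlopen(LIST_URL) as resp:
        reader = csv.DictReader(io.TextIOWrapper(resp, encoding="utf-8"))
        # Adopt only entries whose stated reason matches this server's own policies.
        return [row["domain"] for row in reader if row["reason"] in accepted_reasons]

# A volunteer-run server might adopt spam-related blocks automatically but review
# policy-based ones by hand:
# blocked = fetch_blocklist(accepted_reasons={"spam", "coordinated_abuse"})
```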

Another limitation smaller servers face is the lack of automated tooling, which means that administrators and moderators often have to conduct most moderation manually. Commercial servers can help fill that gap by open-sourcing more automated tools for content filtering and detection, or by providing these tools as a service to smaller servers. An initiative titled ROOST, for example, has been working to help smaller servers identify and curate tools that could support their moderation efforts.22

The value of decentralized social media lies in empowering freedom of expression and user choice. Realizing that value requires effective moderation, so that spaces are governed according to their users’ preferences. Commercial servers can uplift this work, complementing the existing efforts of actors like the Social Web Foundation.23 Together, all can play an important role in creating a diverse, sustainable future for decentralized social media.

Acknowledgments

This work was led by researchers at the Carnegie Endowment for International Peace, with support from Meta. Carnegie is wholly and solely responsible for the contents of its products, written or otherwise.

Notes

Carnegie does not take institutional positions on public policy issues; the views represented herein are those of the author(s) and do not necessarily reflect the views of Carnegie, its staff, or its trustees.