Extensibility – Matthew Exon Fediverse

Can someone explain to me how it makes any sense for Pixelfed to be a thing that has its own accounts, rather than being an ActivityPub client whose GUI just happens to be image-focused?

This is a question about extensibility, and how that is supported or not by the current crop of ActivityPub software.

Pixelfed, it turns out, has a handful of special features, and these require special data. But it’s hard to understand why this is so essential for Pixelfed, so let’s not talk about that. Instead let’s take the example Evan Prodromou uses in his book: a chess-playing client.

I’ve developed a phone app called APChess. It works alright as a basic ActivityPub client: Alice can follow people, post updates, mention people to start a conversation, that sort of thing. But its USP is that, instead of sending Bob a message, she can start a game of chess. The chessboard is displayed prettily in APChess. But when she makes a move, it sends that as an ActivityPub message. Then Bob can make his move, and the result comes back as another ActivityPub message.

{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    {
      "chess": "https://chess-world.test/schema#"
    }
  ],
  "id": "https://chess-world.test/users/alice/statuses/113844248096003446/activity",
  "type": "Create",
  "actor": "https://chess-world.test/users/alice",
  "published": "2025-01-17T14:49:27Z",
  "to": [
    "https://www.w3.org/ns/activitystreams#Public"
  ],
  "object": {
    "id": "https://chess-world.test/users/alice/statuses/113844248096003446",
    "type": "chess:Move",
    "published": "2025-01-17T14:49:27Z",
    "url": "https://chess-world.test/@alice/113844248096003446",
    "attributedTo": "https://chess-world.test/users/alice",
    "to": [
      "https://social-chess.test/users/bob"
    ],
    "chess:from": "e4",
    "chess:to": "e6"
  }
}

To make this work I had to add a new object type “Move”. And a move contains two fields: “from” and “to”. This looks rather different to a normal status update, which instead has HTML in a “content” field:

{
  "@context": [
    "https://www.w3.org/ns/activitystreams"
  ],
  "id": "https://chess-world.test/users/alice/statuses/113844248096003446/activity",
  "type": "Create",
  "actor": "https://chess-world.test/users/alice",
  "published": "2025-01-17T14:49:27Z",
  "to": [
    "https://www.w3.org/ns/activitystreams#Public"
  ],
  "object": {
    "id": "https://chess-world.test/users/alice/statuses/113844248096003446",
    "type": "Note",
    "published": "2025-01-17T14:49:27Z",
    "url": "https://chess-world.test/@alice/113844248096003446",
    "attributedTo": "https://chess-world.test/users/alice",
    "to": [
      "https://social-chess.test/users/bob"
    ],
    "content": "<p>LOL 😜</p>"
  }
}
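To make the difference concrete, here is a minimal Python sketch of how a client like APChess might build a move and tell the two object types apart. Nothing here is real APChess code: the function names `make_move_activity` and `describe` are hypothetical, the message is a trimmed version of the examples above, and the trailing `#` on the chess namespace is an assumption so that `chess:Move` expands to a sensible URL.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"
# Assumption: the chess vocabulary lives at this URL, with a trailing "#".
CHESS_NS = "https://chess-world.test/schema#"


def make_move_activity(actor, recipient, status_id, move_from, move_to, published):
    """Wrap a chess move in a Create activity (hypothetical helper)."""
    object_id = f"{actor}/statuses/{status_id}"
    return {
        "@context": [AS_CONTEXT, {"chess": CHESS_NS}],
        "id": f"{object_id}/activity",
        "type": "Create",
        "actor": actor,
        "published": published,
        "to": [PUBLIC],
        "object": {
            "id": object_id,
            "type": "chess:Move",
            "published": published,
            "attributedTo": actor,
            "to": [recipient],
            "chess:from": move_from,
            "chess:to": move_to,
        },
    }


def describe(activity):
    """Render an incoming activity as one line of text, based on its object type."""
    obj = activity["object"]
    if obj["type"] == "chess:Move":
        return f"move {obj['chess:from']} -> {obj['chess:to']}"
    if obj["type"] == "Note":
        return f"note {obj['content']}"
    return f"unsupported object type {obj['type']}"


move = make_move_activity(
    actor="https://chess-world.test/users/alice",
    recipient="https://social-chess.test/users/bob",
    status_id="113844248096003446",
    move_from="e4",
    move_to="e6",
    published="2025-01-17T14:49:27Z",
)
print(describe(move))  # move e4 -> e6
print(json.dumps(move["object"], indent=2))
```

The only thing that separates the two cases is the `type` and which fields the client expects to find next to it. Everything else in the envelope is shared.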

But that’s two clients, both phones, both running APChess. For this to work, each needs to connect to a server. In general, Alice and Bob are on different servers. Alice sends her move to her server, which forwards it on to Bob’s server. And then, maybe much later, Bob’s phone will connect to his server and retrieve the message. So Bob’s server also functions as a data store.

The problem with developing APChess today is that both servers are probably running Mastodon. Mastodon acts as a data store. But it stores data in relational tables with a fixed schema. There is no place to store the “from” and “to” of the chess move. That information effectively gets stripped off on its way from Alice to Bob. And what’s left? An empty, useless message.
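A toy illustration of how that stripping happens, in Python. This is not Mastodon’s code or its actual schema: `KNOWN_COLUMNS` is a made-up, heavily simplified stand-in for a fixed table layout, and both function names are hypothetical.

```python
import json

# Assumption: a made-up, much simplified "statuses" table. Real servers have
# far more columns, but the principle is the same: only known fields survive.
KNOWN_COLUMNS = ("id", "type", "published", "attributedTo", "content")


def store_fixed_schema(obj):
    """Keep only the fields that have a column; everything else is dropped."""
    return {column: obj.get(column) for column in KNOWN_COLUMNS}


def store_verbatim(obj):
    """Keep the document exactly as received, serialised as JSON text."""
    return json.dumps(obj)


move = {
    "id": "https://chess-world.test/users/alice/statuses/113844248096003446",
    "type": "chess:Move",
    "published": "2025-01-17T14:49:27Z",
    "attributedTo": "https://chess-world.test/users/alice",
    "chess:from": "e4",
    "chess:to": "e6",
}

row = store_fixed_schema(move)
print(row["content"])       # None: there was never any HTML to store
print("chess:from" in row)  # False: the move itself is gone

restored = json.loads(store_verbatim(move))
print(restored["chess:from"], restored["chess:to"])  # e4 e6
```

The fixed-schema row still has an ID, an author and a timestamp, which is exactly the empty, useless message Bob ends up with.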

So in practice, if I develop the APChess phone app, it’s only useful if I also develop my own APChess server software. I can then build the database schema around the data I want to store. That’s a lot of work. And the resulting server will be far inferior to Mastodon for anything except playing chess. Alice and Bob will, in practice, need to have both a Mastodon account and an APChess account, with separate follow relationships. Sure, other people can still follow their APChess accounts from Mastodon. But they won’t see anything interesting.

Email

Compare that to an email server. Email servers usually don’t use a relational database as their primary data store. Instead they store files on disks. The data in those files reflects exactly what the client sent over SMTP. The data has headers and a body. Most of the headers are added by the server automatically, but it only adds headers to what’s already there. It doesn’t care if there are headers that it doesn’t understand. So clients can, and do, use their own headers to record custom metadata about the email.

So I wouldn’t use ActivityPub, I’d use email. To get the result my users want, a better approach is to develop a phone app called SMTPChess. Alice can use it to exchange simple emails with Bob. But as a special feature, instead of sending an email, she can start a game of chess. When she makes a move, that’s encoded as custom email headers.

To: Bob <bob@social-chess.test>
Subject: New chess move from Alice
Date: Fri, 17 Jan 2025 14:49:28 +0000
From: Alice <alice@chess-world.test>
Message-ID: <00c8b62bd766e335ae3242b3a0092293@chess-world.test>
X-SMTPChess-From: e4
X-SMTPChess-To: e6
Content-Type: text/plain

This email is best viewed in SMTPChess.

This is much better because Alice and Bob can use their existing mail accounts. In fact, SMTPChess is kinda clumsy as a general email app. Alice and Bob can use Google’s email app, or K-9 Mail, or whatever they want side-by-side with SMTPChess. Those apps will see chess move emails in the inbox, and not be able to display them properly. But Alice and Bob understand what they’re about and are happy to switch to SMTPChess when they see them.
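Here is roughly what the SMTPChess side of that could look like, sketched with Python’s standard `email` package. The `X-SMTPChess-*` header names come from the example above; the helper names `build_move_email` and `read_move` are hypothetical, and actually sending the message over SMTP is left out.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def build_move_email(sender, recipient, move_from, move_to):
    """Encode a chess move as custom headers on an otherwise ordinary email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "New chess move from Alice"
    msg["X-SMTPChess-From"] = move_from
    msg["X-SMTPChess-To"] = move_to
    # Fallback body for mail clients that know nothing about chess.
    msg.set_content("This email is best viewed in SMTPChess.")
    return msg


def read_move(raw_text):
    """Return (from, to) if the email carries a move, else None."""
    msg = message_from_string(raw_text, policy=policy.default)
    move_from = msg["X-SMTPChess-From"]
    move_to = msg["X-SMTPChess-To"]
    if move_from is None or move_to is None:
        return None
    return str(move_from), str(move_to)


raw = build_move_email(
    "Alice <alice@chess-world.test>", "Bob <bob@social-chess.test>", "e4", "e6"
).as_string()
print(read_move(raw))  # ('e4', 'e6')
```

An ordinary mail client runs the same parse and simply never asks for the `X-SMTPChess-*` headers. They sit in the stored message, ignored but intact.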

Network Effects

The problems APChess faces in the Fediverse are an example of network effects. We’re all extremely familiar with network effects on the Internet. In fact they rule our online lives. Metcalfe’s Law says that the value of a network is proportional to the square of the number of people using it. This leads to natural monopolies and all sorts of undesirable economic and social consequences.

But I think this misses something. I would argue that there are two, and exactly two, kinds of network effect. In the same way that an infinite set is either countable or uncountable, and the uncountable ones are much larger, one kind of network effect is much stronger than the other.

The first kind of network effect is when Alice and Bob each actively decide they would like to play chess. The more people who want to play chess, the more valuable chess is as a game. This is the picture you get if you google “Metcalfe’s Law”.

The basic image of network effects from Wikipedia, with successively 2, 5, and 12 telephones.
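The arithmetic behind that picture is just counting pairs: every two telephones can form one connection. A few lines of Python reproduce the numbers for the three networks in the image (the function name is mine, not a standard one).

```python
def possible_connections(n):
    """Number of distinct pairs among n participants: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for phones in (2, 5, 12):
    print(phones, possible_connections(phones))
# 2 1
# 5 10
# 12 66
```

The count of possible connections grows much faster than the count of telephones, which is the whole point of the diagram.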

But this type of network effect is quite weak. If Alice and Bob are the only two chess players in the world, that’s enough for a game if they are both motivated. It’s not like they need to swear off all kinds of interaction with other human beings entirely to do so.

The second kind of network effect is when Alice and Bob want to play chess, but can only do so via a public intermediary that we will call X. And the efficiency of X scales with the number of people using it. In this scenario, by far the easiest way for Alice and Bob to play chess is to use the same X as everyone else in the world. But if X doesn’t support chess, what options do they have? They can try to set up their own personal alternative to X, but it will be expensive. Alice and Bob will be strongly motivated to just find a different hobby.

The first kind of network effect we might call the Microsoft network effect. In the 90s everyone needed Microsoft Word. That was because if you wanted to send a document to someone, you could rely on them having Microsoft Word, so that’s what you used. That was enough for a monopoly to form. But not a complete monopoly. In this period I used AbiWord. It was fine.

The second kind of network effect is the Facebook network effect. And as big tech moved relentlessly into the cloud, that’s the world we all became used to. It’s no longer enough for two people to choose how to get things done. Now the communication is dependent on the choices of a middle-man shared by the whole of the rest of the world. People lost the habit of getting real work done on their own computer, and sharing data directly with others on their computers. Instead we sent our data to Facebook, and relied on Facebook to pass it on where we really wanted it to go. When I switched off Facebook, all I had was Friendica. It was not fine.

Email is an artifact of the earlier era. The network effects of email are still powerful – everyone needs an email account. But it was built with a perhaps-unconscious assumption about how the world works. There is no single central authority, just a widely-dispersed set of nerds. Of course email had to be peer-to-peer, federated, and extensible. That provided an opportunity for extending email for particular applications, or replacing it entirely. And that’s the opportunity that SMTPChess has which APChess lacks.

Clients and Servers

Mastodon’s dominance of the Fediverse is an example of a Facebook network effect. You can communicate in ways outside the small subset supported by Mastodon, using platforms like WordPress and Ghost instead. But most of the audience is on Mastodon, and people are hungry for reactions. It’s just more rewarding to trim things down to what Mastodon supports and post it once. That’s why Alice is connected to far more people on Mastodon than she is on APChess, and so is Bob. That makes it astronomically more likely that Alice and Bob will be connected on Mastodon than on APChess, even if they both use the app.

If APChess was an app that could communicate through Mastodon, things would look very different. Alice and Bob are connected on Mastodon. Their profiles both say they’re interested in chess. It’s easy and natural for Alice to open up her APChess app and start a game with Bob. It’s only Alice and Bob’s choices that matter, not Mastodon’s. And that would turn this into a Microsoft network effect instead of a Facebook network effect.

We all remember the Microsoft monopolies of the 2000s. Life was great in the 2000s. We all want to go back to that, right? Yes. Of course we do.

A nerd with long shaggy red hair perched in a deep window frame in an apartment, looking pensively out at the sunny world outside. It was actually really uncomfortable sitting like that and I only did it because I thought it would make a cute selfie. — The author, 2005. “Lines of light ranged in the nonspace of the mind. I wish you could see it.”

This is why this problem of Pixelfed feels really important to me. Microblogging is not remotely what I want from the Internet. I want articles not tweets. OK, that’s just me, fine. But it seems to me that microblogging isn’t what most other people want either. Twitter never came close to being as popular as Facebook. Partly that’s because Facebook just does so much more stuff than Twitter ever did, with features like events and groups, and these things really do matter to people.

The Fediverse ended up in a place where all my contacts are Mastodon contacts, even though I don’t use Mastodon myself. We all know that ActivityPub is capable of so much more. But in practice any other application would involve setting up an entirely parallel social network. And if that was easy, we’d have defeated Facebook by now.

Extensible Fediverse

So what would we have to change about the ActivityPub standard to make it work more like email?

Nothing.

ActivityPub messages are defined by ActivityStreams. ActivityStreams messages are built on a generic structured data format called JSON-LD. JSON-LD is like JSON, but much harder to use in basically every way. My examples above were stripped down. Here’s a real message. You have my permission to scroll straight past it.

{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    {
      "ostatus": "http://ostatus.org#",
      "atomUri": "ostatus:atomUri",
      "inReplyToAtomUri": "ostatus:inReplyToAtomUri",
      "conversation": "ostatus:conversation",
      "sensitive": "as:sensitive",
      "toot": "http://joinmastodon.org/ns#",
      "votersCount": "toot:votersCount"
    }
  ],
  "id": "https://mastodon.social/users/jwz/statuses/113828967578558803",
  "type": "Note",
  "summary": null,
  "inReplyTo": null,
  "published": "2025-01-14T22:03:25Z",
  "url": "https://mastodon.social/@jwz/113828967578558803",
  "attributedTo": "https://mastodon.social/users/jwz",
  "to": [
    "https://www.w3.org/ns/activitystreams#Public"
  ],
  "cc": [
    "https://mastodon.social/users/jwz/followers"
  ],
  "sensitive": false,
  "atomUri": "https://mastodon.social/users/jwz/statuses/113828967578558803",
  "inReplyToAtomUri": null,
  "conversation": "tag:mastodon.social,2025-01-14:objectId=894518844:objectType=Conversation",
  "content": "<p>Can someone explain to me how it makes any sense for Pixelfed to be a thing that has its own accounts, rather than being an ActivityPub client whose GUI just happens to be image-focused?</p>",
  "contentMap": {
    "en": "<p>Can someone explain to me how it makes any sense for Pixelfed to be a thing that has its own accounts, rather than being an ActivityPub client whose GUI just happens to be image-focused?</p>"
  },
  "attachment": [],
  "tag": [],
  "replies": {
    "id": "https://mastodon.social/users/jwz/statuses/113828967578558803/replies",
    "type": "Collection",
    "first": {
      "type": "CollectionPage",
      "next": "https://mastodon.social/users/jwz/statuses/113828967578558803/replies?min_id=113836877414089315&page=true",
      "partOf": "https://mastodon.social/users/jwz/statuses/113828967578558803/replies",
      "items": [
        "https://mastodon.social/users/jwz/statuses/113829061261774316",
        "https://mastodon.social/users/jwz/statuses/113829068325141705",
        "https://mastodon.social/users/jwz/statuses/113836877414089315"
      ]
    }
  },
  "likes": {
    "id": "https://mastodon.social/users/jwz/statuses/113828967578558803/likes",
    "type": "Collection",
    "totalItems": 160
  },
  "shares": {
    "id": "https://mastodon.social/users/jwz/statuses/113828967578558803/shares",
    "type": "Collection",
    "totalItems": 56
  }
}

Much of that is fairly readable. It’s verbose mostly because the IDs are absurdly long URLs, but you get used to that. What stands out though is the weirdness of @context. What’s that doing there?

The answer comes down to our chess structure. We want to have fields “from” and “to”, and the type should be “Move”. But that’s for APChess. Someone out there is developing APXiangQi. They also have type “Move”, and “from” and “to”, but instead of “from” being “e4” it would be “32”. That’s horribly confusing! Imagine the bugs encountered when a user is playing chess and xiangqi on the same account.

So we have to namespace these identifiers. And since this is the whole world on one network, we might as well use URLs as namespaces. These are not just guaranteed unique, they also provide a handy place to store the detailed schemas of these data structures. So we have https://apchess.test/schema and https://apxiangqi.test/schema. Then one uses https://apchess.test/schema#from and the other uses https://apxiangqi.test/schema#from. Bulletproof.

The problem then is that all of these URLs really clog up our JSON. We can probably cope with ridiculously long IDs, but ridiculously long keys are just silly. So, to make it a little more manageable, all the common URLs are stored in the special @context element. We define chess: "https://apchess.test/schema#", and then we can talk about chess:from, which stands for https://apchess.test/schema#from. This should make people slightly less angry at us.
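The expansion itself is mechanical. Below is a toy Python version of the prefix lookup, to show why two apps can both use “from” without colliding. It is emphatically not a JSON-LD processor: the `expand` function is hypothetical, it handles only simple prefix definitions, and the trailing `#` on each namespace is an assumption so that prefix plus name gives the URLs mentioned above.

```python
def expand(key, context):
    """Expand a compact key like 'chess:from' using the prefixes in context.

    A toy version of what a JSON-LD processor does; it ignores everything
    else a real @context can contain.
    """
    prefix, sep, suffix = key.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix
    return key


# Assumption: both vocabularies end in "#" so that prefix + name is a valid URL.
chess_context = {"chess": "https://apchess.test/schema#"}
xiangqi_context = {"xq": "https://apxiangqi.test/schema#"}

print(expand("chess:from", chess_context))  # https://apchess.test/schema#from
print(expand("xq:from", xiangqi_context))   # https://apxiangqi.test/schema#from
print(expand("content", chess_context))     # content
```

The short prefix is only a local nickname. Two messages could both say `game:from` and still mean different things, because each one carries its own @context saying what `game` stands for.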

Unfortunately, people still get extremely angry anyway.

What did we gain? What we’ve achieved here is the ability to mix and match schemas. A single message can contain a core set of common fields that even Mastodon can understand, while also carrying multiple rich data sets for other applications. And those applications can share common structures instead of reinventing the wheel. APChess can recycle chess notation from schemas defined for other purposes, automatically gaining compatibility with a whole ecosystem of tools. But this mixing comes at the expense of making every developer unhappy and angry when they try to wrestle with all this @context nonsense. We would never have done this unless infinite extensibility was explicitly the core goal of the exercise. But that’s what we wanted and we forced everyone to live with it.

And then they stuck the data in a relational database and threw away the bits that don’t fit. Fuck.

Excuses

I use Friendica and occasionally submit patches upstream, so it’s the system I’m most familiar with. Friendica does exactly that dumb thing I complained about. And I get why.

When I say I use Friendica, I mean I’ve been using it since 2011. The Fediverse is generally considered to have started in 2008, whereas ActivityPub only became a W3C Recommendation in 2018. Which is to say, ActivityPub is still the new kid here, fumbling around trying to figure out how things work.

Friendica’s original USP was that it connected to everything: Facebook, Twitter, WordPress, plus anything that had RSS, plus even email. All of this was presented with a single consistent interface. With that many different data sources, you have no choice but to impose order on it all. Ultimately all of these messages kinda look the same, with an ID, a title, a body, and a set of recipients. The Friendica devs just decided what’s a “good” structure, and bashed everything else around until it fit into that structure.

Friendica is not at all ready for a world where everyone uses the same underlying rich data structure, but the content widely varies.

As it happens, Mastodon dominates the Fediverse, and it has an even simpler structure than Friendica. And, like Friendica, Mastodon predates ActivityPub. Everyone bends over backwards to be accessible to Mastodon users, so we naturally converge on a common subset of data fields. This fits Friendica well, and every other “alternative” implementation. And we arrive where we are now.

But these two platforms are doing exactly the thing I wish would happen more: first build a community, and only as a second step connect to the Fediverse. The value of a community is its culture. That culture can only develop if it is somewhat isolated. It need not be entirely quarantined, but soft barriers are good. And a community that evolves in relative isolation will be supported by software with an opinionated schema.

So I don’t think Mastodon or Friendica did anything wrong, as such. But I would like to see some new platforms that are more open. And over time, I would like general-purpose platforms like Mastodon and Friendica to take the hint and find a way to be more open as well.

How

To get back to the original vision of ActivityPub, we need servers that store the full JSON-LD data rather than only the part that fits into a schema. The easiest way to do this is just to store the JSON-LD data as files on disk. In fact, this is so much easier than setting up a database in the first place, that when Terence Eden set out to build the smallest ActivityPub server he could, that’s exactly the approach he took. As a result, his tiny server is in many ways far more capable than Friendica.
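A file-per-object store really is only a few lines. The sketch below is not Terence Eden’s code, just my own illustration of the general idea in Python; the `FileStore` class and its naming scheme (a SHA-256 hash of the object’s `id`) are assumptions chosen for the example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class FileStore:
    """Store each received object as one JSON file, named after its id."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path_for(self, object_id):
        # Hash the id so that arbitrary URLs become safe file names.
        digest = hashlib.sha256(object_id.encode("utf-8")).hexdigest()
        return self.root / f"{digest}.json"

    def put(self, obj):
        """Write the whole object to disk, unknown fields included."""
        self._path_for(obj["id"]).write_text(json.dumps(obj), encoding="utf-8")

    def get(self, object_id):
        """Return the stored object, or None if we never received it."""
        path = self._path_for(object_id)
        if not path.exists():
            return None
        return json.loads(path.read_text(encoding="utf-8"))


move = {
    "id": "https://chess-world.test/users/alice/statuses/113844248096003446",
    "type": "chess:Move",
    "chess:from": "e4",
    "chess:to": "e6",
}

with tempfile.TemporaryDirectory() as tmp:
    store = FileStore(tmp)
    store.put(move)
    loaded = store.get(move["id"])
    missing = store.get("https://chess-world.test/users/alice/statuses/0")

print(loaded == move)  # True
print(missing)         # None
```

Because the store never looks inside the object beyond its `id`, it has no opinion about which fields are allowed. That’s the property we want.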

It is, however, not efficient. So probably some kind of database is required. I guess this is that “NoSQL” thing that the kids talk about these days. But I would note that JSON-LD is actually just RDF wearing the skin of a recently-slaughtered millennial. Which means a triplestore is the natural home of this data. I’ve never heard of anyone ever building a triplestore that was useful. I once spent months trying to build one myself, and came as close as I ever have to needing an intervention in the process. But perhaps this long-dead and much-cursed idea is due a revival.
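For anyone who hasn’t met the idea, a triplestore holds nothing but (subject, predicate, object) rows. Here is a toy Python flattening of a chess move into triples, with a trivial query on top. It is a sketch under heavy assumptions: `to_triples` is hypothetical, it only handles a flat object with simple prefixes, and unprefixed keys are left as they are, whereas real JSON-LD to RDF conversion deals with nesting, types, lists and a great deal more.

```python
def to_triples(obj, context):
    """Flatten one flat JSON object into (subject, predicate, object) triples.

    A toy illustration only, not the real JSON-LD to RDF algorithm.
    """
    subject = obj["id"]
    triples = []
    for key, value in obj.items():
        if key in ("id", "@context"):
            continue
        prefix, sep, suffix = key.partition(":")
        if sep and prefix in context:
            predicate = context[prefix] + suffix
        else:
            predicate = key  # unprefixed keys are left unexpanded here
        triples.append((subject, predicate, value))
    return triples


# Assumption: the chess namespace ends in "#".
context = {"chess": "https://apchess.test/schema#"}
move = {
    "id": "https://chess-world.test/users/alice/statuses/113844248096003446",
    "chess:from": "e4",
    "chess:to": "e6",
}

triples = to_triples(move, context)
for triple in triples:
    print(triple)

# A "query": which objects record a move starting from e4?
FROM = "https://apchess.test/schema#from"
matches = [s for (s, p, o) in triples if p == FROM and o == "e4"]
print(matches)
```

Note that there is no table for chess moves anywhere. A new application adds new predicates, not new columns.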

Pepe Silvia meme of Charlie Kelly looking manic in front of a corkboard covered in scraps of paper connected by red tape. — Don’t you see, *the schema is also data*!

Or Not

There are plenty of voices on the Fediverse arguing that the current approach really is for the best. It causes huge friction when people post content not supported by Mastodon. Even different Mastodon instances have different character limits, and users on instances with smaller limits get angry when people exceed those limits. They’re not wrong! Verbosity is a curse. Emacs tells me this buffer is now at 3262 words. This article was a mistake.

Civilisation is compromise. I’m typing this in English, after all. Perhaps standardising on Mastodon-style microblogging produces compelling enough benefits that it’s worth throwing away the extensibility. And perhaps the governance of Mastodon is robust enough that we can safely entrust the schema to that platform, with the option of running a parallel network for the applications that just don’t fit.

But to me, this feels like a badly missed opportunity to take advantage of the design we have today. And it certainly feels like a massive fucking waste of my time making me bend my head around JSON-LD if we’re not even gonna use it.