Why implementing ActivityPub is hard, and why it doesn't have to be
33 points by hongminhee
This is why so many ActivityPub projects are forks of one another: it's easier to figure out someone else's approach than to code it all up yourself. What the author proposes is not so different from the usual fork of Misskey or Pleroma one would find in the wild: the library has its opinions and approaches and doesn't seem to give you much control. But at least it doesn't also force its UI on you, like forking a full server does.
As a person also working on implementing AP, for me the hardest part is: there is no good way to use JSON-LD. If it were easy to convert to a canonical representation of an object, all the interactions would fall out of it. Using it as a real linked document is too inefficient, and using a document as raw JSON means death by a million corner cases (so far I chose the second approach and died)
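To make one class of those corner cases concrete: in ActivityStreams JSON a property such as `to` or `cc` may be absent, a single IRI string, an embedded object, or an array mixing both. A minimal sketch of the kind of normalization a raw-JSON consumer ends up writing follows; the helper names `as_list` and `ids_of` are made up for illustration, and this does not attempt real JSON-LD processing (contexts, `@id` aliases, compact IRIs).

```python
from typing import Any, List


def as_list(value: Any) -> list:
    """Wrap a JSON value that may be absent, a single item, or an array."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def ids_of(value: Any) -> List[str]:
    """Collect IDs from a property holding IRIs, embedded objects, or a mix."""
    ids = []
    for item in as_list(value):
        if isinstance(item, str):
            # A bare string is treated as the IRI itself.
            ids.append(item)
        elif isinstance(item, dict) and isinstance(item.get("id"), str):
            # An embedded object contributes its "id"; objects without one are skipped.
            ids.append(item["id"])
    return ids


# Hypothetical example document, not taken from any real server.
note = {
    "type": "Note",
    "to": "https://example.com/users/alice",
    "cc": [
        {"id": "https://example.com/users/bob", "type": "Person"},
        "https://www.w3.org/ns/activitystreams#Public",
    ],
}
```

Every property that can hold a reference needs this treatment, which is roughly where the million corner cases come from.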
If it were easy to convert to a canonical representation of an object, all the interactions would fall out of it
That is of especial note given the use of signatures. Back in the day XML canonicalization existed for this very signature problem: ensure that the byte serialization on the receiver matched the byte serialization of the sender. It's not apples-to-apples in a JSON-LD world, but it's not completely orthogonal, either
However, I think quite a few JSON-adjacent technologies suffer from the problem you're describing. There are a lot of roughly equivalent JSON Schema representations for the same logical schema, and that makes interacting with anything JSON Schema-adjacent just comically terrible. That includes the absolute horror show of OpenAPI schema, which is close to JSON Schema but not the same, and that is not even counting the number of draft releases of the schema spec
If it were easy to convert to a canonical representation of an object, all the interactions would fall out of it. Using it as a real linked document is too inefficient, and using a document as raw JSON means death by a million corner cases (so far I chose the second approach and died)
I've been thinking about how to implement an AP server, but haven't started, so take this with more than a grain of salt. One thing that might help would be to separate the application into smaller services, leaning into the Actor model more to present them as a 'unified' interface☨. One example would be to learn from the MTA vs MUA split in email servers.
The AP 'MTA' service takes care of sending messages out of outboxes and receiving messages for its inboxes. The JSON-LD documents are more or less blobs as far as it cares. There is some parsing required to determine the sender/recipient but not much other than that. The storage could even be file-based, like go-ap uses FWIR (~mariusor please correct me if I'm wrong)
The AP 'MUA' is the application, the one that actually needs to understand JSON-LD semantics. It would probably use something like PostgreSQL to store the documents as jsonb, plus generated columns and views to present the data in an SQL-friendly way. That way we can decide the best presentation for the document depending on the Object's type.
☨: Another example: the search service could be modeled as an actor that returns the results as an ephemeral outbox.
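A rough sketch of how little the 'MTA' side would need to parse, assuming plain (compacted) ActivityStreams JSON. The `Envelope` type and `make_envelope` function are hypothetical names, not part of go-ap or any other library. The addressing fields are the ones ActivityPub defines (`to`, `bto`, `cc`, `bcc`, `audience`); resolving collections such as followers into inboxes, and stripping `bto`/`bcc` before delivery, are deliberately left out.

```python
import json
from dataclasses import dataclass

ADDRESSING_FIELDS = ("to", "bto", "cc", "bcc", "audience")


@dataclass(frozen=True)
class Envelope:
    sender: str            # actor ID taken from the activity
    recipients: frozenset  # every addressed ID, still unresolved
    payload: bytes         # the original document, kept as an opaque blob


def _ids(value):
    """Return IDs from a value that may be a string, an object, or a list of either."""
    items = value if isinstance(value, list) else [value]
    out = []
    for item in items:
        if isinstance(item, str):
            out.append(item)
        elif isinstance(item, dict) and isinstance(item.get("id"), str):
            out.append(item["id"])
    return out


def make_envelope(raw: bytes) -> Envelope:
    """Parse just enough of an activity to route it; everything else stays a blob."""
    doc = json.loads(raw)
    sender = _ids(doc.get("actor"))
    if not sender:
        raise ValueError("activity has no actor")
    recipients = set()
    for field in ADDRESSING_FIELDS:
        recipients.update(_ids(doc.get(field)))
    return Envelope(sender[0], frozenset(recipients), raw)
```

The 'MUA' would then be the only component that looks inside `payload`.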
The AP 'MTA' service takes care of sending messages out of outboxes and receiving messages for its inboxes.
That's generally how GoActivityPub is meant to work, albeit in library form, with whatever other logic a developer wants to add on top of that.
Basically the library can process the incoming activities and as an end result, the calling code can receive a fully populated ActivityPub object from which they extract whichever information they find useful and do with it as they will.
PS. The library has multiple storage backends. The filesystem one is probably the most robust, as I use it for dev tests, but there are SQLite, PostgreSQL and a couple of KV stores available.
Invaluable list of idiosyncrasies and their mitigations for various implementations.
Sadly, I haven't implemented at least half of them in GoActivityPub. :)
This post started out as a technical one, and I was grateful, but then seemed to pivot to pitching their framework, which felt less pleasing to read
I am happy that a subset of the world who happens to be using TypeScript can hopefully not have to rediscover all of those implementation quirks. However, in my mental model, if there were a record of "oh, hey, if $context and $circumstance then $outcome but there is $bugfix" (one may think of them as bugs, although I don't mean it as "bugs in fedify" but rather "bugs fixed by fedify") then other non-TypeScript situations, e.g. the sibling GoActivityPub author, could benefit from all the hard-fought battles. This post touched on several of them, but is a fixed point in time, versus the project, which seems to be trying to capture all such interoperability bugs over time
The current alternative, as best I can tell, would be to read through every non-human commit message in hopes of distinguishing those two classes of bugs I mentioned (filtering out fedify bugs from interoperability bugs)
The project choosing not to do that bookkeeping, even in what appears to be a repo that is "all in" on AI, is especially ironic given that the whole spiel I have heard about LLMs is that they automate drudgery. If so, fantastic: have Claude make a GH issue, or better yet a .md file in the repo, documenting the observed outcome and then how fedify fixes it. It has its own debugger and whatever 'best practices' means, so it should be right up their alley
This post started out as a technical one, and I was grateful, but then seemed to pivot to pitching their framework, which felt less pleasing to read
Indeed, it catastrophizes over the smallest of issues and presents them as ActivityPub failures. For example:
With five thousand followers, one post means thousands of HTTP deliveries. Do that inline in the request handler and your publish button takes half a minute to respond, or the server falls over. Fine, use a queue.
Why would you perform the requests to third-party services inline? Using a queue for this is web application 101. If you need to talk to a third-party service, it goes on a background job. If the information is not necessary to respond to the request, it goes on a background job. Any problems you run into by doing the requests in the request handler are strictly a "step on a rake, get hit in the face" problem and have nothing to do with ActivityPub.
Deliveries fail, so retry them. On what schedule? Exponential backoff. How many times? And is a 500 Internal Server Error the same kind of failure as a 410 Gone?
So regular web application development problems? These are issues unrelated to ActivityPub that one runs into whenever making requests to third-party services from job queues. And for the most part web application frameworks have reasonable defaults. Only when we ask what kind of error we ran into do we need to start deciding whether or not to retry. And retrying on a 410, although wasteful, is not an issue that one needs to tackle urgently. It does create more memory pressure on your job queue, but it's not something that is likely to take down your application in a couple of hours.
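For scale, the retry decision being discussed fits in a few lines. This is a minimal sketch under stated assumptions: the function names, the set of retryable statuses, the 60 second base delay, the 6 hour cap and the 8 attempt limit are all illustrative choices, not taken from ActivityPub, fedify, or any job queue framework.

```python
import random
from typing import Optional


def should_retry(status: Optional[int]) -> bool:
    """Decide whether a failed delivery is worth retrying.

    `status` is the HTTP status code, or None for a network-level failure.
    Assumption: timeouts, rate limits and 5xx are transient; other 4xx
    responses, including 410 Gone, are treated as permanent.
    """
    if status is None:
        return True
    if status in (408, 429):
        return True
    return 500 <= status <= 599


def backoff_delay(failures: int, base: float = 60.0, cap: float = 21600.0,
                  rng: Optional[random.Random] = None) -> float:
    """Exponential backoff in seconds after `failures` failed attempts (>= 1).

    Passing an `rng` applies full jitter (a uniform pick between 0 and the
    computed ceiling) so that retries from many jobs do not line up.
    """
    ceiling = min(cap, base * (2 ** (failures - 1)))
    if rng is None:
        return ceiling
    return rng.uniform(0, ceiling)


def next_retry(status: Optional[int], failures: int,
               max_attempts: int = 8) -> Optional[float]:
    """Return the delay before the next attempt, or None to give up."""
    if failures >= max_attempts or not should_retry(status):
        return None
    return backoff_delay(failures)
```

With these assumptions a 500 is rescheduled with a growing delay while a 410 is dropped on the first failure, which is the distinction the quoted passage asks about.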
see whether it gets rejected, re-sign with the other, and remember per server which one worked
What am I reading!?! Is this why Mastodon development is slow?
one post means thousands of HTTP deliveries
Ah yeah, in Ruby, the famously capable language for network systems programming and queuing.
Unbelievable and nice to have it all wrapped in a library but still…
After implementing ActivityPub in Java I reached the conclusion that server-to-server protocols like this are better off just building on top of git. So much of the complexity is there to solve problems that git already solves better. If you model this as JSON documents in a git repo, you no longer deal with pagination (the protocol ensures you are only sent the data you don't have), you get commit signing, you get guaranteed order of events (a problem mentioned in this post), you get history for free, and so on. You could probably make an analogue to Greenspun's tenth rule about these protocols containing a buggy, slow implementation of half of git.
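A toy illustration of the two properties named above, ordering and "only send what the other side lacks", using a hash-linked log of JSON documents. This is not git and not a proposal for a wire format; `Log`, `entry_id` and the method names are invented for the sketch, and it only handles a single linear history.

```python
import hashlib
import json


def entry_id(parent, doc):
    """Hash an entry together with its parent ID, so the ID fixes the whole history."""
    body = json.dumps({"parent": parent, "doc": doc},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class Log:
    def __init__(self):
        self.entries = []  # (id, parent_id, doc) tuples, oldest first

    def head(self):
        return self.entries[-1][0] if self.entries else None

    def append(self, doc):
        parent = self.head()
        eid = entry_id(parent, doc)
        self.entries.append((eid, parent, doc))
        return eid

    def missing_since(self, known_head):
        """Entries a peer lacks, given the newest entry ID it already has."""
        if known_head is None:
            return list(self.entries)
        for i, (eid, _, _) in enumerate(self.entries):
            if eid == known_head:
                return self.entries[i + 1:]
        # Unknown head: the histories diverged, so fall back to sending everything.
        return list(self.entries)

    def receive(self, entries):
        """Append entries from a peer, verifying each one extends the local head."""
        for eid, parent, doc in entries:
            if parent != self.head() or entry_id(parent, doc) != eid:
                raise ValueError("entry does not extend local head")
            self.entries.append((eid, parent, doc))
```

Because each ID covers its parent, a receiver can verify order without trusting timestamps, but every entry is also tied to everything before it.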
Git isn't a great choice because the history depends on parent commits. A Merkle tree gossip protocol that works off a similar negotiation strategy would work well, though.