honking for fun and profit
It’s been a little while, so a few more notes about ActivityPub implementation, federation, and other odds and ends. There’s no real order to these notes, just things that have come up in the past two months.
honk is slowly approaching completion. The first release tarball of honk 0.1.0 was 127081 bytes. The latest release 0.6.0 is 166970 bytes.
First thing I did when I started honk was to look at a big M note. OK, let’s see, we’ve got some <a> tags, some <span> tags, some custom classes, alright, this is clearly HTML. Grabbed my HTML scrubber off the shelf to keep things from getting too wild, and that’s it, right? Ha!
Turns out big M is super selective about the tags it allows. No bold. No italics. No code. It never occurred to me that one might simply strip those tags away. Presumably the author used them to convey some meaning, no? honk didn’t let you compose HTML messages because I didn’t have a markdown parser I liked, but if somebody sends me something I think there’s some obligation not to screw up the presentation too bad.
Walked right into the middle of this debate shortly after plugging into the federation. I’m not sure how the endgame plays out, but currently honk allows posting with a limited set of markdown, bold and such, which I think are reasonably useful but also not too likely to cause confusion when mangled by lesser clients.
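To make the stripping behavior concrete, here is a minimal sketch of an allow-list scrubber that drops unknown tags but keeps their text. The allow list is a made-up example, not big M’s or honk’s real list, and a real scrubber would also need to check URL schemes on links.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow list for illustration; not any real server's list.
ALLOWED_TAGS = {"a", "span", "p", "br"}


class Scrubber(HTMLParser):
    """Drop tags outside ALLOWED_TAGS but keep the text inside them."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # the tag vanishes, its text still reaches handle_data
        # Keep only href, and only on links; every other attribute is dropped.
        kept = [(k, v) for k, v in attrs if tag == "a" and k == "href" and v]
        attr_text = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag != "br":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def scrub(html_text):
    scrubber = Scrubber()
    scrubber.feed(html_text)
    scrubber.close()
    return "".join(scrubber.out)
```

With this list, `scrub("<p>use <code>ls</code> <b>now</b></p>")` returns `<p>use ls now</p>`: the words survive, but whatever the author meant by the code and bold markup is gone.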
Fortunately, there are some possible workarounds as well, among them converting honks with more interesting HTML into Articles instead of Notes, which will then appear as links back to honk instead of posts on remote servers.
Dumping everything into the main feedtube started to become overwhelming. So there’s the usual list organization stuff. It works a bit differently than other software, but I think honk combos work pretty well. Just label the people you follow as you see fit.
The overall UI hasn’t changed, but a few things moved around, and element sizing gets the occasional adjustment. The color balance of avatars was redone for better contrast.
honk started by posting everything publicly. Once something is posted, you lose control of it, so the best thing is to never create the illusion otherwise.
However, not everyone approaches things the same way, and they make limited posts. honk has always allowed viewing them, but restricted interaction. This seems kinda unfair though. It should be possible to interact with people in a way that isn’t too surprising when it backfires, instead of self-inflicted muting. But this requires some special handling for followers only posts.
First, a technical note. ActivityPub does not formally define the followers only post. It was just kind of invented later, and it shows. Alice wants to post something like “I’m having a sad day”, which is kinda personal and negative, so she makes it followers only. Bob sees it, and replies “Sorry about your tough day.” Bob’s software is really smart and noticed that Alice’s post was followers only, so Bob’s reply is followers only as well, and thus Bob’s reply gets sent to... Bob’s followers. Who are not Alice’s followers. So now, while they can’t see Alice’s post, they can certainly make some inferences about what was posted based on the contents of Bob’s reply. I don’t know who decided this was the smart thing to do. I am inclined to say it’s not that smart.
I do not wish for honk to be a party to this madness, and so honk does not allow followers only posts. When replying to such, your reply is automatically downgraded (upgraded?) to a direct message. There’s no reason for honk to violate your correspondent’s confidence by revealing their post to your followers. And hopefully this limits your risk of accidental exposure at their end as well. But nothing is certain.
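The difference between the two reply policies can be sketched in a few lines. This is an illustration only, not honk’s code: the actor URLs are invented, and deriving a followers collection by appending `/followers` is an assumption (real actors advertise their own collection URL).

```python
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def is_public(post):
    """A post counts as public if the Public collection is in to or cc."""
    return PUBLIC in post.get("to", []) or PUBLIC in post.get("cc", [])


def followers_of(actor):
    # Assumption for the sketch: the collection lives at <actor>/followers.
    return actor + "/followers"


def public_audience(parent, replier):
    return {"to": [PUBLIC], "cc": [followers_of(replier), parent["attributedTo"]]}


def naive_reply_audience(parent, replier):
    """Copy 'followers only' onto the reply: it goes to the replier's followers."""
    if is_public(parent):
        return public_audience(parent, replier)
    return {"to": [followers_of(replier)], "cc": [parent["attributedTo"]]}


def downgraded_reply_audience(parent, replier):
    """Reply to a non-public post as a direct message to its author only."""
    if is_public(parent):
        return public_audience(parent, replier)
    return {"to": [parent["attributedTo"]], "cc": []}
```

For Alice’s followers only post, the naive version puts Bob’s followers collection in `to`, which is exactly the leak. The downgraded version addresses Alice and nobody else.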
You can now prefix a post with DZ: to mark it sensitive, allowing discussion of Archer without revealing spoilers. Or, while it’s not the designed purpose, as a setup for a why did the chicken cross the road joke. Anything really. Maybe even just use it as a subject line.
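A prefix like this is cheap to handle at compose time. The sketch below assumes the rest of the first line becomes the visible warning and everything after it is the hidden body. That split is my guess for illustration, and the function name is made up.

```python
def split_sensitive(text, prefix="DZ:"):
    """Return (summary, body, sensitive) for a composed post.

    Assumption: the remainder of the first line is the warning text,
    and the following lines are the body hidden behind it.
    """
    if not text.startswith(prefix):
        return None, text, False
    first_line, _, rest = text[len(prefix):].partition("\n")
    return first_line.strip(), rest.strip(), True
```

So `split_sensitive("DZ: Archer finale\nHe wakes up.")` gives `("Archer finale", "He wakes up.", True)`, and a post without the prefix comes back unchanged with the flag off.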
honk 0.1 didn’t support delete because it doesn’t really work and I think it’s very bad for software to lie to the user. Here’s the problem. You post something. It floats around in the ether. You delete it. Where does the delete go? Not nearly everywhere the initial honk went. Consider somebody reading honk via RSS, possibly even with rss2email. How is honk going to reach into their mailbox to delete a post?
Unfortunately, people really want deletion to work, and most perniciously, it often seems like it does. Post on this instance, login to an alt account on another, delete, observe disappearing status. Of course that works. The delete message does federate to the places you know about. The trouble is all the places you don’t know about. A ten second glance at my access logs reveals that there are a lot of people with copies of honks that I don’t know how to contact, and who will not receive deletion notifications.
Nevertheless, people want it to work, and I’m not interested in running an archive. Deletion is just a sped up version of expiration. No harder to actually delete something than to mark it deleted.
Alas immediately after implementing delete, I regretted it. Somebody posted an interesting link, but to a paywalled article, and perhaps regretting polluting the free federation with such, deleted their post. But I’d already seen it! I just hadn’t opened it. But now it’s gone, and I’m left with the knowledge that there’s something out there, it’s interesting, and I haven’t read it. So now I’m suffering from FOMO link anxiety and have to open everything as soon as I see it.
This leaves the question of what to do about content that’s not deleted. How long to keep it? I settled on 30 days as long enough to allow threads to run their course. And this way, when we inevitably miss out on delete notifications, we’ll still purge it eventually.
One’s own personal posts are retained since there’s no way to get them back if they are deleted. Still deciding to what degree they should be accessible.
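The retention rule amounts to one periodic query. Here is a minimal sketch using SQLite; the table and column names are invented for the example and are not honk’s schema.

```python
import sqlite3
import time

RETENTION_SECONDS = 30 * 24 * 60 * 60  # the 30 days mentioned above


def open_db():
    """Create a throwaway in-memory database with a hypothetical posts table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, "
        "received_at INTEGER, body TEXT)"
    )
    return db


def expire_old_posts(db, me, now=None):
    """Delete posts older than the retention window, except my own.

    Returns the number of rows removed.
    """
    now = int(time.time()) if now is None else now
    cutoff = now - RETENTION_SECONDS
    cur = db.execute(
        "DELETE FROM posts WHERE received_at < ? AND author != ?", (cutoff, me)
    )
    db.commit()
    return cur.rowcount
```

Run on a timer, this also cleans up anything whose delete notification never arrived, just 30 days late.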
There was recently a bit of a fuss about archiving sites and search engines, etc. By design, honk doesn’t reveal any content received from other instances, nor allow searching. I don’t want to be a vector for people to access information that isn’t mine.
What about content we never wanted to see in the first place? I haven’t had much trouble with objectionable content. There’s some social reasons for this which you can puzzle out for extra credit, but also some technical reasons. honk is designed to only show me what I’ve signed up to see.
So about 10% of the bad stuff was the result of following a link or thread, but 90% was the result of somebody posting a screenshot and saying, hey look at the horrible thing this horrible person said. Sometimes with a thrilling followup thread debating whether the term in question was really a slur or just slang. No thank you. I wish to unsubscribe from the horrible people digest.
Case in point: somebody from Twitter joined the Mastoverse, and it created a giant tizzy, like the sudden emergence of a fifth Beatle, except in a bad way. I would have been happily unaware of this except echoes of complaints, and complaints about the complaints, permeated even into honk space. Stan, if you’re reading this, you have to go back to Twitter because people who are followed by people I follow are posting about it and it’s filling up my feedtube.
I’m skeptical that there are purely technical solutions to social problems, but I think the way we design technology determines whether it magnifies or minimizes those problems. If there’s an unfiltered firehose pointed at your face, you’re going to get crap in your eyes. As tube builders, we get to choose what velocity and pressure we spray at the user.
There are lots of fun edge cases that come up when communicating between implementations. Some of these could be used to create posts that appear or federate in unusual ways.
You can get a feel for the fact that there have been waves of development, and migrations from ostatus, based on the way some programs build URLs and identifiers. honk generally does things pretty simply, exposing ActivityPub objects with minimal abstraction or redirection.
Big M treats the summary field as plain text, while other implementations concatenate it with content (and then apply HTML scrubbing). So you could put <!-- in the summary and conclude content with --> to create a post with text that’s only visible to big M.
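A toy model of the two rendering strategies shows why the trick works. Both renderers here are simplifications I made up for the demonstration. Neither is the real code of any implementation, and the "scrubbing" is reduced to extracting visible text.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes. Comments are dropped, since handle_comment is a no-op."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(html_text):
    parser = TextOnly()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


def render_separately(summary, content):
    """Summary shown as plain text, content parsed as HTML on its own."""
    return summary, visible_text(content)


def render_concatenated(summary, content):
    """Summary and content glued together, then parsed as one HTML document."""
    return visible_text(summary + content)
```

With a summary of `spoiler <!--` and content of `<p>secret</p> -->`, the separate renderer still shows the word "secret", while the concatenating renderer sees one big comment and shows only the summary text.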
When an origin server publishes an activity, it sends a Create to other servers. They then save that object in their database. When somebody shares that post, an Announce is sent out (from the sharer to their followers). If the recipients haven’t seen it yet, they will fetch the post from the origin. However, if the URL is broken, they can’t. And nobody fetches content once they have it. So you can play little games like posting a note to a particular instance which the users of that instance can interact with, but they won’t be able to send it off instance. Public, but not really.
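The caching behavior behind this game can be modeled with a tiny receiving server. This is a sketch of the flow as described above, with an invented `Inbox` class and a caller-supplied `fetch` function standing in for the HTTP request to the origin.

```python
class Inbox:
    """Toy receiving server: a local object store plus a fetch function."""

    def __init__(self, fetch):
        self.store = {}     # object id -> object
        self.fetch = fetch  # hypothetical callable: id -> object, or None on failure

    def receive_create(self, activity):
        """Save the delivered object as is; the id is never dereferenced."""
        obj = activity["object"]
        self.store[obj["id"]] = obj

    def receive_announce(self, activity):
        """Resolve a shared post: use the cached copy, else ask the origin."""
        obj_id = activity["object"]
        if obj_id in self.store:
            return self.store[obj_id]  # already have it, never refetched
        obj = self.fetch(obj_id)       # must come from the origin
        if obj is not None:
            self.store[obj_id] = obj
        return obj
```

If the note’s id points at a URL that does not resolve, a server that got the original Create can still display the share, while every other server comes up empty.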
Addressing and delivery is as much of a disaster as I thought. honk accidentally revealed a bug in Pleroma where I could make a post appear in Alice’s timeline, even if she didn’t follow me, by addressing it to Bob’s followers. I’m not sure where I got the idea to include someone’s followers on a reply, but it seemed like a pretty natural thing to do. Alas, it turns out people didn’t like it when I showed up in their feed uninvited. How sad. I’m sure there are many more exciting bugs to be found by constructing address lists not typically seen.
More generally, it seems people only test against what their own software produces, not what can be represented on the wire (which is anything). Evade a regex filter for bad words? <span>b</span><span>a</span><span>d</span>. Not exactly Gibson hacking skill level stuff.
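For the record, here is roughly what that evasion looks like against a filter that matches raw markup, next to one that matches the rendered text. The word list is a placeholder and both filters are sketches, not anyone’s production code.

```python
import re
from html.parser import HTMLParser

BAD_WORDS = re.compile(r"\bbad\b", re.IGNORECASE)  # placeholder word list


class TextOnly(HTMLParser):
    """Collect only the text a reader would actually see."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(html_text):
    parser = TextOnly()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


def naive_filter(html_text):
    """Match against the raw HTML, tags and all."""
    return bool(BAD_WORDS.search(html_text))


def text_filter(html_text):
    """Match against the text left after the tags are removed."""
    return bool(BAD_WORDS.search(visible_text(html_text)))
```

The span-per-letter post slips past `naive_filter` and is caught by `text_filter`. Matching on rendered text only closes this one hole; there are plenty of other ways to spell a word.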
honk uses the context field to reconstitute threads. Software that doesn’t fill in the context field results in a pile of posts from wherever getting dumped into the empty context thread.
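The failure mode is easy to reproduce with a plain group-by. In this sketch the posts are bare dicts and the fallback (treating a post with no context as its own one-post thread) is just one possible mitigation, not necessarily what honk does.

```python
from collections import defaultdict


def group_threads(posts):
    """Group post ids by their context field; a missing context becomes ''."""
    threads = defaultdict(list)
    for post in posts:
        threads[post.get("context", "")].append(post["id"])
    return dict(threads)


def group_threads_with_fallback(posts):
    """Same grouping, but a post without context is keyed by its own id."""
    threads = defaultdict(list)
    for post in posts:
        key = post.get("context") or post["id"]
        threads[key].append(post["id"])
    return dict(threads)
```

In the first version, two unrelated posts that lack a context end up in the same `""` thread. The fallback keeps them apart, at the cost of never joining them to the conversations they actually belong to.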
It’s hard to estimate the cost of running honk. It’s on a server that’s already been paid for, with various other things. I can say that the additional resource consumption is negligible.
I follow a few people who like to post images fairly frequently. Even so, the database is only about 700MB. Holds pretty steady there for 30 days of posts. Memory consumption is generally low. Looks like 23MB right now. It occasionally spikes when processing an image. So it should be more than possible to host honk on a very minimal VM. Or even host many honks.
honk started as an experiment in building a client for the network that I’d like to use. After using it for a bit over two months, I think it’s worked out pretty well. There’s been a slow accumulation of features and improvements, but the direction is about the same.
ActivityPub has its quirks, but it works well enough for honk. I can follow the people I’m interested in, the people interested in me can presumably do the same, honks get honked, honks get bonked. I found out about some people I didn’t know two months ago, I learned some new things. So I’m happy.
I’m also happy posting publicly. It seems unlikely someone will dig up an old link to an OpenBSD commit and use it against me. Other people may want to post more privately and have personal discussions, which is very reasonable, but I must say ActivityPub is not the greatest protocol for this. I think it’s kinda unfortunate that big M has established itself as a leader in the field when it’s a rather poor fit. I haven’t used them, but something like diaspora or hubzilla looks substantially better from a technical perspective.
Here’s a pile of links that I found interesting.
Some notes about the times before ActivityPub and leaking info: https://blog.soykaf.com/post/pleroma-encyclical-activity-pub/.
Some notes about implementing AP: https://blog.soykaf.com/post/activity-pub-in-pleroma/.
Some notes about problems in AP: https://schub.io/blog/2018/02/01/activitypub-one-protocol-to-rule-them-all.html.
AP is still good, despite its flaws. With a section on the magic heuristics used to determine if a post is a direct message or a group message, followers only posts, and more, oh boy! https://blog.dereferenced.org/activitypub-the-present-state-or-why-saving-the-worse-is-better-virus-is.
A post about dealing with abuse. https://nolanlawson.com/2018/08/31/mastodon-and-the-challenges-of-abuse-in-a-federated-system/.
Interview with Mike Macgirvin about building protocols. https://medium.com/we-distribute/got-zot-mike-macgirvin-45287601ff19.
kaniini has been posting a lot about problems with ActivityPub and info leakage and archiving recently, but I’m not going to link to 30 different honks.
And finally, how to read the AP spec. https://tinysubversions.com/notes/reading-activitypub/.