rss table manners
I provide an RSS feed for flak. I also wrote a simplistic RSS feed reader for myself. The design of the latter was influenced by observing the behavior of existing readers.
There’s a small wave of fetchers that appear every five and ten minutes, converging with larger waves every fifteen minutes. These coalesce with a tidal wave at the top of every hour. My log file shows a whole lot of quiet interspersed with feeding frenzies at regular intervals.
This isn’t a problem, per se, because the total number of feeders is low, and the feed itself is very lightweight. But it’s easy to imagine a more popular blog with more content requiring an outsize investment in capacity to handle such an uneven request distribution.
What can a reader do to avoid such rude behavior? Check feeds at irregular times. For me, this was implemented as a check deadline for each feed. Each time the feed is checked, the deadline is incremented by a random amount between two and four hours. (One to two would work great, too. I’ve fluctuated a bit.) This means that not only is my fetcher not synced with other fetchers, but it’s not possible for it to even accidentally fall into lock step.
If everyone did things this way, that’s all that would be needed. But in a world populated with lock step feeders, there’s one more wrinkle. The fetch process is initiated by cron every five minutes, but the very first thing it does is sleep a random amount between one and three minutes before checking for expired deadlines, ensuring that we never hit a server during a hot minute.
I do this mostly because being polite to servers is the right thing to do, but clients benefit from being nice too. Requests to an idle server are more likely to succeed and faster. If multiple clients are sharing a link (or proxy), they can suffer the same kinds of congestion that busy servers do.
One can imagine that RSS feeds are not the only problem domain which benefits by decoupling a regular activity from a fixed time.