
what is the go proxy even doing?

The go module proxy caches modules for users, so everyone has a consistent and reliable experience regardless of the upstream host. But what if the go proxy is contributing to the instability of the upstream host? There have been other complaints about the go proxy; that’s just the way it works, but I collected a few minutes of logs that may be interesting to examine.

Once the proxy becomes aware of a module, it downloads it repeatedly, presumably looking for new versions. I don’t know how useful this is, because the proxy will also download unknown revisions just in time. At least in my case, I always push the new tag, then update the consumer, so there’s basically no window where the proxy will find the new version before it’s requested.

As we’ll see, when I say “download”, I mean do a complete hg clone, from scratch. Every time. The go project used to be stored in mercurial, so it’s surprising that nobody there seems to know how much more efficient it is to just run hg pull. Especially if you’re not expecting to find changes. It can be days or even weeks between tags for stable repos. Most of my repos are fairly small, even with complete history, but there are definitely some repos out there with large artifacts checked in. There are gigabytes of prebuilt archives in the wasmtime-go repo. Relentlessly downloading that is going to cause issues.

To alleviate these issues, my server only allows the go proxy to complete one clone per repo per 24 hours. After that, it gets a 429.
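The throttle is nothing fancy. Something like this, as a minimal sketch; the names are invented and the real check also records the client address for the log.

package throttle

import (
	"log"
	"net/http"
	"sync"
	"time"
)

// lastClone records the last completed clone per repo name.
var lastClone sync.Map

// allowClone reports whether the proxy may clone this repo again yet,
// and sends the 429 if not. Sketch only, not the actual server code.
func allowClone(w http.ResponseWriter, repo, addr string) bool {
	now := time.Now()
	if prev, ok := lastClone.Load(repo); ok {
		next := prev.(time.Time).Add(24 * time.Hour)
		if now.Before(next) {
			log.Printf("goog coming too fast: %s %s next: %s", repo, addr, next.Format("15:04:05"))
			w.WriteHeader(http.StatusTooManyRequests)
			return false
		}
	}
	lastClone.Store(repo, now)
	return true
}

// resetClone is called on push, so the next clone goes through immediately.
func resetClone(repo string) {
	lastClone.Delete(repo)
}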

Enough chatter. Let’s look at some logs. These have been lightly edited to omit extraneous detail.

Here’s the background radiation. This is the steady state of the proxy trying to clone a repo and getting rejected.

2025/05/19 20:00:07 goog coming too fast: gruss 172.253.217.54 next: 15:28:17
2025/05/19 20:00:25 goog coming too fast: gozstd 172.253.245.133 next: 15:26:58
2025/05/19 20:01:37 goog coming too fast: azorius 74.125.76.138 next: 15:30:08
2025/05/19 20:03:48 goog coming too fast: go-sqlite3 173.194.96.183 next: 15:32:38
2025/05/19 20:06:39 goog coming too fast: fungo 172.253.251.186 next: 15:36:49
2025/05/19 20:09:45 goog coming too fast: go-sqlite3 172.253.245.131 next: 15:32:38
2025/05/19 20:11:45 goog coming too fast: honk 172.253.192.247 next: 16:31:45
2025/05/19 20:14:04 goog coming too fast: gerc 172.253.245.69 next: 15:44:14
2025/05/19 20:15:08 goog coming too fast: glfw3 172.253.245.142 next: 15:44:17
2025/05/19 20:15:34 goog coming too fast: webs 74.125.17.213 next: 15:24:38
2025/05/19 20:18:49 goog coming too fast: webs 173.194.98.206 next: 15:24:38
2025/05/19 20:20:47 goog coming too fast: gozstd 173.194.96.131 next: 15:26:58
2025/05/19 20:23:58 goog coming too fast: termvc 172.253.217.63 next: 15:29:58
2025/05/19 20:24:37 goog coming too fast: gruss 172.253.217.56 next: 15:28:17
2025/05/19 20:25:15 goog coming too fast: gozstd 172.253.245.133 next: 15:26:58
2025/05/19 20:25:24 goog coming too fast: anticrawl 172.253.214.52 next: 15:27:58
2025/05/19 20:26:28 goog coming too fast: azorius 173.194.96.132 next: 15:30:08
2025/05/19 20:28:42 goog coming too fast: go-sqlite3 172.253.214.52 next: 15:32:38
2025/05/19 20:31:38 goog coming too fast: fungo 172.253.192.116 next: 15:36:49
2025/05/19 20:33:54 goog coming too fast: go-sqlite3 172.253.251.179 next: 15:32:38
2025/05/19 20:36:15 goog coming too fast: honk 74.125.179.118 next: 16:31:45
2025/05/19 20:38:44 goog coming too fast: gerc 172.253.7.125 next: 15:44:14
2025/05/19 20:39:54 goog coming too fast: webs 173.194.96.182 next: 15:24:38
2025/05/19 20:39:57 goog coming too fast: glfw3 173.194.96.135 next: 15:44:17
2025/05/19 20:45:27 goog coming too fast: gozstd 172.253.192.125 next: 15:26:58
2025/05/19 20:48:47 goog coming too fast: termvc 172.253.214.43 next: 15:29:58
2025/05/19 20:48:48 goog coming too fast: gruss 74.125.187.228 next: 15:28:17
2025/05/19 20:49:48 goog coming too fast: anticrawl 172.253.217.51 next: 15:27:58
2025/05/19 20:49:55 goog coming too fast: gozstd 74.125.187.235 next: 15:26:58
2025/05/19 20:51:08 goog coming too fast: azorius 172.253.214.40 next: 15:30:08
2025/05/19 20:53:28 goog coming too fast: go-sqlite3 74.125.17.218 next: 15:32:38

Boink. Boink. Boink.

Now I’m going to push a change. This resets the counter, so I will be able to update immediately.

2025/05/19 20:54:10 login for tedu
2025/05/19 20:54:11 push to webs

A few seconds later we have a new pull request. This is the result of me running go mod tidy, which triggers a request that works its way through the big cloud.
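Roughly, the chain goes something like this. The module path here is invented for the example, and exactly which endpoints get hit depends on what the go command already knows about.

go mod tidy
  -> GET https://proxy.golang.org/example.org/r/webs/@latest
     (or /@v/list, or /@v/<version>.info for a version it already knows about)
  -> on a cache miss, the proxy finds the origin via the go-import meta tag
     at https://example.org/r/webs?go-get=1
  -> then it fetches the repo itself, which is what shows up in my logs
  -> the .info, .mod, and .zip for the new version are served from the proxy
     cache from then on

Here’s what arrives on my end.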

2025/05/19 20:54:19 request for webs: 'capabilities' '' 0
2025/05/19 20:54:19 request for webs: 'listkeys' 'namespace=bookmarks' 1
2025/05/19 20:54:19 request for webs: 'batch' 'cmds=heads+%3Bknown+nodes%3D' 1
2025/05/19 20:54:19 request for webs: 'getbundle' 'common=0000000000000000000000000000000000000000&heads=204f8eab5c2a421f574f6b1f25e50e5ebea7aa8c' 2
2025/05/19 20:54:19 pulling from webs

For those who don’t read mercurial, that’s a clone request from scratch: give me everything you have, starting from the beginning. As noted, a pull would be more efficient, but it’s not so bad. Yet.
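The tell is the common parameter. Forty zeros is mercurial’s null revision, so the client is saying it has nothing in common with me and the server has to bundle up the entire history; an incremental pull would list changesets the client already has. Spotting the difference is simple enough. A sketch, not what my server actually does, and the real wire protocol has a few more ways to pass the arguments.

package hgwire

import (
	"net/url"
	"strings"
)

// nullid is mercurial's "no revision" node: forty zeros.
const nullid = "0000000000000000000000000000000000000000"

// isFullClone reports whether a getbundle request asks for everything from
// scratch, i.e. the client claims no common changesets. Sketch only.
func isFullClone(rawQuery string) bool {
	args, err := url.ParseQuery(rawQuery)
	if err != nil {
		return false
	}
	common := args.Get("common")
	if common == "" {
		return false
	}
	for _, node := range strings.Fields(common) {
		if node != nullid {
			return false
		}
	}
	return true
}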

And now here comes the flood.

2025/05/19 20:54:33 request for webs: 'capabilities' '' 0
2025/05/19 20:54:33 request for webs: 'capabilities' '' 0
2025/05/19 20:54:33 request for webs: 'capabilities' '' 0
2025/05/19 20:54:33 request for webs: 'capabilities' '' 0
2025/05/19 20:54:33 request for webs: 'listkeys' 'namespace=bookmarks' 1
2025/05/19 20:54:33 request for webs: 'capabilities' '' 0
2025/05/19 20:54:33 request for webs: 'listkeys' 'namespace=bookmarks' 1
2025/05/19 20:54:33 request for webs: 'batch' 'cmds=heads+%3Bknown+nodes%3D' 1
2025/05/19 20:54:33 request for webs: 'listkeys' 'namespace=bookmarks' 1
2025/05/19 20:54:33 request for webs: 'listkeys' 'namespace=bookmarks' 1
2025/05/19 20:54:33 request for webs: 'batch' 'cmds=heads+%3Bknown+nodes%3D' 1

That’s five clone requests, executed concurrently from different machines. It took fourteen seconds for the single node to notice there was something new and sound the alarm, and now the thundering herd is here for real.

Did the intern who built this not make it to their distributed systems class yet? The one node wakes up and says “there’s something new” but instead of saying “here it is; I already have it” it tells everyone “get your own copy”. Why would you build a system like this? I know Google data centers are well connected, but surely they have even better cross connects internally? How can it be faster for every machine to hammer the upstream, all at the same time?
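This is the textbook case for request coalescing. In Go the usual tool is golang.org/x/sync/singleflight: the first caller to ask for a key does the work, and everyone else asking while it’s in flight just waits and shares the result. Purely hypothetical, obviously not the proxy’s actual code.

package proxy

import (
	"golang.org/x/sync/singleflight"
)

var group singleflight.Group

// fetchModule goes to the origin at most once per module version, no matter
// how many frontends ask at the same moment; the rest block and share the
// one result. Hypothetical sketch.
func fetchModule(path, version string) ([]byte, error) {
	v, err, _ := group.Do(path+"@"+version, func() (interface{}, error) {
		return cloneFromOrigin(path, version) // one upstream fetch for everybody
	})
	if err != nil {
		return nil, err
	}
	return v.([]byte), nil
}

// cloneFromOrigin stands in for the expensive clone from the upstream host.
func cloneFromOrigin(path, version string) ([]byte, error) {
	// talk to the origin here
	return nil, nil
}

singleflight only dedupes within one process; across machines you’d publish the result into the shared storage they evidently already have, and every other node would read it from there.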

The flood continues for some time. Here are just the requests blocked with a 429, not the lightweight capabilities, listkeys, etc. requests.

2025/05/19 20:54:33 goog coming too fast: webs 74.125.184.187 next: 20:54:19
2025/05/19 20:54:33 goog coming too fast: webs 74.125.184.190 next: 20:54:19
2025/05/19 20:54:33 goog coming too fast: webs 172.253.9.165 next: 20:54:19
2025/05/19 20:54:33 goog coming too fast: webs 173.194.92.166 next: 20:54:19
2025/05/19 20:54:33 goog coming too fast: webs 74.125.115.19 next: 20:54:19
2025/05/19 20:54:34 goog coming too fast: webs 74.125.184.186 next: 20:54:19
2025/05/19 20:54:35 goog coming too fast: webs 172.253.214.54 next: 20:54:19
2025/05/19 20:54:35 goog coming too fast: webs 173.194.96.141 next: 20:54:19
2025/05/19 20:54:35 goog coming too fast: webs 173.194.92.169 next: 20:54:19
2025/05/19 20:54:36 goog coming too fast: webs 173.194.98.197 next: 20:54:19
2025/05/19 20:54:38 goog coming too fast: webs 74.125.179.127 next: 20:54:19
2025/05/19 20:54:38 goog coming too fast: webs 74.125.179.114 next: 20:54:19
2025/05/19 20:54:38 goog coming too fast: webs 74.125.179.114 next: 20:54:19
2025/05/19 20:54:41 goog coming too fast: webs 172.253.245.69 next: 20:54:19
2025/05/19 20:54:41 goog coming too fast: webs 74.125.115.18 next: 20:54:19
2025/05/19 20:54:42 goog coming too fast: webs 172.253.195.225 next: 20:54:19
2025/05/19 20:54:42 goog coming too fast: webs 172.253.192.125 next: 20:54:19
2025/05/19 20:54:44 goog coming too fast: webs 74.125.76.106 next: 20:54:19
2025/05/19 20:54:44 goog coming too fast: webs 74.125.179.110 next: 20:54:19
2025/05/19 20:54:45 goog coming too fast: webs 172.253.192.116 next: 20:54:19
2025/05/19 20:54:45 goog coming too fast: webs 74.125.76.107 next: 20:54:19
2025/05/19 20:54:47 goog coming too fast: webs 74.125.115.25 next: 20:54:19

And so it goes. Eventually the herd will satiate itself with 429s, and return to the steady state of “only” checking every few minutes.

Something to note is that even though I only provide one copy of the new tag to the proxy, it quickly percolates through the cloud. I switch to a different laptop, run go build, and immediately download it. I switch to my server, and also successfully pull the update through the proxy. So the proxy is absolutely capable of sharing updates internally. All these machines which independently came crawling for the new tag? It was available to them the whole time. But apparently Google thinks it’s cheaper to download from my server than from their own cloud storage.

I don’t have so many repos that the proxy’s behavior was ever a real problem, but it’s still annoying on principle. The 429 throttle takes care of it, but even so, what a weird design for a distributed system.

Posted 14 Aug 2025 17:22 by tedu Updated: 14 Aug 2025 17:22
Tagged: go web