Debian Mirrors Hierarchy
After finding that AlmaLinux's mirror sync capacity at Tier 0 (or Tier 1, depending on how you look at it) is around 140 Gbps, I wanted to find the source and hierarchy of Debian's mirroring system.
There are two main types of mirrors in Debian: package mirrors (for package installs and updates), also known as the package archive, and Debian CD mirrors (for ISOs and other media). Let’s talk about package mirrors and their hierarchy first.
Package Mirrors Hierarchy
The trace file is a good starting point for finding the upstream of a Debian package mirror. It resides at <URL>/debian/project/trace/_traces and shows the flow of data; see, for example, the trace file from jing.rocks’s mirror. It shows that the canonical source for packages is ftp-master.debian.org. Checking https://db.debian.org/machines.cgi shows that this is fasolo.d.o, hosted at Brown University, US. It serves as the “Master Archive Server”, making it the Tier 0 mirror. Its entry mentions 1 Gbps shared LAN connectivity (dated information?), but it only has to push to 3 other machines/sites.
Side note - .d.o is .debian.org
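To make the trace file idea concrete, here is a minimal sketch of reading one. It assumes the file is made of "Key: value" lines after a plain timestamp line. The field names in the sample (Running on host, Upstream-mirror), the hostname mirror.example.org and the helper names trace_url and parse_trace are my own illustrative choices, so check them against a real trace file before relying on them.

```python
from typing import Dict


def trace_url(mirror_base: str, name: str = "_traces") -> str:
    """Build the URL of a file inside a mirror's trace directory."""
    return f"{mirror_base.rstrip('/')}/debian/project/trace/{name}"


def parse_trace(text: str) -> Dict[str, str]:
    """Collect the 'Key: value' lines of a trace file into a dict.

    Lines without a ': ' separator (such as a leading plain
    timestamp) are skipped.
    """
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


# Hypothetical sample, not copied from any real mirror.
SAMPLE = """\
Sat Jan  4 10:12:13 UTC 2025
Date: Sat, 04 Jan 2025 10:12:13 +0000
Running on host: mirror.example.org
Upstream-mirror: syncproxy2.eu.debian.org
"""

fields = parse_trace(SAMPLE)
print(trace_url("https://mirror.example.org/"))
print(fields.get("Upstream-mirror"))
```

Following the upstream field from one mirror to the next is how the flow of data can be traced back towards ftp-master.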
As shown on https://mirror-master.debian.org/status/mirror-hierarchy.html, the three sites are:
- syncproxy2.eu.debian.org i.e. smit.d.o hosted by University of Twente, Netherlands with 2x10 Gbps connectivity.
- syncproxy4.eu.debian.org i.e. schmelzer.d.o hosted by Conova in Austria with 2x10 Gbps connectivity.
- syncproxy2.wna.debian.org - the https://db.debian.org/machines.cgi entry says it is hosted at UBC, but the IP currently points to the OSUOSL IP range. IIRC, a few months ago syncproxy2.wna.d.o was made to point to another host due to some issue (?). mirror-osuosl.d.o seems to be serving as syncproxy2.wna.d.o now. Bandwidth isn’t explicitly mentioned, but judging by the bandwidth other free software projects hosted at OSUOSL have, it would be at least 10 Gbps, and maybe more for Debian.
                       syncproxy2.eu.d.o (NL) ---> to the world
                      /
ftp-master.d.o (US) -- syncproxy4.eu.d.o (AT) --> to the world
                      \
                       syncproxy2.wna.d.o (US) --> to the world
These form the Debian Tier 1 mirror network, as all other mirrors sync from them. So we can say Debian has at least 50 Gbps of capacity at Tier 1. A normal Debian user might never directly interact with any of these 3 machines, but every Debian package they download, install or run flows through them. I’m unsure what wna stands for in syncproxy2.wna.d.o. NA is probably North America and W is west (coast)? If you know, do let me know.
After Tier 1, there are a few more sync proxies (detailed below). There are at least 45 mirrors at Tier 2, to which updates are pushed directly from the three Tier 1 sync proxies. Most country mirrors, i.e. ftp.<CC>.debian.org, are at Tier 2 too (barring a few like ftp.au.d.o, ftp.nz.d.o etc).
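The Tier 1 figure is just the per-site links added up. A small sketch of that arithmetic follows; the 10 Gbps for syncproxy2.wna.d.o is my assumed lower bound rather than a published number, and TIER1_LINKS and total_gbps are names made up for this example.

```python
# Tier 1 sync proxies as (number of links, Gbps per link).
TIER1_LINKS = {
    "syncproxy2.eu.debian.org": (2, 10),
    "syncproxy4.eu.debian.org": (2, 10),
    # Assumed lower bound; the real bandwidth is not published.
    "syncproxy2.wna.debian.org": (1, 10),
}


def total_gbps(links):
    """Sum links * speed over all sites."""
    return sum(count * speed for count, speed in links.values())


print(total_gbps(TIER1_LINKS))  # 50
```

Since the last entry is only a floor, the total is a floor as well.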
Coming back to Sync proxies at Tier 2:
- syncproxy3.wna.debian.org - gretchaninov.d.o, which is marked as syncproxy2 on db.d.o (dated information). It’s hosted at University of British Columbia, Canada, where a lot of Debian infrastructure including Salsa is hosted.
- syncproxy.eu.debian.org - a machine managed by the Croatian Academic and Research Network. CNAME directs to debian.carnet.hr.
- syncproxy.au.debian.org - mirror-anu.d.o hosted by Australian National University with 100 Mbps connectivity. Closest sync proxy for all Australian mirrors.
- syncproxy4.wna.debian.org - syncproxy-aws-wna-01.d.o hosted in AWS, in the US (according to GeoIP). IPv6 only (CNAME to syncproxy-aws-wna-01.debian.org. which only has an AAAA record, no A record). It’s an m6g.2xlarge instance, which has speeds up to 10 Gbps.
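The "IPv6 only" observation for syncproxy4.wna.d.o can be checked with a short script. This is a sketch using Python's standard socket.getaddrinfo; the function names address_families and is_ipv6_only are made up for this example, and the answer depends on what your local resolver returns.

```python
import socket
from typing import Callable, Dict, List


def address_families(
    host: str, resolver: Callable = socket.getaddrinfo
) -> Dict[str, List[str]]:
    """Group the addresses a hostname resolves to by IP version."""
    found = {"ipv4": set(), "ipv6": set()}
    try:
        infos = resolver(host, None)
    except socket.gaierror:
        # Name does not resolve at all.
        return {"ipv4": [], "ipv6": []}
    for family, _type, _proto, _canon, sockaddr in infos:
        if family == socket.AF_INET:
            found["ipv4"].add(sockaddr[0])
        elif family == socket.AF_INET6:
            found["ipv6"].add(sockaddr[0])
    return {name: sorted(addrs) for name, addrs in found.items()}


def is_ipv6_only(host: str, resolver: Callable = socket.getaddrinfo) -> bool:
    """True when the host has IPv6 addresses and no IPv4 ones."""
    addrs = address_families(host, resolver)
    return bool(addrs["ipv6"]) and not addrs["ipv4"]
```

Calling is_ipv6_only("syncproxy4.wna.debian.org") should return True if the host still has only an AAAA record. The resolver parameter exists only so the function can be exercised without network access.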
Coming back to https://mirror-master.debian.org/status/mirror-hierarchy.html, one can see the chain extend to Tier 6, as in the case of this mirror in AU, which should add some latency between updates being pushed at ftp-master.d.o and reaching it. Ideally, that shouldn’t be a problem, as https://www.debian.org/mirror/ftpmirror#when mentions that “The main archive gets updated four times a day”.
In my case, I get my updates from the NITC mirror, so my updates flow from US > US > TW > IN > me in IN.
CDNs have to internally manage cache purging too, unlike normal mirrors which directly serve static files. Both deb.debian.org (sponsored by Fastly) and cdn-aws.deb.debian.org (sponsored by Amazon CloudFront) sync from the following CDN backends:
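A chain like this can be modelled as a map from each mirror to its upstream, and the tier is then the number of hops back to ftp-master. The sketch below does that; the TW and IN hostnames are placeholders, and the choice of syncproxy2.wna.d.o as the second US hop is my assumption for illustration.

```python
from typing import Dict, List, Optional

# Each mirror maps to the host it syncs from; None marks the master.
UPSTREAM: Dict[str, Optional[str]] = {
    "ftp-master.debian.org": None,
    "syncproxy2.wna.debian.org": "ftp-master.debian.org",  # assumed hop
    "mirror.tw.example": "syncproxy2.wna.debian.org",      # placeholder
    "mirror.in.example": "mirror.tw.example",              # placeholder
}


def sync_chain(mirror: str, upstream: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from the master down to the given mirror."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = mirror
    while current is not None:
        if current in seen:
            raise ValueError(f"loop in upstream map at {current}")
        seen.add(current)
        chain.append(current)
        current = upstream[current]
    chain.reverse()
    return chain


def tier(mirror: str, upstream: Dict[str, Optional[str]]) -> int:
    """Tier number, with the master at Tier 0."""
    return len(sync_chain(mirror, upstream)) - 1


print(" > ".join(sync_chain("mirror.in.example", UPSTREAM)))
print(tier("mirror.in.example", UPSTREAM))  # 3
```

With a deeper map, the same function would report the Tier 6 mirrors mentioned above.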
- mirror.accumu.d.o hosted by Academic Computer Club in Umeå, Sweden.
- mirror-skroutz.d.o hosted by Skroutz Internet Services in Greece.
- schmelzer.d.o hosted by Conova in Austria.
See deb.d.o trace file and cdn-aws.deb.d.o trace file.
(Thanks to Philipp Kern for the heads up here. The Mastodon thread has a bunch of other interesting details.)
Update - https://salsa.debian.org/dsa-team/mirror/cdn-fastly/-/blob/master/services/archive.yaml?ref_type=heads shows Fastly CDN backends.
CD Image Mirrors Hierarchy
Till now, I have only talked about Debian package mirrors. When you see a /debian directory on various mirrors, it’s usually for package installation and updates. If you want to grab the latest (and greatest) Debian ISO, you go to a Debian CD (as they’re still called) mirror site.
casulana.d.o is mentioned as the CD builder site hosted by Bytemark, while pettersson-ng.d.o is mentioned as the CD publishing server hosted at Academic Computer Club in Umeå, Sweden. The primary download site for Debian CD when you click download on the debian.org homepage, https://cdimage.debian.org/debian-cd/, is hosted here as well. This essentially becomes the Tier 0 mirror for Debian CD. All Debian CD mirrors are downstream of it.
pettersson-ng.d.o / cdimage.d.o (SE) ---> to the world
Academic Computer Club’s mirror setup uses a combination of multiple machines (called frontends and offloading servers) to load balance requests. Their setup documentation is a highly recommended read. In that document, they also mention, “All machines are reachable via both IPv4 and IPv6 and connected with 10 or 25 gigabit Ethernet, external bandwidth available is 200 gigabit/s.”
For completeness’ sake, the following mirrors (or mirror systems) exist for Debian too:
- Debian Ports mirrors.
- Debian Archive mirrors to get old Debian versions.
- Debian Security has a bunch of official mirrors (as mentioned here) behind security.d.o. It resolves to Fastly IP ranges, so it could be Fastly or Debian operated mirrors. Taking a look at https://db.debian.org/machines.cgi shows that seger.d.o in DE is security-master, which I’m assuming is the source for all of the following security mirrors:
  - lobos.d.o in DE
  - villa.d.o in DE
  - wieck.d.o in DE
  - schumann.d.o in DE
Update - https://salsa.debian.org/dsa-team/mirror/cdn-fastly/-/blob/master/services/security.yaml?ref_type=heads#L14 shows the list of Fastly CDN backends.
Debian heavily relies on various organizations to donate resources (hosting and hardware) to distribute and update Debian. Compiling the above information made me thankful to all these organizations. Many thanks to DSA and the mirror team for managing these.
I relied heavily on https://db.debian.org/machines.cgi, which seems to be manually updated, so things might have changed along the way. If anything looks amiss, feel free to ping me.