4

Back when I used Linux there was a program that would find you the fastest Debian package repository based on your location. Is there something similar for CTAN? I've been manually picking mirrors that I think are physically close to me and judging them by ping time, but that isn't very reliable. I'm asking because I'm on a 100+ Mbit/s download connection, yet it still takes hours to install LaTeX and a very long time to update packages.
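
For reference, a chosen mirror can be made the default repository with tlmgr; the URL below is just a placeholder, and systems/texlive/tlnet is the standard TeX Live path on CTAN mirrors:

tlmgr option repository http://mirror.example.org/ctan/systems/texlive/tlnet

What I'm after is a good way to pick that URL automatically.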

Canageek
  • 17,935
  • I've got a much slower Internet connection and the full TL installs within 45 minutes. – Henri Menke Jan 07 '16 at 20:23
  • @HenriMenke Wouldn't the fact that it is on a different continent from me slow things down some? – Canageek Jan 07 '16 at 20:28
  • @Canageek You can at least see which mirrors are up here. – Henri Menke Jan 07 '16 at 20:41
  • @HenriMenke TeX Live Manager will give me a list of mirrors, but I can only guess where they are physically located (which worked fine when I lived down the road from a uni hosting one of them, but now that I'm 4500 km away I get much slower rates from that mirror) – Canageek Jan 07 '16 at 20:44
  • Why do you specify one at all? Did you have problems letting it pick automatically? – cfr Jan 07 '16 at 22:07
  • @cfr Because when I was near a university and manually specified a mirror it would go from 2 hours to download a new TeX Live install to under half an hour. Why would I use some random mirror located half a planet away? – Canageek Jan 08 '16 at 17:33
  • Seems strange it would pick one far away. That doesn't seem right. But it makes sense you'd want to override it in that case. Software doesn't always pick sensibly, even when it is intended to do so. (This used to happen a lot when downloading from SourceForge. Maybe still does - I hardly ever use the site now. It would always make some stupid choice of mirror for no obvious reason.) – cfr Jan 08 '16 at 21:45
  • @cfr Is it supposed to pick a close one? It could be I was just close to a slow one at the time. It is odd, though: I was a 40-minute drive from UWaterloo at the time; it's hard to think another would be closer. – Canageek Jan 08 '16 at 23:58
  • I think so... At least, I'd assume so since that's how these things generally work. Not only for the sake of the end-user, but also for the sake of the servers. If the transfer rate is higher, the connection to the server is needed for a smaller amount of time etc. – cfr Jan 09 '16 at 00:11
  • @cfr Revisited this looking at questions I had bookmarked and I think I found the answer: I think it assumes that all servers in Canada are close, which... Canada is big, doesn't have very many mirrors, and Dalhousie (Halifax) is both REALLY far away from Vancouver and seems to run its servers with a hamster. (It took 53 seconds to get the file list from them, and came out 9th slowest overall. University of Washington took 1.5 seconds.) So yeah, manually specifying can reduce the download time by up to 35x. – Canageek Nov 26 '20 at 00:05
  • @Canageek That seems dumb. It works OK here, but the UK isn't big, so no server here is that far (and there aren't that many servers). You could raise it with the CTAN team. Maybe they can suggest something, even if they can't fix the issue generally. – cfr Nov 26 '20 at 04:50
  • @cfr Most Linux distributions also have that problem, and I have no idea how to reach out and am nervous about doing it without a suggested solution. – Canageek Nov 26 '20 at 21:14
  • @Canageek I use a script called reflector which you run to find the fastest mirrors for Arch Linux, so you could look at how that works, maybe? – cfr Dec 02 '20 at 03:52

2 Answers

3

Assuming you're on Linux, you can use this command to find the fastest mirrors:

# Scrape the mirror URLs from the CTAN mirror monitor page, let netselect
# pick the 20 best by ping, strip netselect's score column, then time an
# actual download of each mirror's file list and sort the results.
netselect -vv -t40 -s20 $(\
  curl -sSL http://dante.ctan.org/mirmon/ | \
  grep -oE '<TD ALIGN=RIGHT><A HREF="[^"]*' | \
  cut -d\" -f2 \
) | \
cut -c7- | \
LC_ALL=C xargs -tn1 -i curl -sSL -w "%{time_total} %{speed_download} {}\n" -o /dev/null {}FILES.byname | \
sort -n

This will give you the 20 fastest mirrors, ranked by the actually measured download time of the file list (FILES.byname), fastest on top, and for each one it also reports the average download speed.
It's also fairly verbose and will tell you what it does along the way. You can reduce the verbosity by removing the -t flag from the xargs command and by removing one or both -v flags from the netselect command. Adjust the number of servers to speed-test with the -s20 parameter.
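
Each output line follows the -w format string above: total transfer time in seconds, average download speed in bytes per second, and the mirror URL. The values and hostname below are purely illustrative:

0.812 1892411.000 http://mirror.example.org/pub/ctan/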

Below is a version suited to shell scripts: it prints only the URL of the fastest mirror and nothing else.

# Same pipeline, but quiet: keep only the single fastest mirror's URL.
netselect -t40 -s20 $(\
  curl -sSL http://dante.ctan.org/mirmon/ | \
  grep -oE '<TD ALIGN=RIGHT><A HREF="[^"]*' | \
  cut -d\" -f2 \
) | \
cut -c7- | \
LC_ALL=C xargs -n1 -i curl -sSL -w "%{time_total} {}\n" -o /dev/null {}FILES.byname | \
sort -n | \
head -n1 | \
cut -d\  -f2
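
For example, you could capture the result and make it your default TeX Live repository. This is only a sketch: fastest-ctan.sh is a made-up name for the script above, and it assumes the standard systems/texlive/tlnet layout on CTAN mirrors (the mirror URLs from the monitor page end in a slash):

#!/bin/bash
# Find the fastest mirror and point tlmgr at its TeX Live package tree.
MIRROR=$(./fastest-ctan.sh)
tlmgr option repository "${MIRROR}systems/texlive/tlnet"
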
3

netselect is no longer available in Ubuntu, so I've included a more general (if much slower) version of this answer that does not rely on it:

curl -sSL http://dante.ctan.org/mirmon/ | \
grep -oE '<TD ALIGN=RIGHT><A HREF="[^"]*' | \
cut -d\" -f2 | \
LC_ALL=C xargs -n1 -i curl -sSL -w "%{time_total} %{speed_download} {}\n" -o /dev/null {}FILES.byname | \
sort -n
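
If some mirrors hang, you can cap the time spent on each one with curl's -m flag (maximum time per transfer, in seconds); the 10 below is an arbitrary choice:

curl -sSL http://dante.ctan.org/mirmon/ | \
grep -oE '<TD ALIGN=RIGHT><A HREF="[^"]*' | \
cut -d\" -f2 | \
LC_ALL=C xargs -n1 -i curl -sSL -m 10 -w "%{time_total} %{speed_download} {}\n" -o /dev/null {}FILES.byname | \
sort -n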

This wasn't working for me today for some reason, so I used:

#!/bin/bash
# With no argument: ping every CTAN mirror once, in parallel, and print
# "latency: mirror" lines. Pipe the output to sort -n to rank the mirrors.
if [ -z "$1" ]
then
    for mirror in $(curl -sSL http://dante.ctan.org/mirmon/ | grep -oE '<TD ALIGN=RIGHT><A HREF="[^"]*' | cut -d\" -f2)
    do
        (
            # Reduce the URL to the bare hostname for ping.
            host=$(echo "$mirror" | sed 's,.*//,,' | sed 's,/.*,,')
            echo -e "$(ping -c1 "$host" | grep time= | sed 's,.*time=,,'):  \t\t$mirror"
        ) &
    done
    wait
    exit 0
fi

and then piped the output to sort -n to rank the mirrors by latency.
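
For instance, if the script above is saved as ping-mirrors.sh (a made-up name), the five lowest-latency mirrors can be listed with:

./ping-mirrors.sh | sort -n | head -n5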

Canageek
  • 17,935
  • Choose Arch :). AUR still has it. Don't know if it still works, mind: last updated 25th Sep 2018! – cfr Nov 26 '20 at 04:53
  • Well, the second one won't work on my machine since I am behind a proxy, so ping outside the LAN does not work. I am able to use curl with the --proxy option, so curl --proxy 'http.proxy.firm.de:1234' -sSL http://dante.ctan.org/mirmon/ works. The first command, however, is very slow. Any chance to modify the second command to not use ping? Or somehow pass the proxy as an argument? – winkmal Dec 19 '22 at 09:35
  • With the -m flag for max. timeout, I used the first command and it worked. So with the two additional arguments: ... curl -x 'http.proxy.firm.de:1234' -m 11 -sSL -w "%{time_total} %{speed_download} {}\n" ... – winkmal Dec 20 '22 at 08:21