Re: Are gzip-compressed substitutes still used?
From: Ludovic Courtès
Subject: Re: Are gzip-compressed substitutes still used?
Date: Wed, 17 Mar 2021 18:12:05 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
Hi,
Ludovic Courtès <ludo@gnu.org> wrote:
> From that, we could deduce that about 1% of our users who take
> substitutes from ci.guix are still using a pre-1.1.0 daemon without
> support for lzip compression.
>
> I find it surprisingly low: 1.1.0 was released “only” 9 months ago,
> which is not a lot for someone used to the long release cycles of
> “stable” distros.
(See
<https://lists.gnu.org/archive/html/guix-devel/2021-01/msg00378.html>
for the initial message.)
Here’s an update, 1.5 months later. This time I’m looking at nginx logs
covering Feb 8th to Mar 17th and using a laxer regexp than in the
message above. Here are the gzip/lzip download ratios for several
packages:
--8<---------------cut here---------------start------------->8---
ludo@berlin ~$ ./nar-download-stats.sh /tmp/sample3.log
gtk%2B-3: gzip/lzip ratio: 37/3255 1%
glib-2: gzip/lzip ratio: 97/8629 1%
coreutils-8: gzip/lzip ratio: 81/2306 3%
python-3: gzip/lzip ratio: 120/7177 1%
r-minimal-[34]: gzip/lzip ratio: 8/302 2%
openmpi-4: gzip/lzip ratio: 19/236 8%
hwloc-2: gzip/lzip ratio: 10/43 23%
gfortran-7: gzip/lzip ratio: 6/225 2%
--8<---------------cut here---------------end--------------->8---
(Script attached.)
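For reference, the percentage column is the gzip count relative to the lzip count (not to the total), computed with integer arithmetic, so it is rounded down. A minimal sketch of that computation follows; the `ratio_percent` helper and its zero-count guard are hypothetical and not part of the attached script:

```shell
#!/bin/sh
# Hypothetical helper: integer percentage of gzip downloads relative to
# lzip downloads, rounded down by shell integer division.
ratio_percent() {
    if [ "$2" -eq 0 ]; then
        # Assumption: report "n/a" instead of dividing by zero.
        echo "n/a"
    else
        echo "$(( $1 * 100 / $2 ))"
    fi
}

# Counts taken from the hwloc-2 and openmpi-4 rows of the table.
echo "hwloc-2: $(ratio_percent 10 43)%"
echo "openmpi-4: $(ratio_percent 19 236)%"
```

With the counts from the table this gives 23 for hwloc-2 and 8 for openmpi-4.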
The hwloc/openmpi outlier is intriguing. Is it one HPC site running
an old daemon, or several of them? Looking more closely, it’s 22
distinct IP addresses on 8 different networks (grouping by the first
three octets of the address):
--8<---------------cut here---------------start------------->8---
ludo@berlin ~$ grep -E '/gzip/[[:alnum:]]{32}-(hwloc-2|openmpi-4)\.[[:digit:]]+\.[[:digit:]]+ ' < /tmp/sample3.log | cut -f1 -d- | sort -u | wc -l
22
ludo@berlin ~$ grep -E '/gzip/[[:alnum:]]{32}-(hwloc-2|openmpi-4)\.[[:digit:]]+\.[[:digit:]]+ ' < /tmp/sample3.log | cut -f1 -d- | sort -u | wc -l
8
--8<---------------cut here---------------end--------------->8---
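To illustrate what the two `cut` stages do, here is a small sketch run on a made-up log excerpt. The log line layout, the store hash, the package versions and the addresses (taken from the ranges reserved for documentation) are all assumptions, and `sample_log`, `count_hosts` and `count_networks` are hypothetical names. Grouping on the first three octets approximates a /24 prefix, which is only a rough proxy for "one site", and it would not handle IPv6 clients.

```shell
#!/bin/sh
# Hypothetical log excerpt; none of these lines come from the real logs.
sample_log() {
    cat <<'EOF'
192.0.2.10 - - [17/Mar/2021:10:00:00 +0100] "GET /nar/gzip/0123456789abcdfghijklmnpqrsvwxyz-hwloc-2.4.0 HTTP/1.1" 200
192.0.2.10 - - [17/Mar/2021:10:00:05 +0100] "GET /nar/gzip/0123456789abcdfghijklmnpqrsvwxyz-openmpi-4.0.5 HTTP/1.1" 200
192.0.2.11 - - [17/Mar/2021:10:01:00 +0100] "GET /nar/gzip/0123456789abcdfghijklmnpqrsvwxyz-hwloc-2.4.0 HTTP/1.1" 200
198.51.100.7 - - [17/Mar/2021:10:02:00 +0100] "GET /nar/gzip/0123456789abcdfghijklmnpqrsvwxyz-openmpi-4.0.5 HTTP/1.1" 200
EOF
}

# Everything before the first "-" is the client address.
count_hosts() {
    cut -f1 -d- | sort -u | wc -l | tr -d ' '
}

# Keep only the first three dot-separated fields of that address.
count_networks() {
    cut -f1 -d- | cut -f 1-3 -d. | sort -u | wc -l | tr -d ' '
}

echo "hosts: $(sample_log | count_hosts)"
echo "networks: $(sample_log | count_networks)"
```

On this excerpt the three distinct addresses collapse into two networks, since 192.0.2.10 and 192.0.2.11 share their first three octets.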
Conclusion? It still sounds like we can’t reasonably remove gzip
support just yet.
I’d still like to start providing zstd-compressed substitutes though.
So I think what we can do is:
• start providing zstd substitutes on berlin right now so that when
1.2.1 comes out, at least some substitutes are available as zstd;
• when 1.2.1 is announced, announce that gzip substitutes may be
removed in the future and invite users to upgrade;
• revisit this issue with an eye on dropping gzip within 6–18 months.
Thoughts?
Ludo’.
#!/bin/sh

if [ ! "$#" = 1 ]
then
    echo "Usage: $0 NGINX-LOG-FILE"
    exit 1
fi

set -e

sample="$1"
items="gtk%2B-3 glib-2 coreutils-8 python-3 r-minimal-[34] openmpi-4 hwloc-2 gfortran-7"

for i in $items
do
    # Tweak the regexp so we don't catch ".drv" substitutes as these
    # usually compress better with gzip.  The trailing space in the
    # pattern anchors the match at the end of the requested path.
    lzip="$(grep -E "/lzip/[[:alnum:]]{32}-$i\\.[[:digit:]]+(\\.[[:digit:]]+)? " < "$sample" | wc -l)"
    gzip="$(grep -E "/gzip/[[:alnum:]]{32}-$i\\.[[:digit:]]+(\\.[[:digit:]]+)? " < "$sample" | wc -l)"
    echo "$i: gzip/lzip ratio: $gzip/$lzip $(($gzip * 100 / $lzip))%"
done
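
As a quick check of how the regexp in the script treats ".drv" substitutes, here is a sketch that applies the same pattern to a few made-up request lines. The log line layout, the `/nar/` URL prefix, the store hash and the package version are assumptions for illustration, and `sample_requests` and `count_matches` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical 32-character store hash, used only to satisfy the pattern.
hash="0123456789abcdfghijklmnpqrsvwxyz"

# Made-up request lines: one lzip nar, one gzip nar, one gzip ".drv".
sample_requests() {
    cat <<EOF
192.0.2.10 - - "GET /nar/lzip/$hash-hwloc-2.4.0 HTTP/1.1" 200
192.0.2.11 - - "GET /nar/gzip/$hash-hwloc-2.4.0 HTTP/1.1" 200
192.0.2.12 - - "GET /nar/gzip/$hash-hwloc-2.4.0.drv HTTP/1.1" 200
EOF
}

# count_matches COMPRESSION ITEM: count the lines on stdin that match the
# script's pattern.  The trailing space rejects anything after the version.
count_matches() {
    grep -c -E "/$1/[[:alnum:]]{32}-$2\\.[[:digit:]]+(\\.[[:digit:]]+)? " || true
}

echo "gzip: $(sample_requests | count_matches gzip hwloc-2)"
echo "lzip: $(sample_requests | count_matches lzip hwloc-2)"
```

The ".drv" request is not counted because the pattern requires a space right after the version number, so each compression method is counted once here.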