Re: [Savannah-hackers-public] Truncated tarball from git.savannah.gnu.or

From: Bob Proulx
Subject: Re: [Savannah-hackers-public] Truncated tarball from
Date: Wed, 8 Feb 2017 22:22:48 -0700
User-agent: NeoMutt/20170113 (1.7.2)

Eli Zaretskii wrote:
> James Cloos wrote:
> > It looks like there is a 60 second limit.

Yes.  There appeared to be a 60 second limit.

> > And the transmission is unnaturally slow.  My test averaged only 154KB/s
> > even though I ran it on a machine in a very well connected data center
> > near Boston which supports more than 1G incoming bandwidth.
> I think the tarball is produced on the fly, so it isn't the bandwidth

Yes.  The tar file is produced on the fly and then compressed with
xz.  This is quite a cpu intensive operation.  It pegs one core at
100% cpu during the operation.  It takes 3 minutes on a well connected
machine to create and download a tar.xz file.

> that limits the speed, it's the CPU processing resources needed to
> xz-compress the files.  Try the same with .tar.gz, and you will see
> quite a different speed.

Using gzip is much less stressful on the cpu.  It only takes 1m30s to
create and download a tar.gz file.  The gz is a larger file than the
xz but the overall impact of the gz is less.
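
To get a feel for the gzip versus xz tradeoff on your own machine, here
is a minimal Python sketch.  It is not how cgit produces its tarballs;
it only times the standard library gzip and lzma (xz) compressors on
the same bytes.  The function name and the sample payload are made up
for illustration, and real source trees will give different numbers.

```python
import gzip
import lzma
import time


def compare_compressors(data: bytes) -> dict:
    """Compress data with gzip and xz; report elapsed seconds and output size."""
    results = {}
    for name, compress in (("gz", gzip.compress), ("xz", lzma.compress)):
        start = time.perf_counter()
        out = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = {"seconds": elapsed, "bytes": len(out)}
    return results


if __name__ == "__main__":
    # Stand-in payload (an assumption); a real tar of a source tree will differ.
    sample = b"".join(b"line %d of some source text\n" % i for i in range(200000))
    for name, r in compare_compressors(sample).items():
        print(f"{name}: {r['seconds']:.2f}s, {r['bytes']} bytes")
```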

> > The 60s limit needs to be much longer; I doubt that it should be any
> > less than ten minutes.

There is a read timeout that can be hit such that the data must start
transferring before the timeout occurs or the web server thinks the
process has failed.  In this case I think the start is after it has
finished the compression.  After it starts transferring data then reads
continue and the read timeout resets.
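
The behavior described above can be modeled with a small Python
sketch.  This is only a model of an idle read timeout under the
assumption that the timer restarts each time a chunk of data arrives;
it is not the web server's actual code or configuration, and the
function name is hypothetical.

```python
def first_timeout(chunk_times, read_timeout):
    """Return the time (seconds since request start) at which an idle
    read timeout fires, or None if every chunk arrives in time.

    chunk_times: ascending arrival times of each chunk of data.
    The idle timer restarts after every chunk (an assumption of this model).
    """
    last = 0.0
    for t in chunk_times:
        if t - last > read_timeout:
            return last + read_timeout
        last = t
    return None


if __name__ == "__main__":
    # Compression finishes after 3 minutes, then data flows once per second.
    arrivals = [180, 181, 182]
    print(first_timeout(arrivals, 60))   # limit hit before any data is sent
    print(first_timeout(arrivals, 300))  # data starts in time, timer keeps resetting
```

With the first chunk arriving at 180 seconds, a 60 second limit is hit
long before any data is sent, while a 300 second limit is not.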

> No, I think 3 min should be enough.  But I don't really understand why
> there has to be a limit.

There must be limits because otherwise the activity of the global
Internet hitting the server will drive it out of resources creating
what is indistinguishable from a denial of service attack.  There must
be limits to prevent clients from consuming all server resources.
That is just a fact of life when running a busy public server.  You
never have enough resources for everything.  You can't.  Because there
are more clients on the net than you have server resources.  All it
takes is for someone to say that there is a new release and that
synchronizes many people to go download all at the same time and the
system becomes overwhelmed.

In any case, I am coming back to this thread because we have just
moved git off of the old server and onto the new server.  We are just
now starting to tune the parameters on the new system.  If you try
this again you will find the current read time limit for data to start
transferring to be 300s.  Plus the new system should be faster than
the old one.  The combined effect should be much better.  But remember
that we can't make it unlimited.

Frankly from the server perspective I don't like the cgit dynamic tar
file creation on the server.  It has quite an impact on it.  It is
easier on the server if people keep their own copy of a git clone
updated and build the release tar files on the local client system
rather than on the server system.  Then updates to the git repository
are incremental.  Much less impact on the server.  Or to have
maintainers create the tar file once and then simply serve that file
out repeatedly from a download server.
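
As a rough sketch of building the tar file on the client side, the
following Python wraps "git archive" run inside an existing local
clone.  The repository path, the ref, and the project name are
placeholders, and the helper names are hypothetical.  It assumes git is
installed and that the clone has already been brought up to date, for
example with "git fetch".

```python
import subprocess


def git_archive_command(ref: str, name: str) -> list:
    """Build a `git archive` command writing <name>.tar.gz with a <name>/ prefix."""
    return [
        "git", "archive",
        "--format=tar.gz",
        f"--prefix={name}/",
        "-o", f"{name}.tar.gz",
        ref,
    ]


def make_release_tarball(repo_dir: str, ref: str, name: str) -> None:
    """Run git archive in a local clone; raises CalledProcessError on failure."""
    subprocess.run(git_archive_command(ref, name), cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Placeholder values; substitute your own clone path, tag, and name.
    make_release_tarball("/path/to/clone", "v1.0", "project-1.0")
```

All of the compression then happens on the local machine, and only the
incremental git updates touch the server.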

