bug-wget

Re: [Bug-wget] recursive no parent slash bug


From: Micah Cowan
Subject: Re: [Bug-wget] recursive no parent slash bug
Date: Tue, 23 Jun 2009 11:43:13 -0700
User-agent: Thunderbird 2.0.0.21 (X11/20090409)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mike Turchenkov wrote:
> Hello, thanks for great program!
> 
> I have noticed a little bug with GNU Wget 1.11.1
> 
> Suppose
> GNU Wget 1.11.1
> http://site.com/content/AAA/aaa.files
> http://site.com/content/BBB/bbb.files
> 
> and I want to get only aaa.files in current dir.
> 
> When I input in
> 
> wget -r --level=5 -S -nH --cut-dirs=2 -N --http-user=***
> --http-passwd=***  -np -w 1 -o logfile -e robots=off
> "http://site.com/content/AAA"
> 
> It gets not only aaa.files but bbb.files too.
> 
> But with "/" in the end of address
> 
> wget -r --level=5 -S -nH --cut-dirs=2 -N --http-user=***
> --http-passwd=***  -np -w 1 -o logfile -e robots=off
> "http://site.com/content/AAA/"
> 
> it works well.

Not a bug. HTTP has no way to differentiate between the idea of a
"directory" and a "file", and no such difference actually exists. It's a
user perspective that Wget tries to adhere to, but the only reliable
signal Wget has for telling a file from a directory is the trailing
slash. Without it, there is absolutely no way to know whether the final
AAA is a file or a "directory", so Wget treats AAA as a file inside
"content/", and -np then permits everything under "content/", BBB
included.
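
As an illustration of why the slash matters for -np, here is a minimal
Python sketch. It is not Wget's actual code, and the function names
directory_of and allowed_by_no_parent are hypothetical. It assumes the
"directory" of the start URL is everything up to its last slash, and
that a link is followed only if its path lies under that directory.

```python
from urllib.parse import urlsplit


def directory_of(url: str) -> str:
    """Return the URL path up to and including its last slash."""
    path = urlsplit(url).path or "/"
    return path[: path.rfind("/") + 1]


def allowed_by_no_parent(start_url: str, link: str) -> bool:
    """Illustrative -np style check (an assumption, not Wget's logic):
    the link must be on the same host and under the start directory."""
    start, target = urlsplit(start_url), urlsplit(link)
    if (start.scheme, start.netloc) != (target.scheme, target.netloc):
        return False
    return (target.path or "/").startswith(directory_of(start_url))


bbb = "http://site.com/content/BBB/bbb.files"
# Without the slash the start directory is /content/, so BBB is in scope.
print(allowed_by_no_parent("http://site.com/content/AAA", bbb))
# With the slash the start directory is /content/AAA/, so BBB is excluded.
print(allowed_by_no_parent("http://site.com/content/AAA/", bbb))
```

Under these assumptions the two start URLs from the report yield
different directories, which matches the behaviour described.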

(There is a heuristic that would be possible on _some_ servers, which we
may implement at some point, which is that if the server automatically
redirects "AAA" to "AAA/", we should assume that there should have been
a trailing slash to begin with, and proceed accordingly.)
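
The redirect heuristic could be sketched roughly as below. This is only
an illustration in Python, not something Wget implements; the helper
name redirect_adds_trailing_slash is hypothetical, and it assumes the
caller already has the requested URL and the Location header value from
a redirect response.

```python
from urllib.parse import urljoin, urlsplit


def redirect_adds_trailing_slash(requested_url: str, location: str) -> bool:
    """True if the redirect target is the requested URL plus a final '/'."""
    source = urlsplit(requested_url)
    # Location may be relative, so resolve it against the requested URL.
    target = urlsplit(urljoin(requested_url, location))
    same_origin = (source.scheme, source.netloc) == (target.scheme, target.netloc)
    return (
        same_origin
        and not source.path.endswith("/")
        and target.path == source.path + "/"
    )


# A server that answers "AAA" with a redirect to "AAA/" hints that AAA
# is a directory; any other redirect tells us nothing.
print(redirect_adds_trailing_slash("http://site.com/content/AAA",
                                   "http://site.com/content/AAA/"))
print(redirect_adds_trailing_slash("http://site.com/content/AAA",
                                   "http://site.com/login"))
```

A client using such a check would then restart the -np logic from the
redirected URL rather than the one originally given.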

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
Maintainer of GNU Wget and GNU Teseq
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpBIkEACgkQ7M8hyUobTrHHjgCfQsKQNUo1SxrYdUiLDj3at1Xi
ZZIAn1EdXtHgoJuTDXCSR9HnN9sVDbgM
=RJ90
-----END PGP SIGNATURE-----



