[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file"
From: |
Matthew White |
Subject: |
Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format |
Date: |
Fri, 12 Aug 2016 22:13:53 +0200 |
On Wed, 10 Aug 2016 11:30:12 +0200
Tim Rühsen <address@hidden> wrote:
> On Mittwoch, 10. August 2016 06:23:49 CEST Matthew White wrote:
> > On Tue, 09 Aug 2016 21:25:16 +0200
> >
> > Tim Rühsen <address@hidden> wrote:
> > > On Freitag, 5. August 2016 20:25:06 CEST Matthew White wrote:
> > > > On Thu, 04 Aug 2016 16:47:18 +0200
> > > >
> > > > Tim Ruehsen <address@hidden> wrote:
> > > > > On Wednesday, August 3, 2016 1:46:56 PM CEST Matthew White wrote:
> > > > > > On Tue, 02 Aug 2016 11:27:08 +0200
> > > > > >
> > > > > > Tim Ruehsen <address@hidden> wrote:
> > > > > > > On Tuesday, August 2, 2016 10:06:42 AM CEST Matthew White wrote:
> > > > > > > > On Sat, 30 Jul 2016 21:23:56 +0200
> > > > > > > >
> > > > > > > > Matthew White <address@hidden> wrote:
> > > > > > > > > Hello!
> > > > > > > > >
> > > > > > > > > I noticed that wget doesn't use the metalink:file element as
> > > > > > > > > the
> > > > > > > > > RFC5854
> > > > > > > > > suggests.
> > > > > > > > >
> > > > > > > > > https://tools.ietf.org/html/rfc5854#section-4.1.2.1
> > > > > > > > >
> > > > > > > > > The RFC5854 specifies that the metalink:file could have a
> > > > > > > > > "path/file"
> > > > > > > > > format. In this case wget should create the "path" tree and
> > > > > > > > > save
> > > > > > > > > the
> > > > > > > > > file
> > > > > > > > > as "path/file", but it doesn't. Instead wget saves the file in
> > > > > > > > > the
> > > > > > > > > working directory.
> > > > > > > > >
> > > > > > > > > e.g. <file name="dirA/dirB/file.gz">
> > > > > > > > >
> > > > > > > > > With this patch wget conforms to the RFC5854.
> > > > > > > > >
> > > > > > > > > I made this patch working on the following branch:
> > > > > > > > > master (latest 20cac2c5ab3d63aacfba35fb10878a2d490e2377)
> > > > > > > > > git://git.savannah.gnu.org/wget.git
> > > > > > > > >
> > > > > > > > > Let me know if this helps.
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > After the suggestions of Tim, I fixed the patch (nice! fix the
> > > > > > > > fix...).
> > > > > > > > So,
> > > > > > > > scratch the previous patch and use this one instead.
> > > > > > > >
> > > > > > > > The function concat_strings() replaces the combination of
> > > > > > > > malloc(),
> > > > > > > > strlen(), and strcpy(). (Thanks Tim.)
> > > > > > >
> > > > > > > In this special case (just one string to clone), please use
> > > > > > > xstrdup().
> > > > > > > That is much less overhead.
> > > > > > >
> > > > > > > > The description now follows the style of the other patches.
> > > > > > > > (Thanks
> > > > > > > > again
> > > > > > > > Tim!)
> > > > > > > >
> > > > > > > > You may consider this patch a Bugfix:
> > > > > > > > * src/utils.c (unique_create, fopen_excl): cannot create a
> > > > > > > > directory
> > > > > > > > tree
> > > > > > > > like "path/file".
> > > > > > > >
> > > > > > > > The problem is that fopen_excl() doesn't create non-existing
> > > > > > > > directories.
> > > > > > > > What do you suggest?
> > > > > > >
> > > > > > > IMO, saving metalink files should work 'as usual and expected'.
> > > > > > > That
> > > > > > > means, all the directory options should apply (see man wget /
> > > > > > > Directory
> > > > > > > Options). We also have to deal with 'escaping' file name sequences
> > > > > > > like
> > > > > > > ../ etc.
> > > > > > > Not to forget character set conversions.
> > > > > > >
> > > > > > > What about cases where you download https://myserver/file.tgz, and
> > > > > > > in
> > > > > > > the
> > > > > > > metalink file 'name' is set to 'xyz.doc' ? IMO, we should keep
> > > > > > > file.tgz,
> > > > > > > except the user stated --trust-server-names.
> > > > > > >
> > > > > > > Maybe we should set up a a metalink document where we define how
> > > > > > > everything
> > > > > > > should work including corner cases. Next step would be to set up
> > > > > > > appropriate tests and fix the wget metalink code to survive the
> > > > > > > tests.
> > > > > > >
> > > > > > > > Can we make this patch obsolete?
> > > > > > >
> > > > > > > Basically yes.
> > > > > > > But we see, there is still much to do around metalink ... but most
> > > > > > > of
> > > > > > > the
> > > > > > > needed code is already there. Most work will be the definition and
> > > > > > > the
> > > > > > > tests.
> > > > > > >
> > > > > > > Regards, Tim
> > > > > >
> > > > > > Changed to use xstrdup(), new patch attached.
> > > > >
> > > > > I already mentioned, that taking the metalink filename pulls in a
> > > > > security
> > > > > issue. It might escape the current directory with a relative or
> > > > > absolute
> > > > > path.
> > > >
> > > > Actually, wget aborts if something like "../path/file" is found as
> > > > metalink:file name (see attached document as patch). The same goes for
> > > > "./path/file" and "/path/file".
> > > >
> > > > > You might just take the file name without path, but only if the the
> > > > > option
> > > > > -- trust-server-names has been set.
> > > > >
> > > > > Tim
> > > >
> > > > Hi Tim,
> > > >
> > > > I got your PM, did you get mine (I had an email server problem)?
> > > >
> > > > I wrote a draft Metalink.README trying to explain the current
> > > > interaction
> > > > between "Directory Options" and the option '--include-metalink=file' on
> > > > the
> > > > command line.
> > > >
> > > > I attach the Metalink.README here as a patch.
> > > >
> > > > Can you confirm my findings? (Don't rush)
> > >
> > > I created a new upstream branch 'metalink' where we can put further work
> > > for testing. I included your README as doc/metalink.txt and also added
> > > two tests for '../File1' and '/File1', which currently fail (the tests
> > > expect 'File1' to be written).
> > >
> > > You can't push there, but maybe you clone it (e.g. to Gitlab, Github or
> > > elsewhere) and push your changes to that repo. So I can directly merge
> > > from
> > > there (and you could then merge from upstream 'metalink').
> > >
> > > I still didn't find time to think about the Metalink file naming
> > > details...
> > >
> > > Regards, Tim
> >
> > Hi Tim,
> >
> > I forked https://github.com/mirror/wget.
> >
> > As you suggest, now I'm pushing to the metalink branch on my repo:
> > https://github.com/mehw/wget/tree/metalink
> >
> > As you said, escaping relative and absolute paths from metalink:file names,
> > except when the option --trust-server-names is used, seems a good idea.
> > I'll work on this ASAP.
>
> I thought about it when I awoke this night...
>
> Let's assume the user types basically
> wget http://theserver.com/directory/archive.tgz
> The server responds with a metalink file (or metalink within request header),
> wget retrieves the .meta file, which states the file name
> '/the/path/funny.sh'.
>
> IMO, the (optional) directory options should apply to the URL path of the
> command line (http://theserver.com/directory/).
>
> The downloaded/saved file name should be 'archive.tgz' - except with --trust-
> server-names, here the file name should be 'funny.sh' (but still, any
> directory
> options should apply to the command line URL).
>
> In general, I would recommend to tell the user when the filename from his URL
> and the file name in the Metalink info differ. If --trust-server-names was
> not
> given, we should give this as hint.
>
> But we should *always* strip the path info from the metalink filename (/the/
> path/funny.sh -> funny.sh), if appropriate (--trust-server-names is given).
>
> If there are several files with the same name in the Metalink info and
> --trust-
> server-names is given, we should apply the standard .1, .2, .3 ... extensions.
>
> WDYT ?
>
> > Shall I create a new branch or push directly to metalink in my repo?
>
> Just push directly to your repo.
>
> Regards, Tim
Hi Tim,
After debugging wget and libmetalink, I can confirm that, due to how
metalink/libmetalink is conceived (see references), metalink:file names posing
a security issue are discarded directly by libmetalink, and so they will never
get to the wget's metalink module.
e.g. '../File' and '/File1' cannot be written as 'File1' by wget, because the
whole metalink:file name is discarded by libmetalink.
Wget commit f4aeb4189958cf9b2b6d134fd42b89bc9a3c401f:
* src/main.c (main): Call metalink_parse_file(): libmetalink's
lib/libexpat_metalink_parser.c or lib/libxml2_metalink_parser.c
https://github.com/metalink-dev/libmetalink
* lib/metalink_pstate_v4.c (metalink_state_start_fun_v4): Call
metalink_check_safe_path() to verify metalink:file names
* lib/metalink_pstate_v3.c (files_state_start_fun_v3): Call
metalink_check_safe_path() to verify metalink:file names
* lib/metalink_helper.c (metalink_check_safe_path): Discard unsafe file names;
see http://tools.ietf.org/html/rfc6266#section-4.3,
http://tools.ietf.org/html/rfc2183#section-5, and
http://tools.ietf.org/html/rfc5854.html#section-4.1.2.1
* test/metalink_helper_test.c (test_metalink_check_safe_path): Tests for file
names
PS: I'm still working on --trust-server-names and the directory options.
WDYT?
Later,
Matthew
--
Matthew White <address@hidden>
pgpTPU0D7wuTh.pgp
Description: PGP signature
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Matthew White, 2016/08/02
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Tim Ruehsen, 2016/08/02
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Matthew White, 2016/08/03
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Tim Ruehsen, 2016/08/04
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Matthew White, 2016/08/05
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Tim Rühsen, 2016/08/09
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Matthew White, 2016/08/10
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Tim Rühsen, 2016/08/10
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Matthew White, 2016/08/11
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format,
Matthew White <=
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Tim Rühsen, 2016/08/13
- Re: [Bug-wget] [PATCH] Support metalink:file elements with a "path/file" format, Matthew White, 2016/08/14