Re: Wget recursive option not working correctly with scheme relative URL
From: Tim Rühsen
Subject: Re: Wget recursive option not working correctly with scheme relative URLs
Date: Sat, 1 Jul 2023 18:22:50 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.12.0
Hey Jan,
On 7/1/23 15:16, Jan Bidler via Primary discussion list for GNU Wget wrote:
Hello,
I have part of a website (`example.com/index.html`) I want to mirror which
contains scheme relative URLs (`//otherexample.com/image.png`). Trying to
download these with the -r flag results in wget converting them to a wrong URL
(`example.com//otherexample.com`).
So using
`wget -r example.com/index.html`
will cause links with
`https://example.com/index.html\/\/otherexample.com\/image.png` in the output.
Using the debug flag reveals this:
`merge(»example.com/index.html «, » //otherexample.com/image.png«) ->
https://example.com/index.html\/\/otherexample.com\/image.png`
This is unexpected, since these kinds of links are relatively common and
so far nobody has complained about them.
I just added a new test for uri_merge(), the function that does
this job. It has no trouble merging a relative URL like
'//otherexample.com/image.png'.
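For comparison, here is a minimal sketch of the result one would expect from merging a scheme relative link under RFC 3986 reference resolution. It uses Python's `urllib.parse.urljoin` as a stand-in and is not wget's uri_merge(); the helper name `merge` is only illustrative, and the `https://` scheme on the base URL is an assumption taken from the debug output above.

```python
from urllib.parse import urljoin


def merge(base: str, link: str) -> str:
    """Resolve `link` against `base` (illustrative helper, not wget code)."""
    return urljoin(base, link)


# A scheme relative link keeps the base's scheme but replaces host and path.
base = "https://example.com/index.html"
link = "//otherexample.com/image.png"
print(merge(base, link))
```

The scheme of the base URL is carried over, while the host and path come entirely from the link.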
So could you share a real-world wget command line that reproduces
the issue?
Regards, Tim