Re: [Bug-wget] Support non-ASCII URLs


From: Eli Zaretskii
Subject: Re: [Bug-wget] Support non-ASCII URLs
Date: Sat, 19 Dec 2015 14:11:20 +0200

> Date: Sat, 19 Dec 2015 10:15:03 +0200
> From: Eli Zaretskii <address@hidden>
> Cc: address@hidden
> 
> > 2. contrib/check-hard fails with
> > TESTS_ENVIRONMENT="LC_ALL=tr_TR.utf8 VALGRIND_TESTS=0" make check
> > 
> > FAIL: Test-iri-forced-remote
> > 
> > My son has birthday tomorrow, so I am not sure how much time I can spend on 
> > the weekend on this issue. Maybe Eli or you could have a look ?
> 
> I cannot bootstrap the Git repo (too many prerequisites I don't have).
> Can you or someone else produce a distribution tarball out of Git that
> I could then build "as usual"?
> 
> Also, can you show me the log of the failed test?  Turkish locales
> have "an issue" with certain upper/lower-case characters, maybe that's
> the problem.  Or maybe it's something else; looking at the log might
> give good clues.

Tim sent me the tarball and the log off-list (thanks!).  I haven't yet
tried to build Wget, but just from looking at the test, I don't think
I understand its idea.  It has an index.html page that's encoded in
ISO-8859-15, but Wget is invoked with --remote-encoding=iso-8859-1,
and the URLs themselves in "my %urls" are all encoded in UTF-8.  How's
this supposed to work?
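
To make the mismatch concrete, here is a small Python sketch (not
taken from the test suite) of how the same file name percent-encodes
under the two encodings, and what happens when UTF-8 bytes are read as
ISO-8859-1.  The file name is the one from the test; the variable
names are mine.

```python
from urllib.parse import quote, unquote

# File name from the test; everything else here is illustrative.
name = "p1_français.html"

# The same character "ç" becomes different bytes in each encoding.
utf8_form = quote(name, encoding="utf-8")
latin1_form = quote(name, encoding="iso-8859-1")
print(utf8_form)    # p1_fran%C3%A7ais.html
print(latin1_form)  # p1_fran%E7ais.html

# Reading the UTF-8 form as if it were ISO-8859-1 yields mojibake.
misread = unquote(utf8_form, encoding="iso-8859-1")
print(misread)      # p1_franÃ§ais.html
```

So a URL that arrives as UTF-8 bytes but is treated as ISO-8859-1
comes out as a different name, which is why I don't see how the three
encodings in the test fit together.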

Also, I'm not following the logic of overriding Content-Type with the
remote encoding: p1_fran%C3%A7ais.html states "charset=UTF-8", but
includes a link encoded in ISO-8859-1, and the test seems to expect
Wget to use the remote encoding in preference to what "charset=" says.
Does the remote encoding override the encoding for the _contents_ of
the URL, not just for the URL itself?  That seems to make little sense
to me: the contents and the name can legitimately be encoded
differently, I think.
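
Here is a sketch of the behavior the test seems to expect, as far as I
can tell.  This is only my reading, not Wget's actual code: the
function decode_link, the policy it implements, and the link bytes are
all hypothetical.

```python
from typing import Optional

# Hypothetical link as raw bytes: "é" is the single ISO-8859-1 byte 0xE9.
raw_link = b"p2_\xe9t\xe9.html"

def decode_link(raw: bytes, declared_charset: str,
                remote_encoding: Optional[str]) -> str:
    """Assumed policy: an explicit remote encoding wins over "charset="."""
    charset = remote_encoding or declared_charset
    return raw.decode(charset)

# Trusting the page's "charset=UTF-8" fails: the bytes are not valid UTF-8.
try:
    decode_link(raw_link, "utf-8", None)
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Letting the remote encoding override the declared charset succeeds.
overridden = decode_link(raw_link, "utf-8", "iso-8859-1")
print(utf8_ok, overridden)  # False p2_été.html
```

If that is the intended policy, the override applies to the bytes of
the page body as well as to the URL, which is the part I find
surprising.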

I guess I lack some basic info about what Wget is supposed to do in
these tricky situations, and how.  Can you help me understand that?
The manual doesn't seem to be very detailed on what's expected here.

TIA
