[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Bug-wget] [Bug-Wget] Misc. patches
From: |
Darshit Shah |
Subject: |
Re: [Bug-wget] [Bug-Wget] Misc. patches |
Date: |
Sat, 5 Jul 2014 16:01:02 +0530 |
On Sat, Jul 5, 2014 at 3:45 PM, Giuseppe Scrivano <address@hidden> wrote:
> Hi Darshit,
>
> few comments below:
>
> Darshit Shah <address@hidden> writes:
>
>> Hi,
>>
>> I've attached 3 patches to this mail. The first one eliminates some of
>> the error codes in Wget as a first step to cleaning up the error
>> reporting method.
>>
>> The second is siimply fixing indentation issues.
>>
>> The third updates our usage of libpsl according to the latest release
>> version and converts all domain names to lowercase before passing them
>> to libpsl for checking.
>>
>> --
>> Thanking You,
>> Darshit Shah
>>
>> From 64b57d26b6283fe1039999ad3aabbee8dbaa4c97 Mon Sep 17 00:00:00 2001
>> From: Darshit Shah <address@hidden>
>> Date: Sat, 5 Jul 2014 12:08:09 +0530
>> Subject: [PATCH 3/3] Convert domains to lowercase before libpsl checks
>>
>> ---
>> src/ChangeLog | 5 +++++
>> src/cookies.c | 26 ++++++++++++++++++++++++--
>> 2 files changed, 29 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/ChangeLog b/src/ChangeLog
>> index 5306cb2..6360303 100644
>> --- a/src/ChangeLog
>> +++ b/src/ChangeLog
>> @@ -1,5 +1,10 @@
>> 2014-07-05 Darshit Shah <address@hidden>
>>
>> + * cookies.c (check_domain_match): Libpsl requires that all domain names
>> + passed to it be in utf8 lower case.
>> +
>> +2014-07-05 Darshit Shah <address@hidden>
>> +
>> * http.c (gethttp): Fix indentation of conditional block
>> (gethttp): Remove unneeded variable
>>
>> diff --git a/src/cookies.c b/src/cookies.c
>> index a46aeee..b478060 100644
>> --- a/src/cookies.c
>> +++ b/src/cookies.c
>> @@ -501,7 +501,17 @@ numeric_address_p (const char *addr)
>> /* Check whether COOKIE_DOMAIN is an appropriate domain for HOST.
>> Originally I tried to make the check compliant with rfc2109, but
>> the sites deviated too often, so I had to fall back to "tail
>> - matching", as defined by the original Netscape's cookie spec. */
>> + matching", as defined by the original Netscape's cookie spec.
>> +
>> + Wget now uses libpsl to check domain names against a public suffix
>> + list to see if they are valid. However, since we don't provide a
>> + psl on our own, if libpsl is compiled without a public suffix list,
>> + fall back to using the original "tail matching" heuristic. Also if
>> + libpsl is unable to convert the domain to lowercase, which means that
>> + it doesnt have any runtime conversion support, we again fall back to
>> + "tail matching" since libpsl states the results are unpredictable with
>> + upper case strings.
>> + */
>>
>> static bool
>> check_domain_match (const char *cookie_domain, const char *host)
>> @@ -509,6 +519,7 @@ check_domain_match (const char *cookie_domain, const
>> char *host)
>>
>> #ifdef HAVE_LIBPSL
>> DEBUGP (("cdm: 1"));
>> + char * cookie_domain_lower, * host_lower;
>
> please initialize them to NULL and format like char
> *cookie_domain_lower, *host_lower (no space between * and the variable
> name), otherwise...
>
>> const psl_ctx_t *psl;
>> int is_acceptable;
>>
>> @@ -519,7 +530,18 @@ check_domain_match (const char *cookie_domain, const
>> char *host)
>> goto no_psl;
>> }
>>
>> - is_acceptable = psl_is_cookie_domain_acceptable (psl, host,
>> cookie_domain);
>> + if (psl_str_to_utf8lower (cookie_domain, NULL, NULL,
>> &cookie_domain_lower) != PSL_SUCCESS ||
>> + psl_str_to_utf8lower (host, NULL, NULL, &host_lower) != PSL_SUCCESS)
>
> ...if the first "psl_str_to_utf8lower" fails then "host_lower" keeps
> some bogus value...
>
>> + {
>> + DEBUGP (("libpsl unable to parse domain name. "
>> + "Falling back to simple heuristics.\n"));
>> + goto no_psl;
>> + }
>> +
>> + is_acceptable = psl_is_cookie_domain_acceptable (psl, host_lower,
>> cookie_domain_lower);
>> + xfree (cookie_domain_lower);
>> + xfree (host_lower);
>
> ...and *boom* here.
>
Aah! I somehow managed not to get any "boom"s despite having a test
that saw psl_str_to_utf8lower() fail. However, your comment is correct
and I'll fix that. The general idea was that if the function fails, it
will fail on both the calls
>> +
>> return true ? (is_acceptable == 1) : false;
>>
>> no_psl:
>> --
>> 2.0.1
>>
>
> feel free to push after the change.
>
>
>
>
>> From eccb40cac9aa5a446a2b6c1eb61710930223106b Mon Sep 17 00:00:00 2001
>> From: Darshit Shah <address@hidden>
>> Date: Sat, 5 Jul 2014 11:54:46 +0530
>> Subject: [PATCH 2/3] Fix indentation and remove excess variable
>>
>> ---
>> src/ChangeLog | 5 +++++
>> src/http.c | 21 ++++++++++-----------
>> 2 files changed, 15 insertions(+), 11 deletions(-)
>
> ACK.
>
>
>
>
>> From 91fe00af91825bd80f4b300ba45d5874c3c79a46 Mon Sep 17 00:00:00 2001
>> From: Darshit Shah <address@hidden>
>> Date: Thu, 3 Jul 2014 20:53:33 +0530
>> Subject: [PATCH 1/3] Remove usunused error codes
>>
>> ---
>> src/ChangeLog | 5 +++++
>> src/http.c | 2 +-
>> src/wget.h | 14 ++++++--------
>> 3 files changed, 12 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/ChangeLog b/src/ChangeLog
>> index d19dc8d..7962213 100644
>> --- a/src/ChangeLog
>> +++ b/src/ChangeLog
>> @@ -1,3 +1,8 @@
>> +2014-07-03 Darshit Shah <address@hidden>
>> +
>> + * wget.h (uerr_t): Remove unused error codes
>> + * http.c: (http_loop): Remove reference to unused error code
>> +
>> 2014-06-30 Giuseppe Scrivano <address@hidden>
>>
>> * convert.c (local_quote_string): Initialize newname.
>> diff --git a/src/http.c b/src/http.c
>> index 383e9f3..4fac3ba 100644
>> --- a/src/http.c
>> +++ b/src/http.c
>> @@ -3169,7 +3169,7 @@ Spider mode enabled. Check if remote file exists.\n"));
>>
>> switch (err)
>> {
>> - case HERR: case HEOF: case CONSOCKERR: case CONCLOSED:
>> + case HERR: case HEOF: case CONSOCKERR:
>> case CONERROR: case READERR: case WRITEFAILED:
>> case RANGEERR: case FOPEN_EXCL_ERR:
>> /* Non-fatal errors continue executing the loop, which will
>> diff --git a/src/wget.h b/src/wget.h
>> index 331e8e2..3b6ff86 100644
>> --- a/src/wget.h
>> +++ b/src/wget.h
>> @@ -334,21 +334,19 @@ typedef enum
>> {
>> /* 0 */
>> NOCONERROR, HOSTERR, CONSOCKERR, CONERROR, CONSSLERR,
>> - CONIMPOSSIBLE, NEWLOCATION, NOTENOUGHMEM /* ! */,
>> - CONPORTERR /* ! */, CONCLOSED /* ! */,
>> + CONIMPOSSIBLE, NEWLOCATION,
>> /* 10 */
>> FTPOK, FTPLOGINC, FTPLOGREFUSED, FTPPORTERR, FTPSYSERR,
>> - FTPNSFOD, FTPRETROK /* ! */, FTPUNKNOWNTYPE, FTPRERR, FTPREXC /* ! */,
>> + FTPNSFOD, FTPUNKNOWNTYPE, FTPRERR,
>> /* 20 */
>> FTPSRVERR, FTPRETRINT, FTPRESTFAIL, URLERROR, FOPENERR,
>> - FOPEN_EXCL_ERR, FWRITEERR, HOK /* ! */, HLEXC /* ! */, HEOF,
>> + FOPEN_EXCL_ERR, FWRITEERR, HEOF,
>> /* 30 */
>> - HERR, RETROK, RECLEVELEXC, FTPACCDENIED /* ! */, WRONGCODE,
>> + HERR, RETROK, RECLEVELEXC, WRONGCODE,
>> FTPINVPASV, FTPNOPASV, CONTNOTSUPPORTED, RETRUNNEEDED, RETRFINISHED,
>> /* 40 */
>> - READERR, TRYLIMEXC, URLBADPATTERN /* ! */, FILEBADFILE /* ! */, RANGEERR,
>> - RETRBADPATTERN, RETNOTSUP /* ! */, ROBOTSOK /* ! */, NOROBOTS /* ! */,
>> - PROXERR,
>> + READERR, TRYLIMEXC, FILEBADFILE, RANGEERR,
>> + RETRBADPATTERN, PROXERR,
>> /* 50 */
>> AUTHFAILED, QUOTEXC, WRITEFAILED, SSLINITFAILED, VERIFCERTERR,
>> UNLINKERR, NEWLOCATION_KEEP_POST, CLOSEFAILED, ATTRMISSING, UNKNOWNATTR,
>
> this change will screw all the numeric comments :( I don't see anyway
> why we need to maintain these, so feel free to drop them in this same
> patch and push the result.
I'll do that. In fact I intended to do so already but don't know why I
didn't. I also wanted to regroup them based on functionality so that
we may start merging some of them and eliminating the others.
>
> Regards,
> Giuseppe
--
Thanking You,
Darshit Shah
- [Bug-wget] [Bug-Wget] Misc. patches, Darshit Shah, 2014/07/05
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Giuseppe Scrivano, 2014/07/05
- Re: [Bug-wget] [Bug-Wget] Misc. patches,
Darshit Shah <=
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Giuseppe Scrivano, 2014/07/05
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Darshit Shah, 2014/07/05
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Tim Ruehsen, 2014/07/07
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Darshit Shah, 2014/07/07
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Tim Ruehsen, 2014/07/07
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Tim Ruehsen, 2014/07/07
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Darshit Shah, 2014/07/20
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Tim Rühsen, 2014/07/20
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Darshit Shah, 2014/07/21
- Re: [Bug-wget] [Bug-Wget] Misc. patches, Giuseppe Scrivano, 2014/07/21