From: Tim Ruehsen
Subject: Re: [Bug-wget] Invalid Content-Length header in WARC files, on some platforms
Date: Wed, 14 Nov 2012 09:11:31 +0100
User-agent: KMail/1.13.7 (Linux/3.2.0-4-amd64; KDE/4.8.4; x86_64; ; )
On Tuesday 13 November 2012, David Ryskalczyk wrote:
> I found the bug in the first place after using wget in WARC mode on
> ARM and PPC systems and having trouble extracting the files.
>
> I believe the issue stems from this line in warc.c:
>
> if (! asprintf (&content_length, "%ld", ftello (data_in)))
>
> ftello returns a value of type off_t, which can be 32 or 64 bits wide.
> %ld is the format specifier for a long, and a long is 32 bits on
> 32-bit platforms but 64 bits on 64-bit platforms. On Windows a long is
> 32-bits, whether the platform is 32-bit or 64-bit.
One correct option would be to cast the ftello() result to long long or
intmax_t (or to another integer type of at least 64 bits) and use %lld / %jd.
I tend to use long long.
if (asprintf (&content_length, "%lld", (long long) ftello (data_in)) == -1)
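To illustrate the cast, here is a minimal sketch of the intmax_t / %jd
variant. The helper name format_offset is hypothetical and is not taken from
warc.c; it also uses snprintf with a caller-supplied buffer instead of
asprintf, only to keep the example portable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper, not taken from warc.c: writes the decimal form of
   `offset` into buf. Returns the number of characters written, or -1 on
   error or if the buffer is too small. */
static int format_offset(char *buf, size_t size, off_t offset)
{
    assert(buf != NULL && size > 0);
    /* intmax_t is at least 64 bits wide, and the cast makes the argument
       match %jd whatever width off_t has on this platform. */
    int n = snprintf(buf, size, "%jd", (intmax_t) offset);
    if (n < 0 || (size_t) n >= size)
        return -1;
    return n;
}
```

A caller would pass something like a 32-byte buffer, which is more than
enough for any 64-bit value, and treat -1 as an error.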
> What confused me here is that this works fine with Intel 32-bit x86,
> at least on Mac OS X and Linux. It does not work at all with 32-bit
> PowerPC or 32-bit ARM.
>
> I'm fairly certain that the configure script for wget sets
> -D_LARGEFILE_SOURCE -D _FILE_OFFSET_BITS=64 (or whatever is necessary
> for the platform) unless --disable-largefile is specified.
>
> Again, the main reason I'm a bit confused here is because I can't
> trigger this issue on 32-bit Intel platforms.
Being fairly certain is not enough. Please make sure.
First, read the ftello man page; _FILE_OFFSET_BITS should be mentioned there.
Then give us the output of wget --version.
You could also print sizeof(off_t) when compiling with
-D_LARGEFILE_SOURCE -D _FILE_OFFSET_BITS=64.
Here is a little test program:
#include <stdio.h>
#include <sys/types.h>
int main(void) { printf("%zu\n", sizeof(off_t)); return 0; }
How we fix the off_t issue depends on the result.
> There's also another error that's less critical — asprintf returns -1
> if it fails, and the number of bytes printed if it succeeds — not 0 if
> it fails. This shouldn't cause the current problem though.
Yes, that is a bug and should be fixed.
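A minimal sketch of the corrected check, assuming a glibc-style asprintf
(a GNU/BSD extension, hence the _GNU_SOURCE define). The helper name
content_length_string is hypothetical and this is not the actual patch.

```c
#define _GNU_SOURCE   /* assumption: glibc, where asprintf needs this macro */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper, not wget code: returns a malloc'd decimal string
   for `length`, or NULL on failure. The caller frees the result. */
static char *content_length_string(off_t length)
{
    char *text = NULL;

    /* ftello reports errors as -1, so callers should check before this. */
    assert(length >= 0);
    /* asprintf returns -1 on failure and the number of bytes printed on
       success, so compare against -1 rather than testing for 0. */
    if (asprintf(&text, "%lld", (long long) length) == -1)
        return NULL;
    return text;
}
```

With the old "if (! asprintf (...))" test, a failure (-1) is never detected,
because !(-1) is 0.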
>
>
> --Dave
Tim