qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 3/4] raw-posix: Fix try_seek_hole()'s handlin


From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH v2 3/4] raw-posix: Fix try_seek_hole()'s handling of SEEK_DATA failure
Date: Thu, 13 Nov 2014 08:29:06 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

On 11/13/2014 07:52 AM, Eric Blake wrote:
> On 11/13/2014 06:03 AM, Kevin Wolf wrote:
>> Am 13.11.2014 um 11:17 hat Markus Armbruster geschrieben:
>>> When SEEK_HOLE tells us we're in a hole, we try SEEK_DATA to find its
>>> end.  When that fails, we pretend the hole extends to the end of file.
>>> Wrong.
>>
>> Wrong only in some cases, see below.
>>
>>> Except when SEEK_END fails, we screw up and claim it extends
>>> to offset -1.  More wrong.
> 
> 
>>> +++ b/block/raw-posix.c
>>> @@ -1494,8 +1494,9 @@ static int try_seek_hole(BlockDriverState *bs, off_t 
>>> start, off_t *data,
>>>      } else {
>>>          /* On a hole.  We need another syscall to find its end.  */
>>>          *data = lseek(s->fd, start, SEEK_DATA);
>>> -        if (*data == -1) {
>>> -            *data = lseek(s->fd, 0, SEEK_END);
>>> +        if (*data < 0) {
>>> +            /* no idea where the hole ends, give up (unlikely to happen) */
>>
>> Not quite unlikely. If the file ends with a sparse area, we'll get
>> -1/ENXIO here.
>>
>> lseek() with SEEK_DATA starting in a hole when there is no data until
>> EOF is actually the part that isn't documented in the man page, but
>> ENXIO is what I'm seeing here on RHEL 7.
> 
> Here's the (proposed) POSIX wording:
> 
> http://austingroupbugs.net/view.php?id=415
> 
> And ENXIO is indeed the expected error for SEEK_DATA on a trailing hole,
> so maybe we should special case it.
> 

Uggh.  Historical practice on Solaris (and therefore the POSIX wording)
says that SEEK_HOLE in a trailing hole is allowed (but not required) to
seek to EOF instead of reporting the offset requested.  I have no clue
why this was done, but it is VERY annoying - it means that if you
provide an offset within a tail hole of a file, you cannot reliably tell
if the file ends in a hole or with data, without ALSO trying SEEK_DATA.
 For applications that are reading a file sequentially but skipping over
holes, this behavior is fine (it short-circuits the hole/data search
points and might shave an iteration off a lop).  But for OUR purposes,
where we are merely trying to ascertain whether we are in a hole, we
have an inaccurate response - since SEEK_HOLE does NOT return the offset
we passed in, we are prone to treat the offset as belonging to data,
which is a pessimization (you never get wrong results by treating a hole
as data and reading it, but it is definitely slower).

I think you HAVE to call lseek() twice, both with SEEK_HOLE and with
SEEK_DATA, if you want to accurately determine whether an offset happens
to live within a trailing hole.

(By the way, I really wish Solaris had implemented a variant that
queried, but did NOT change the file offset - maybe Linux can add that
as an extension, and give it sane semantics of not special casing
trailing holes...)

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]