bug-coreutils

Re: [PATCH] sort: use posix_fadvise to announce access patterns on file


From: Joey Degges
Subject: Re: [PATCH] sort: use posix_fadvise to announce access patterns on files opened for reading
Date: Fri, 26 Feb 2010 21:26:17 -0800

2010/2/26 Pádraig Brady <address@hidden>

> Thanks for the patch.
> The speed up of 1s was what % as a matter of interest?
> I presume the drive was the bottleneck and the CPU was not at 100%?
>

Thank you for the comments.
Averaging a few runs shows a 2.92% improvement. Average CPU usage came out
to 61%. See the comments on WILLNEED below.

input file size: 477M
sort_fadvise: 58.43 s
sort_current: 60.19 s


>
>> +/* Announce the access patterns NOREUSE, SEQUENTIAL, and WILLNEED on the
>> +   descriptor FD. Ignore any errors -- this is only advisory.  */
>>
>> +
>> +static void
>> +xfadvise (int fd)
>> +{
>> +#if HAVE_POSIX_FADVISE
>> +  posix_fadvise (fd, 0, 0, POSIX_FADV_NOREUSE);
>> +  posix_fadvise (fd, 0, 0, POSIX_FADV_SEQUENTIAL);
>> +  posix_fadvise (fd, 0, 0, POSIX_FADV_WILLNEED);
>> +#endif
>> +}
>> +
>>
>
> http://lxr.linux.no/#linux+v2.6.33/mm/fadvise.c#L27
> On the latest linux kernel one can see from the above that...
>
> SEQUENTIAL doubles the read ahead size which seems useful because
> by definition sort needs to read all of its input before outputting
> anything.
>
> NOREUSE used to do the same as WILLNEED, but currently does nothing.
> I guess it could bump down the cache priority of the specified range.
>
> I'm not sure WILLNEED has any effect when passed a 0 len?
> It's mainly useful I'd say for apps specifying sections
> to read ahead which could be a large gain for mechanical
> disks and a subsequent random access pattern.
>

Thank you for the pointer (I was not sure where to find this
implementation). According to the posix_fadvise man page, len = 0 means the
advice extends to the end of the file, so with offset 0 it covers the entire
file. This appears to be carried out here:
http://lxr.linux.no/linux+v2.6.33/mm/fadvise.c#L100, where nrpages = ~0UL.
force_page_cache_readahead then trims that value down to something more
reasonable.

The man page also states that WILLNEED should initiate a non-blocking read,
but the current implementation blocks. While testing I found this to be
true: adding print statements before and after the call shows a long delay,
equivalent to the time it takes to cat the file. This makes me believe that
the entire file is being read ahead all at once.
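
A minimal sketch of that kind of measurement, assuming a POSIX system with
clock_gettime. The helper name time_fadvise is hypothetical and is not part
of the patch; it only wraps the call with two timestamps instead of print
statements.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200112L  /* for posix_fadvise and clock_gettime */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper, not part of the patch: apply ADVICE to the whole
   file behind FD (offset 0, len 0) and return the wall-clock seconds
   spent inside the call.  The posix_fadvise result (0 or an error
   number) is stored in *ERR.  Return -1.0 if the clock cannot be read.  */
static double
time_fadvise (int fd, int advice, int *err)
{
  struct timespec start, end;

  assert (err != NULL);
  if (clock_gettime (CLOCK_MONOTONIC, &start) != 0)
    return -1.0;
  *err = posix_fadvise (fd, 0, 0, advice);
  if (clock_gettime (CLOCK_MONOTONIC, &end) != 0)
    return -1.0;

  double elapsed = (double) (end.tv_sec - start.tv_sec)
                   + (double) (end.tv_nsec - start.tv_nsec) / 1e9;
  fprintf (stderr, "posix_fadvise (advice=%d): %.3f s, result %d\n",
           advice, elapsed, *err);
  return elapsed;
}
```

Calling time_fadvise (fd, POSIX_FADV_WILLNEED, &err) on a large uncached
file and comparing the result with the time cat takes on the same file
would show whether the call blocks for the whole read.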

Due to this massive read ahead, CPU usage is mostly 0% during the first ~25s
of execution and 100% during the later ~35s. In the end the average CPU usage
is 61%.

When only SEQUENTIAL is used, CPU usage hovers around 5-20% during the first
~25s and then finishes at 100%. Average CPU usage is 62%. The total execution
time maintains a 3% advantage over the current code.

It is interesting that reading and sorting the data chunk by chunk in a
streamlined fashion (SEQUENTIAL) is no faster than initially reading the
entire file from disk and then doing all of the sorting (WILLNEED).

Since the only difference appears to be in how CPU usage is distributed over
the run, which is more desirable for sort?


> It's interesting that the increased read ahead didn't
> improve anything for your mechanical disk. I suppose the
> default read ahead values are tuned well there already.
> Also it seems like your SSD benefits from the larger read ahead.
> This is echoed in this recent post:
> http://article.gmane.org/gmane.linux.kernel.mm/43753
> which suggests that we soon might get the speedup on SSDs
> without apps needing to specify SEQUENTIAL?
> See also: http://lkml.org/lkml/2010/2/3/27
>
>
The flash drive I was testing with is actually a USB stick.

It is unclear whether the read ahead improved anything for the mechanical
disk. The execution times varied so widely that any small improvement of
around 3% was drowned out. Maybe initial seek time or some other
hard-to-measure mechanical latency is the cause of the high variance? (I do
not know much about this topic.)



> So in summary, sequential access is handled quite well
> automatically by linux at least, but this might help.
> Drop the other 2 settings for now I think.


> Codewise you probably want to do:
> ignore_value (posix_fadvise (fd, 0, 0, POSIX_FADV_SEQUENTIAL));
>
> cheers,
> Pádraig.
>

Thanks, I will submit the next patch with ignore_value.
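
For illustration only, here is a sketch of what the trimmed-down helper
might look like with just SEQUENTIAL and the return value discarded. The
follow-up patch itself is not shown in this message, so this is an
assumption about its shape. IGNORE_VALUE is a local stand-in for gnulib's
ignore_value so that the sketch compiles without gnulib, HAVE_POSIX_FADVISE
is defined by hand here where a real build would get it from configure, and
open_for_sorting is a hypothetical caller.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200112L  /* for posix_fadvise and fileno */
#endif
#include <fcntl.h>
#include <stdio.h>

/* Assumption: a real build would get this from configure.  */
#ifndef HAVE_POSIX_FADVISE
# define HAVE_POSIX_FADVISE 1
#endif

/* Local stand-in for gnulib's ignore_value, used only in this sketch.  */
#define IGNORE_VALUE(x) ((void) (x))

/* Announce the access pattern SEQUENTIAL on the descriptor FD.
   Ignore any errors -- this is only advisory.  */
static void
xfadvise (int fd)
{
#if HAVE_POSIX_FADVISE
  IGNORE_VALUE (posix_fadvise (fd, 0, 0, POSIX_FADV_SEQUENTIAL));
#else
  (void) fd;
#endif
}

/* Hypothetical caller: open PATH for reading and announce the access
   pattern.  Return NULL if the file cannot be opened.  */
static FILE *
open_for_sorting (char const *path)
{
  FILE *fp = fopen (path, "r");
  if (fp)
    xfadvise (fileno (fp));
  return fp;
}
```

Because the advice is only a hint, a failed posix_fadvise call leaves the
stream usable, which is why the result can be dropped.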

Joey

