coreutils

Re: overly aggressive memory usage by sort.c


From: Jim Meyering
Subject: Re: overly aggressive memory usage by sort.c
Date: Mon, 11 Jun 2012 08:30:51 +0200

Pádraig Brady wrote:
> On 06/07/2012 09:12 PM, Jeff Janes wrote:
>> In commit a507ed6ede5064b8f15c979e54e6de3bb478d73e, first appearing in
>> v8.16, the default memory usage was changed to take all of available
>> memory, rather than half of it.  I think this is too aggressive.
>>
>> If I start a very large file sort on a previously idle server, it will
>> report almost all physical memory as being available, and so sort will
>> take all of it.  But as soon as the heavy IO (reading the data to be
>> sorted, writing temp files) starts up, the kernel needs more memory
>> for buffering in order to make the IO efficient.  The kernel and the
>> sort start competing for memory, a little bit of paging/swapping
>> starts, time in iowait increases, and the overall sort performance
>> drops by roughly a factor of 2.
>>
>> I don't know what the correct proportion of available memory to take
>> would be, but I think it is >0.5 and <1.0.  Maybe 0.75.  But I think
>> that just going back to 0.5 would be better than the status quo.  Or
>> perhaps the upper limit clamp could be based on physical memory
>> instead of available, so rather than:
>>
>> mem = MAX (avail, total / 8);
>>
>> maybe:
>>
>> mem = MIN (total / 4 * 3, MAX (avail, total / 8));
>
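
For readers comparing the two expressions quoted above, here is a minimal sketch in C that evaluates both side by side. The function names `default_mem_current` and `default_mem_capped` are hypothetical, and the use of `double` for `avail` and `total` is an assumption; only the two formulas themselves come from the quoted message.

```c
/* Sketch only: function names and the double type are assumptions,
   not taken from sort.c.  The two formulas are the ones quoted above.  */

#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Formula quoted as the current behaviour: take all available memory,
   but never less than one eighth of physical memory.  */
static double
default_mem_current (double avail, double total)
{
  return MAX (avail, total / 8);
}

/* Formula proposed in the quoted message: the same lower bound,
   additionally capped at three quarters of physical memory.  */
static double
default_mem_capped (double avail, double total)
{
  return MIN (total / 4 * 3, MAX (avail, total / 8));
}
```

With 16 units of physical memory and 15 reported as available (the idle-server case described above), the first function returns 15 and the second returns 12. When only 1 unit is available, both return the total / 8 floor of 2.
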
> I have to agree. In general patches like this shouldn't
> go in without extensive performance testing.
>
> The thread discussing the patch is here:
> http://bugs.gnu.org/10877
>
> There are other things we might consider with external files.
>
> - ensure they're written to disk rather than RAM,
> i.e., avoid /tmp if it's tmpfs, as is becoming more common on systems
>
> - use posix_fadvise, as is done in dd, to mark the external files
> as non-cacheable, as the only reason you'd be using them is when
> you don't have enough RAM anyway.
>
> The above considers a two-level memory hierarchy.
> Increasingly, though, the "memory wall" is an issue,
> so a three-level hierarchy involving increasingly large CPU caches
> should be considered.
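
The two suggestions about external files quoted above (avoid tmpfs, and advise the kernel not to cache the files) could be sketched roughly as follows. This is a Linux-specific illustration, not code from sort.c or dd: the helper names `dir_is_tmpfs` and `drop_cached_pages` are hypothetical, `statfs` with `TMPFS_MAGIC` is assumed to be an acceptable way to detect tmpfs, and `POSIX_FADV_DONTNEED` is assumed to be the relevant advice.

```c
/* Sketch only, Linux-specific.  Helper names are hypothetical.  */
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/vfs.h>    /* statfs (Linux) */

#ifndef TMPFS_MAGIC
# define TMPFS_MAGIC 0x01021994   /* value from <linux/magic.h> */
#endif

/* Return 1 if DIR resides on tmpfs, 0 if it does not, -1 on error.  */
static int
dir_is_tmpfs (const char *dir)
{
  struct statfs fs;
  if (statfs (dir, &fs) != 0)
    return -1;
  return (unsigned long) fs.f_type == TMPFS_MAGIC;
}

/* Flush FD's data to storage, then advise the kernel that its cached
   pages are no longer needed.  Dirty pages cannot be dropped, hence
   the fdatasync first.  Return 0 on success, else an errno value.  */
static int
drop_cached_pages (int fd)
{
  if (fdatasync (fd) != 0)
    return errno;
  /* Offset 0 with length 0 covers the whole file.  */
  return posix_fadvise (fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

A caller might check `dir_is_tmpfs` on each candidate temporary directory before use, and call `drop_cached_pages` after writing each temporary file. The advice is only a hint, so the kernel is free to ignore it.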

Hi Paul,

Any objection to reverting that and adjusting the comment to match?
