bug-coreutils

Re: sort: Parallel merging


From: Shaun Jackman
Subject: Re: sort: Parallel merging
Date: Wed, 17 Feb 2010 15:16:49 -0800

On Wed, 2010-02-17 at 14:57 -0800, Chen Guo wrote:
> > >     As for buffer size, I highly doubt using 8 MB, even if we're magically
> > > guaranteed to get 100% of the CPU cache, would work better than a larger
> > > buffer.
> > > 
> > >     The main reason would be for larger files, you'd have to repeatedly
> > > write temporary files out to disk, then merge those temporary files.
> > > Whatever time you save talking to cache is more than lost to the extra
> > > time talking to disk.
> > 
> > What if the temporary files were stored in RAM (i.e. tmpfs) rather than
> > on magnetic disk?
> 
> I think I'm misunderstanding what you're trying to say... But the file stored
> in RAM would be in a buffer. --buffer-size sets the size of this buffer, i.e.
> how much space in RAM you want to allocate to sort.

I'm suggesting three things together: set the buffer size to the size of
the CPU cache; give the sort process 100% CPU affinity, i.e. no other
processes allowed on that CPU, so that it has exclusive use of the data
cache; and mount the temporary directory in RAM (i.e. tmpfs) rather than
on magnetic disk.

sort --buffer-size=8M --temporary-directory=/dev/shm

If the merging is parallel, under these circumstances, is it possible
that --buffer-size=8M could be faster than a larger value?
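
One way to test this would be a small script along the following lines.
It is only a sketch under assumptions that are not from this thread: GNU
sort and awk are installed, the 64M comparison size, the one million line
input and the file names are arbitrary, and taskset (util-linux) is used
when present. Note that taskset only pins sort to CPU 0; it does not keep
other processes off that CPU, so it approximates the exclusive-cache
condition rather than guaranteeing it.

```shell
#!/bin/sh
# Sketch: time a cache-sized sort buffer against a larger one.
# Hypothetical choices: 64M comparison size, input size, file names.
set -eu
export LC_ALL=C

# Keep temporary files on tmpfs when /dev/shm is usable, else fall back.
if [ -d /dev/shm ] && [ -w /dev/shm ]; then
    work=$(mktemp -d /dev/shm/sortbench.XXXXXX)
else
    work=$(mktemp -d)
fi
trap 'rm -rf "$work"' EXIT

# Pin to CPU 0 if taskset exists and CPU 0 is permitted for this process.
if command -v taskset >/dev/null 2>&1 && taskset -c 0 true 2>/dev/null; then
    pin="taskset -c 0"
else
    pin=""
fi

# Generate reproducible pseudo-random integers, one per line.
awk 'BEGIN { srand(1); for (i = 0; i < 1000000; i++) print int(rand() * 1000000) }' \
    > "$work/input.txt"

# Milliseconds since the epoch; GNU date supports %N, otherwise
# fall back to whole-second resolution.
now_ms() {
    ns=$(date +%s%N)
    case $ns in
        *[!0-9]*) echo "$(( $(date +%s) * 1000 ))" ;;
        *) echo "$(( $ns / 1000000 ))" ;;
    esac
}

# $1 = buffer size, $2 = output file
run_sort() {
    start=$(now_ms)
    $pin sort -n --buffer-size="$1" --temporary-directory="$work" \
        -o "$2" "$work/input.txt"
    end=$(now_ms)
    echo "buffer-size=$1: $(( $end - $start )) ms"
}

run_sort 8M  "$work/out-8M.txt"
run_sort 64M "$work/out-64M.txt"

# Both runs must produce identical output; only the timing should differ.
sum_small=$(cksum < "$work/out-8M.txt")
sum_large=$(cksum < "$work/out-64M.txt")
if [ "$sum_small" = "$sum_large" ]; then
    echo "outputs match"
fi
```

A single run on a small input says very little; repeating each
measurement several times, and on inputs much larger than the buffer,
would be needed before drawing any conclusion.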

Cheers,
Shaun





