Aloha,
Congratulations on supporting parsort --parallel. I had been wondering about the high number of processes (with many files) until I read the 20230222 release notes. Now I understand.
First and foremost, mcesort is simply a parsort variant built on a mini-MCE parallel engine, which is integrated into the script itself. I reduced the MCE code to the essentials (less than 1,500 lines). The main application is 400 lines. Currently, mcesort is < 1,900 lines.
1) mcesort supports -A (sets LC_ALL=C) and -j, --parallel N, N%, or max (-j12, -j50%, -jmax).
2) Currently, mcesort does not allow -S, --buffer-size. In my testing, specifying -S or --buffer-size leads to higher memory consumption and degrades performance. Is -S, --buffer-size helpful from a parsort/mcesort perspective?
3) mcesort runs -z, --zero-terminated in parallel, unlike parsort, which uses one core in that mode.
4) mcesort accepts --check, -c, -C, --debug, and --merge [--batch-size]. When checking or merging sorted input, or debugging incorrect key usage, it simply passes the arguments through to sort, which runs serially. The exec replaces the mcesort process and never returns.
exec('sort', @ARGV) if $pass_through;
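To illustrate points 1 and 4, here is a minimal sketch in Python (mcesort itself is Perl, and this is not its code). The function names, the flag set, and the interpretation of N% and max as a share of the logical core count are my assumptions for illustration only.

```python
import os
import re

# Flags taken from point 4 above; the set name is hypothetical.
PASS_THROUGH_FLAGS = {"--check", "-c", "-C", "--debug", "--merge"}


def parse_jobs(spec: str, ncpu: int) -> int:
    """Turn a -j / --parallel value (N, N%, or max) into a worker count.

    Assumption: N% and max are relative to the logical core count `ncpu`.
    """
    if spec == "max":
        return ncpu
    m = re.fullmatch(r"(\d+)(%?)", spec)
    if not m:
        raise ValueError(f"invalid job count: {spec!r}")
    n = int(m.group(1))
    if m.group(2):  # percentage form, e.g. "50%"
        n = ncpu * n // 100
    return max(1, n)  # always run at least one worker


def wants_pass_through(argv: list[str]) -> bool:
    """Return True if any argument asks for a check, merge, or debug run.

    Simplification: bundled short options such as -cu are not recognized.
    """
    for arg in argv:
        if arg == "--":  # everything after "--" is a file name
            break
        name = arg.split("=", 1)[0]  # "--check=quiet" -> "--check"
        if name in PASS_THROUGH_FLAGS:
            return True
    return False


def run_serial_sort(argv: list[str]) -> None:
    """Replace the current process with sort, like Perl's exec; never returns."""
    os.execvp("sort", ["sort", *argv])
```

Under these assumptions, parse_jobs("50%", 64) gives 32, and wants_pass_through(["-c", "file"]) is true, at which point run_serial_sort would hand the original arguments to sort unchanged.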
Respectfully, I captured results using parsort 20230222 and mcesort (to be released soon).
#################################################################
~ List of Files (total: 6 * 92 = 552 files) 17 GB Size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
parsort
~~~~~~~
$ time LC_ALL=C parsort --parallel=64 -k1 \
/dev/shm/big* /dev/shm/big* /dev/shm/big* \
/dev/shm/big* /dev/shm/big* /dev/shm/big* | cksum
867518687 17463513600
1,109 processes created (brief system lockup)
physical memory consumption peak 7.79 GB
real 1m59.565s
user 1m27.735s
sys 0m22.013s
mcesort
~~~~~~~
$ time LC_ALL=C mcesort --parallel=64 -k1 \
/dev/shm/big* /dev/shm/big* /dev/shm/big* \
/dev/shm/big* /dev/shm/big* /dev/shm/big* | cksum
867518687 17463513600
128 processes created (no system lockup, fluid)
1 sort and 1 merge per worker
physical memory consumption peak 2.92 GB
real 1m57.209s
user 21m55.152s
sys 1m15.790s
#################################################################
~ Single File 17 GB
~ cat /dev/shm/big* >> /dev/shm/huge (6 times)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
parsort
~~~~~~~
$ time LC_ALL=C parsort --parallel=64 -k1 /dev/shm/huge | cksum
867518687 17463513600
128 processes created (no system lockup, fluid)
physical memory consumption peak 2.90 GB
real 2m11.056s
user 1m39.646s
sys 0m22.040s
mcesort
~~~~~~~
$ time LC_ALL=C mcesort --parallel=64 -k1 /dev/shm/huge | cksum
867518687 17463513600
128 processes created (no system lockup, fluid)
physical memory consumption peak 2.83 GB
real 1m53.255s
user 23m52.807s
sys 0m58.450s
#################################################################
~ Standard Input
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
parsort
~~~~~~~
$ time cat \
/dev/shm/big* /dev/shm/big* /dev/shm/big* \
/dev/shm/big* /dev/shm/big* /dev/shm/big* \
| LC_ALL=C parsort --parallel=64 -k1 | cksum
867518687 17463513600
193 processes created (no system lockup, fluid)
physical memory consumption peak 3.05 GB
real 2m18.442s
user 1m39.051s
sys 0m27.548s
mcesort
~~~~~~~
$ time cat \
/dev/shm/big* /dev/shm/big* /dev/shm/big* \
/dev/shm/big* /dev/shm/big* /dev/shm/big* \
| LC_ALL=C mcesort --parallel=64 -k1 | cksum
867518687 17463513600
128 processes created (no system lockup, fluid)
physical memory consumption peak 2.75 GB
real 1m57.487s
user 22m16.476s
sys 1m15.481s
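
A rough way to read the timings above is the ratio (user + sys) / real, which approximates the average number of busy cores as seen by time. The sketch below is in Python and the helper names are hypothetical. Note that the shell's time only accounts for CPU time of child processes that were waited for, so the ratio can understate the real load.

```python
import re


def parse_duration(text: str) -> float:
    """Convert a bash `time` value such as '1m57.209s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if not m:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def cpu_to_wall_ratio(real: str, user: str, sys_: str) -> float:
    """Return (user + sys) / real for one `time` report."""
    cpu_seconds = parse_duration(user) + parse_duration(sys_)
    return cpu_seconds / parse_duration(real)


# Figures copied from the "List of Files" mcesort run above.
ratio = cpu_to_wall_ratio("1m57.209s", "21m55.152s", "1m15.790s")
print(f"{ratio:.1f}")
```

For the mcesort list-of-files run this comes to roughly 11.9; the same call can be repeated with the figures of any other run above.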