
Re: parallel: This should not happen. You have found a bug.


From: Linda Walsh
Subject: Re: parallel: This should not happen. You have found a bug.
Date: Mon, 12 Aug 2013 23:54:38 -0700
User-agent: Thunderbird



Ole Tange wrote:
On Mon, Aug 12, 2013 at 1:05 AM, Linda Walsh <gnu@tlinx.org> wrote:

When run in --semaphore mode GNU Parallel uses hard links to create a
counting semaphore. If your filesystem does not support hard links the
behaviour is undefined - and that is likely what you see.
NTFS supports hard links.  However, file accesses are infinitely slower than
semaphores in memory.
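
As an illustration of the hard-link counting semaphore described in the quoted text above, here is a minimal sketch in Python. It is not GNU Parallel's actual code (which is written in Perl and is not shown in this message); the class name, the slot file names and the directory layout are all hypothetical. It relies only on the fact that creating a hard link fails atomically when the target name already exists.

```python
import os
import tempfile


class HardLinkSemaphore:
    """Counting semaphore with `count` slots, backed by hard links.

    Hypothetical layout: each holder owns a private file in `directory`
    and claims a slot by hard-linking that file to slot-0 .. slot-(count-1).
    """

    def __init__(self, directory, count):
        self.directory = directory
        self.count = count
        os.makedirs(directory, exist_ok=True)
        # Private file for this holder; the slots are links to it.
        fd, self.own_file = tempfile.mkstemp(prefix="holder-", dir=directory)
        os.close(fd)
        self.slot = None

    def try_acquire(self):
        """Try every slot once; return True if one was claimed."""
        for i in range(self.count):
            slot = os.path.join(self.directory, f"slot-{i}")
            try:
                # Fails with FileExistsError if another holder has the slot.
                os.link(self.own_file, slot)
            except FileExistsError:
                continue
            self.slot = slot
            return True
        return False

    def release(self):
        """Give the slot back by removing the link."""
        if self.slot is not None:
            os.unlink(self.slot)
            self.slot = None

    def close(self):
        """Release any held slot and remove the private file."""
        self.release()
        os.unlink(self.own_file)
```

Every acquire and release is a filesystem operation, which is the cost being discussed here; on a filesystem without hard links `os.link` raises a different `OSError`, which this sketch deliberately lets propagate.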

Ideally, the program could test to see if the system has
semaphores (they work on Cygwin, Linux and any POSIX-compatible
system) using the POSIX::RT::Semaphore package.

It has a heavy downside: It seems not to be installed by default on:

  aix centos debian dragonfly freebsd hpux hurd irix mandriva minix miros
  netbsd openbsd openindiana qnx raspberrypi redhat scosysv solaris suse
  tru64 ubuntu ultrix unixware
---
But the Perl language is implemented on those platforms and has
semctl, semget and semop built in -- it's just that POSIX::RT::Semaphore is
supposed to be more portable.  But if you are limited to Perl... you could
use the builtin primitives.

The program I use my semaphores with is on SUSE, and the above routine
theoretically works as well as or better than the Perl builtins on the
platforms you mention.

But the idea, really, is not to replace the file semaphore entirely.
It's a reliable backstop for when more advanced methods are unavailable.

The idea would be to *try* to load more advanced methods -- either the
above lib routines from CPAN, or the builtins.  If they fail, fall back
to file methods.  Similarly, mailbox access prefers advanced locking,
but if that is not available, it falls back to mkdir methods.  TCP prefers
large window sizes and packets, but if they are not available it falls
back to lower-performance methods ...

In each case, try the more advanced routines, and gracefully degrade if the facilities are not available.
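
The try-then-degrade idea can be sketched as a small probe that walks a preference list and keeps the first facility that loads. This is only an illustration in Python, not anything from GNU Parallel: `pick_backend` and the label "hardlink-files" are hypothetical names, and `posix_ipc` is a third-party Python package that may or may not be installed. In Perl the analogous probe would wrap `require` in `eval`.

```python
import importlib


def pick_backend(candidates, fallback):
    """Return the name of the first importable module, else `fallback`."""
    for module_name in candidates:
        try:
            importlib.import_module(module_name)
        except ImportError:
            # Facility not available on this system: try the next one.
            continue
        return module_name
    return fallback


# Prefer an in-memory semaphore library if present; otherwise degrade
# to the file-based method, which is assumed to always be available.
backend = pick_backend(["posix_ipc"], fallback="hardlink-files")
print("semaphore backend:", backend)
```

The point of the design is that the file-based method stays in the program as the last entry, so nothing extra has to be installed for it to work.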

To start at the lowest common denominator would penalize the majority whose systems offer more than the lowest common denominator.
GNU Parallel has a design goal to not require extra packages
installed. Installation should, even on older systems, simply be: copy
the file 'parallel' to the path.
===

   But does GNU Parallel have a design goal not to use more advanced
facilities if they are available?   I know of no other GNU util that
takes that route.   GNU optimizes to the platform -- and falls back to
the lowest common denominator when it has to, but never as a first choice.


GNU Parallel is designed to work on multiple computers at the same
time if $HOME/.parallel is shared between the computers. So sem should
block on computerA if computerB has the semaphore (e.g. if they both
try to access the same file on an NFS based dir). A semaphore that is
only respected by computerA is thus not backwards compatible with the
current design.
   Most computers support semaphores.  They go back to the beginnings of
Perl and are built into it.  The newer methods were designed to be more
portable.
   Mail has always said it is a bad idea to rely on locking on NFS because
it is so slow AND has generally been unreliable.  An option could force the
use of shared-network based locking, but to use that by default, at the
speed penalties...  I've used it on Linux and never noticed speed problems,
but everything is more difficult on Windows. ;-)

You know what's funny -- I went to semaphores in the code I sent after
I saw "sem" and thought it used semaphores... and thought a counting
semaphore was a more reliable solution than the child process counting
I used before to control concurrency (it's not really -- it was more
overhead).

Compared to the other 24 platforms Windows in general has more quirks
and has always been supported in GNU Parallel as a nice-to-have
platform - not a need-to-have. Not having ssh access to a
Windows/Cygwin environment also makes development for that platform
harder.

Compromising the design goal on at least 24 platforms is not worth it
to me to get --semaphore to work better on Windows.
Yeah, using a design for the lowest common denominator, as those 24
platforms seem to require, is certainly easier.

Having something that adopts multiple methods like most other GNU products
would be more difficult.  You make it sound like those 24 platforms are
somehow more weighty than Cygwin on Windows -- when the market of Cygwin
on Windows is growing and is recommended by MS in the Win8 world to support
POSIX apps on Windows....   As for the 24 platforms... it's really
going to depend on what filesystem they are running.   You work equally
well on all file systems on them?   Or do you have partial support?
CIFS is likely more commonly used than NFS on linux systems... it runs
faster.  Dunno about the others.

If we can find a way that will work fast on Windows/Cygwin without
compromising the 1-file-install design goal and the shared $HOME
semaphore, I will definitely be interested.
Hey, I like 1 file installs... you should see the stuff I've gone through
on mailing lists and perlmonks for designing many of my progs to be 1
file with multiple modules/packages/file, but that doesn't mean you can't
include what you need in 1 file... ;-)

Anyway, the semxxx stuff (semget, semop, semctl) is built into Perl, so no libs are required for that.




