[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: feature request: gzip/bzip support for sort

From: Paul Eggert
Subject: Re: feature request: gzip/bzip support for sort
Date: Mon, 15 Jan 2007 22:24:29 -0800
User-agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux)

Thanks very much for looking into that.  Some comments:

Dan Hipschman <address@hidden> writes:

> +  /* This is so the child process won't delete our temp files
> +     if it receives a signal before exec-ing.  */
> +  sigprocmask (SIG_BLOCK, &caught_signals, &oldset);
> +  saved_temphead = temphead;
> +  temphead = NULL;
> +
> +  if ((pid = fork ()) < 0)
> +    error (SORT_FAILURE, errno, _("couldn't fork"));
> +  else if (0 < pid)
> +    {
> +      temphead = saved_temphead;
> +      sigprocmask (SIG_SETMASK, &oldset, NULL);
> +    }
> +  else
> +    {
> +      sigprocmask (SIG_SETMASK, &oldset, NULL);

There is a path out of code that does not restore the signal
mask, which sounds dangerous.  The usual pattern looks
more like this:

  sigprocmask (SIG_BLOCK, &caught_signals, &oldset);
  unlink_status = unlink (name);
  unlink_errno = errno;
  *pnode = next;
  sigprocmask (SIG_SETMASK, &oldset, NULL);

The critical section is as small as possible, and it always restores
the signal mask.  Any tests for whether the critical section succeeded
or not are deferred until after the critical section is over.

> +      close_or_die (STDIN_FILENO, _("standard input"));
> +      close_or_die (STDOUT_FILENO, _("standard output"));

I don't see why we care whether these 'close' calls fail.
The file descriptors are junk at this point.  In general, we
shouldn't test for 'close' failures unless we actually care
about whether the file's I/O went well.  There are other
places where this is an issue.

> +      if ((node->pid = pipe_fork (pipefds)))

These days the usual style for 'if' is:

     node->pid = pipe_fork (pipefds);
     if (node->pid)

> +      pid_t *pids = xnmalloc (nfiles, sizeof *pids);
> +      memset (pids, 0, nfiles * sizeof *pids);

This looks like a case for xcalloc.

> +  while (reap (-1))
> +    continue;

This one has me worried a bit, but perhaps I'm worrying too much.

More thoughts:

I'm not that worried about signal handling being propagated to the
child compress/decompress processes.  Let them worry about their own
signal handling and cleanup.  They shouldn't run for long if we exit.
However, we shouldn't block or mask out signals for them; they should
inherit the same signal mask etc. that 'sort' does.

coreutils doesn't invoke pipe() anywhere else, and it rarely invokes
fork() so we have to be prepared for the usual portability gotchas in
this area.  Could you please look at the gnulib pipe module and see
whether it solves any problems here?  Perhaps coreutils can use it
directly, or add options to it to make it usable.  It does reflect a
lot of implementation wisdom.  Even if we don't use the pipe module we
should at least be aware of the problems it fixes.

vfork is more efficient than fork on many platforms, particularly when
the parent process is large (which can easily be the case here).  It
has some porting issues, though (as you can see if you look at diff3
and/or sdiff).

We'll need documentation, of course (coreutils.texi and NEWS).  The
documentation should reserve space and backslash in the compress
program name, for future use.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]