Re: Work around BSD `join` bug


From: Bruno Haible
Subject: Re: Work around BSD `join` bug
Date: Thu, 11 Jan 2024 12:43:07 +0100

Paul Eggert wrote:
> >    - POSIX violation or not? Is it valid to pass lines with missing fields
> >      to 'join', according to POSIX [1]?
> 
> It should be valid, yes. POSIX 'join' defers to POSIX 'sort' for the 
> definition of fields, and POSIX 'sort' says missing fields should be 
> treated as empty.

Thanks for explaining. Also, POSIX [1] says:
  "Some historical implementations have been encountered where a blank line
   in one of the input files was considered to be the end of the file; the
   description in this volume of POSIX.1-2017 does not cite this as an
   allowable case."

> >> Then, would it make sense to document it in the GNU Autoconf manual? [2]
> 
> Sure, I installed the attached patch to the Autoconf manual.

Thanks!

I see that macOS 12.6, FreeBSD 14.0, and NetBSD 9.3 have the bug, whereas
OpenBSD does not (and has not since at least OpenBSD 3.8, released in
2005).
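
For anyone who wants to check a given system by hand, here is a minimal
standalone sketch of the same probe that the patch below uses. The function
name join_handles_blank_lines and the use of 'mktemp -d' are assumptions
made for this illustration; gnulib-tool itself uses its own "$tmp" directory.

```shell
# Hypothetical helper (not part of gnulib-tool): returns 0 if the 'join'
# found in PATH copes with an empty line in either input file, 1 if it
# does not, 2 if no temporary directory could be created.
# Assumes 'mktemp -d' is available.
join_handles_blank_lines () {
  dir=$(mktemp -d) || return 2
  # First input: a single line "a".
  echo a > "$dir/in1"
  # Second input: an empty line (missing join field), then "a".
  { echo; echo a; } > "$dir/in2"
  # A working 'join' outputs the line "a" regardless of argument order.
  if LC_ALL=C join "$dir/in1" "$dir/in2" | grep a >/dev/null \
     && LC_ALL=C join "$dir/in2" "$dir/in1" | grep a >/dev/null; then
    status=0
  else
    status=1
  fi
  rm -rf "$dir"
  return $status
}

if join_handles_blank_lines; then
  echo "join: ok"
else
  echo "join: buggy"
fi
```

Both argument orders are tried, so the empty line is exercised in the first
and in the second input file.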

Now, back to gnulib-tool. I'm committing the patch below, which rejects
a broken 'join' program.

It would be possible to obey a variable named JOIN, via "${JOIN-join}"
instead of 'join'. But that adds complexity, and we don't have a variable
named SED in gnulib-tool either.
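
Just to illustrate what that alternative would look like (it is not what
the patch does), here is a rough sketch. The variable name join_prog is
made up for the example.

```shell
# Hypothetical sketch: honour a JOIN environment variable.
# "${JOIN-join}" expands to the value of JOIN if it is set (even if set
# to the empty string), and to the literal word "join" otherwise.
join_prog=${JOIN-join}
if (type "$join_prog") >/dev/null 2>&1; then
  echo "using join program: $join_prog"
else
  echo "'$join_prog' program not found" >&2
fi
```

Every later invocation of 'join' in the script would then have to go through
"$join_prog", which is the added complexity mentioned above.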

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/join.html


2024-01-11  Bruno Haible  <bruno@clisp.org>

        gnulib-tool: Reject broken 'join' program as seen in macOS, FreeBSD etc.
        Reported by Avinash Sonawane <rootkea@gmail.com> in
        <https://lists.gnu.org/archive/html/bug-gnulib/2024-01/msg00028.html>.
        * gnulib-tool: Move the func_gnulib_dir and func_tmpdir invocations
        ahead. If the 'join' program exists but does not handle missing fields,
        bail out.

diff --git a/gnulib-tool b/gnulib-tool
index b909a81f7a..9facfd2be7 100755
--- a/gnulib-tool
+++ b/gnulib-tool
@@ -894,15 +894,6 @@ func_hardlink ()
   }
 }
 
-# The 'join' program does not exist on all platforms.  Where it exists,
-# we can use it.  Where not, bail out.
-if (type join) >/dev/null 2>&1; then
-  :
-else
-  echo "$progname: 'join' program not found. Consider installing GNU coreutils." >&2
-  func_exit 1
-fi
-
 # Ensure an 'echo' command that
 #   1. does not interpret backslashes and
 #   2. does not print an error message "broken pipe" when writing into a pipe
@@ -1071,6 +1062,38 @@ if test "X$1" = "X--no-reexec"; then
   shift
 fi
 
+func_gnulib_dir
+func_tmpdir
+trap 'exit_status=$?
+      if test "$signal" != EXIT; then
+        echo "caught signal SIG$signal" >&2
+      fi
+      rm -rf "$tmp"
+      exit $exit_status' EXIT
+for signal in HUP INT QUIT PIPE TERM; do
+  trap '{ signal='$signal'; func_exit 1; }' $signal
+done
+signal=EXIT
+
+# The 'join' program does not exist on all platforms, and
+# on macOS 12.6, FreeBSD 14.0, NetBSD 9.3 it is buggy, see
+# <https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232405>.
+# In these cases, bail out. Otherwise, we can use it.
+if (type join) >/dev/null 2>&1; then
+  echo a > "$tmp"/join-input-1
+  { echo; echo a; } > "$tmp"/join-input-2
+  if LC_ALL=C join "$tmp"/join-input-1 "$tmp"/join-input-2 | grep a >/dev/null \
+     && LC_ALL=C join "$tmp"/join-input-2 "$tmp"/join-input-1 | grep a >/dev/null; then
+    :
+  else
+    echo "$progname: 'join' program is buggy. Consider installing GNU coreutils." >&2
+    func_exit 1
+  fi
+else
+  echo "$progname: 'join' program not found. Consider installing GNU coreutils." >&2
+  func_exit 1
+fi
+
 # Unset CDPATH.  Otherwise, output from 'cd dir' can surprise callers.
 (unset CDPATH) >/dev/null 2>&1 && unset CDPATH
 
@@ -1690,19 +1713,6 @@ func_determine_path_separator
   esac
 }
 
-func_gnulib_dir
-func_tmpdir
-trap 'exit_status=$?
-      if test "$signal" != EXIT; then
-        echo "caught signal SIG$signal" >&2
-      fi
-      rm -rf "$tmp"
-      exit $exit_status' EXIT
-for signal in HUP INT QUIT PIPE TERM; do
-  trap '{ signal='$signal'; func_exit 1; }' $signal
-done
-signal=EXIT
-
 # Note: The 'eval' silences stderr output in dash.
 if (declare -A x && { x[f/2]='foo'; x[f/3]='bar'; eval test '${x[f/2]}' = foo; }) 2>/dev/null; then
   # Zsh 4 and Bash 4 have associative arrays.
