Re: a saner bootstrap script [Was Re: I fixed this once already]

bug-gnulib
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: a saner bootstrap script [Was Re: I fixed this once already]

From:	Stefano Lattarini
Subject:	Re: a saner bootstrap script [Was Re: I fixed this once already]
Date:	Sun, 25 Sep 2011 17:55:45 +0200
User-agent:	KMail/1.13.7 (Linux/2.6.30-2-686; KDE/4.6.5; i686; ; )
On Thursday 22 September 2011, Gary V wrote:
> Anyone:
> 
> It's beginning to look as though all this work is, once again, in very
> real danger of just slipping quietly through the cracks.
> 
Hi Gary.  While I don't pretend to really understand the scope and purposes
of your script, I might anyway throw in some few observations, ideas and
nits; hopefully they will be helpful.

> #! /bin/sh
> # Print a version string.
> scriptversion=2011-08-14.19; # UTC
> 
> # Bootstrap this package from checked-out sources.
> # Written by Gary V. Vaughan, 2010
> 
> # Copyright (C) 2010 Free Software Foundation, Inc.
>
2011 too now :-)

>
> [SNIP]
>
> # Please report bugs or propose patches to address@hidden
>
What about copying here the autoconf-provided code fragment to improve
the shell bourne-compatibility?

 # Be more Bourne compatible
 DUALCASE=1; export DUALCASE # for MKS sh
 if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then :
   emulate sh
   NULLCMD=:
   # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which
   # is contrary to our usage.  Disable this feature.
   alias -g '${1+"$@"}'='"$@"'
   setopt NO_GLOB_SUBST
 else
   case `(set -o) 2>/dev/null` in *posix*) set -o posix ;; esac
 fi

Or maybe this is too paranoid, and of no real practical importance?

> : "${CONFIG_SHELL=/bin/sh}"
>
Defined but not used (nor exported).

> ## -------------------- ##
> ## Shell normalisation. ##
> ## -------------------- ##
> 
> # NLS nuisances.
> LANGUAGE=C
> export LANGUAGE
>
> # Ensure file names are sorted consistently across platforms.
> LC_ALL=C
> export LC_ALL
> 
> # CDPATH.
> (unset CDPATH) >/dev/null 2>&1 && unset CDPATH
>
If you are not using `set -e`, you could just call "unset CDPATH" here IMHO.

> # func_append VAR VALUE
> # ---------------------
> # Append VALUE onto the existing contents of VAR.
> if (eval 'x=a; x+=" b"; test "x$x" = "xa b"') 2>/dev/null
> then
>   # This is an XSI compatible shell, allowing a faster implementation...
>   eval 'func_append ()
>   {
>     $debug_cmd
> 
>     eval "$1+=\$2"
>   }'
> else
>   # ...otherwise fall back to using expr, which is often a shell builtin.
>   func_append ()
>   {
>     $debug_cmd
> 
>     eval "$1=\$$1\$2"
>   }
> fi
>
Why the empty line after '$debug_cmd'?  IMHO it does not improve readability,
but only wastes vertcal space.  Ditto for other "one-liner" functions
below (e.g., 'func_hookable' and 'func_remove_hook'), but *not* for longer
functions (e.g., 'func_add_hook' and 'func_run_hooks')

> # func_add_hook FUNC_NAME HOOK_FUNC
> # ---------------------------------
> # Request that FUNC_NAME call HOOK_FUNC before it returns.  FUNC_NAME must
> # first have been declared "hookable" by a coll to `func_hookable'.
> func_add_hook ()
> {
>     $debug_cmd
> 
>     case " $hookable_funcs " in
>       *" $1 "*) ;;
>       *) func_fatal_error "Error: \`$1' does not accept hook functions." ;;
>
It is my understanding that GNU software usually preferes to avoid gratuitous
capitalization in the error messages.  More instances throughout the whole
script.

> # func_run_hooks FUNC_NAME [ARG]...
> # ---------------------------------
> # Run all hook functions registered to FUNC_NAME.
> func_run_hooks ()
> {
>     $debug_cmd
> 
>     case " $hookable_funcs " in
>       *" $1 "*) ;;
>       *) func_fatal_error "Error: \`$1' does not support hook funcions.n" ;;
>     esac
>
Code duplication with 'func_add_hook' above, and with many other functions
throughout the script.  Couldn't the logic to determine whether an item is
already present in a space-separated list be factored out in its own
subroutine?  E.g.,

  # Usage: is_item_in_list ITEM [LIST ..]
  is_item_in_list ()
  {
    item=$1; shift;
    case " $* " in *" $item "*) return 0;; *) return 1;; esac
  }

Such a refactoring could as well be done in a follow-up patch, though
(in fact, I'd personally prefer to do it that way).
 
>     eval hook_funcs="\$$1_hooks"
> 
>     # shift away the first argument (FUNC_NAME)
>     shift
>     func_run_hooks_result=${1+"$@"}
>
>     for hook_func in $hook_funcs; do
>       eval $hook_func '"$@"'
>
Useless use of eval I think, since $hook_func is not expected to contain
spaces nor metacharacters.  Simply using:
  $hook_func "$@"
should be enough.

>       # store returned options list back into positional
>       # parameters for next `cmd' execution.
>       eval set dummy "$func_run_hooks_result"; shift
>
This (together with the `func_run_hooks_result' assignement above) will
not proprly preserve spaces in positional arguments:

  $ bash -c '(set "x  y"; a="$@"; eval set dummy "$a"; shift; echo "$@")'
  x y

I don't know whether this matters or not, but is something to consider.

> # func_update_translations
> # ------------------------
> # Update package po files and translations.
> func_hookable func_update_translations
> func_update_translations ()
> {
>     $debug_cmd
> 
>     $opt_skip_po || {
>       test -d po && {
>         $require_package
> 
>         func_update_po_files po $package || exit $?
>       }
> 
>       func_run_hooks func_update_translations
>     }
> }
>
I personally find the code flow here quite unclear.  Something like
this would be clearer IMHO:

  $opt_skip_po && return
  if test -d po; then
    $require_package
    func_update_po_files po $package || exit $?
  fi
  func_run_hooks func_update_translations

The same goes for similar usages throughout the script.

> # func_gnulib_tool
> # ----------------
> # Run `gnulib-tool' to fetch gnulib modules into the current package.
> # To avoid conflics between `autopoint', `libtoolize' and `gnulib-tool',
> # this default implementation first tricks `gnulib-tool' into copying
> # the modules into a temporary directory, and then mangles the contents
> # as they copied to their final destination.
> #
> # It's assumed that since you are using gnulib's `bootstrap' script,
> # you're also using gnulib elsewhere in your package.  If not, then
> # replace this function in `bootstrap.conf' with:
> #
> #   func_gnulib_tool () { :; }
> #
> func_hookable func_gnulib_tool
> func_gnulib_tool ()
> {
>     $debug_cmd
> 
>     $require_gnulib_tool
>     $require_libtoolize
> 
>     test true = "$gnulib_tool" || {
>       if test -n "$gnulib_modules"; then
>
Again, a minor style nit: I'd prefer the use of `if' over `||' and `&&'
when the body of the conditional action is not a simple one-liner; for
example, here you should IMHO do:

   if test x"$gnulib_tool" != x"true"; then
    ...
   fi
   func_run_hooks func_gnulib_tool
   ...

This would also be more consistent with the syle used by the gnulib-tool
script.

The same applies to similar usages throughout the script, for example ...

> # func_libtoolize
> # ---------------
> # If this package uses libtool, then run 'libtoolize'.
> func_libtoolize ()
> {
>     $debug_cmd
> 
>     $require_libtoolize
> 
>     test true = "$LIBTOOLIZE" || {
>       my_libtoolize_options=
>       $opt_copy && func_append my_libtoolize_options " --copy"
>       $opt_force && func_append my_libtoolize_options " --force"
>       $opt_verbose || func_append my_libtoolize_options " --quiet"
>       func_show_eval "$LIBTOOLIZE$my_libtoolize_options" 'exit $?'
>     }
> }
>
... here ...

> # func_install_gnulib_non_module_files
> # ------------------------------------
> # Get additional non-module files from gnulib, overriding existing files.
> func_install_gnulib_non_module_files ()
> {
>     $debug_cmd
> 
>     $require_build_aux
>     $require_gnulib_path
> 
>     test -n "$gnulib_non_module_files" && {
>       if test -n "$gnulib_path"; then
>         maybe_exit_cmd=:
>
... and here ...  And so on.
 
> # require_autobuild_buildreq
> # --------------------------
> # Try to find whether the bootstrap requires autobuild.
> require_autobuild_buildreq=func_require_autobuild_buildreq
> func_require_autobuild_buildreq ()
> {
>     $debug_cmd
> 
>     printf "$buildreq"| func_grep_q '^[        ]*autobuild' || {
>
Missing space before `|` here (cosmetic-only issue).  Also, this use of
printf can cause portability problems with grep, in case the string
"$buildreq" is not newline-terminated.  I would suggest to use this instead:
  
  printf '%s\n' "$buildreq" | func_grep_q '^[    ]*autobuild'

> # require_automake_buildreq
> # -------------------------
> # Try to find the minimum compatible version of automake required to
> # bootstrap successfully, and add it to `$buildreq'.
> require_automake_buildreq=func_require_automake_buildreq
> func_require_automake_buildreq ()
> {
>     $debug_cmd
> 
>     # if automake is not already listed in $buildreq...
>     printf "$buildreq"| func_grep_q automake || {
>       func_extract_trace AM_INIT_AUTOMAKE
> 
>       # ...and AM_INIT_AUTOMAKE is declared...
>       test -n "$func_extract_trace_result" && {
>         automake_version=`echo "$func_extract_trace_result" \
>            |$SED 's|[^0-9]*||; s| .*$||'`
>
Missing space between '|' and '$SED' (cosmetic-only issue).
  
> # require_gnulib_submodule
> # ------------------------
> # Ensure that there is a current gnulib submodule at `$gnulib_path'.
> require_gnulib_submodule=func_require_gnulib_submodule
> func_require_gnulib_submodule ()
> {
>     $debug_cmd
> 
> [SNIP]
>
>       trap func_cleanup_gnulib 1 2 13 15
> 
>       shallow=
>       $GIT clone -h 2>&1 |func_grep_q -- --depth \
>           && shallow='--depth 2'
> 
>       func_show_eval "$GIT clone $shallow '$gnulib_url' '$gnulib_path'" \
>           func_cleanup_gnulib
> 
>       trap - 1 2 13 15
>
Not portable to at least Solaris 10 /bin/sh; quoting the Autoconf manual:
  
  Posix says that "trap - 1 2 13 15" resets the traps for the specified signals 
to
  their default values, but many common shells (e.g., Solaris /bin/sh) 
misinterpret
  this and attempt to execute a "command" named '-' when the specified 
conditions
  arise. Posix 2008 also added a requirement to support "trap 1 2 13 15" to 
reset
  traps, as this is supported by a larger set of shells, but there are still 
shells
  like dash that mistakenly try to execute 1 instead of resetting the traps.
  Therefore, there is no portable workaround, except for "trap - 0", for which
  "trap '' 0" is a portable substitute.

> # require_libtoolize
> # ------------------
> # Skip libtoolize if it's not needed.
> require_libtoolize=func_require_libtoolize
> func_require_libtoolize ()
> {
>     $debug_cmd
> 
>     # Unless we're not searching for libtool use by this package, set
>     # LIBTOOLIZE to true if none of `LT_INIT', `AC_PROG_LIBTOOL' and
>     # `AM_PROG_LIBTOOL' are used in configure.
>     test true = "$LIBTOOLIZE" || {
>       func_extract_trace LT_INIT
>       test -n "$func_extract_trace_result"||func_extract_trace AC_PROG_LIBTOOL
>       test -n "$func_extract_trace_result"||func_extract_trace AM_PROG_LIBTOOL
>       test -n "$func_extract_trace_result"||LIBTOOLIZE=true
>
Missing various spaces here.

> func_require_package_version ()
> {
>     $debug_cmd
> 
>     func_extract_trace AC_INIT
> 
>     save_ifs="$IFS"
>
Useless use of quotes.

>     IFS=:
>     set dummy $func_extract_trace_result
>     IFS="$save_ifs"
>
Ditto.

And many other similar usages throughout the file.

> # func_unset VAR
> # --------------
> # Portably unset VAR.
> func_unset ()
> {
>     { eval $1=; unset $1; }
> }
> unset=func_unset
> 
What is the point of this function, if boostrap does not run under
'set -e'?  Using a simple 'unset' should be good enough, since even
in shells where it returns a non-zero exit status for already-unset
variables, it does not display any error message (at leat not that
I know of).  Or are you just "erring on the side of safety", and
deliberately?

> func_cmp_s ()
> {
>     $CMP "$@" >/dev/null 2>&1
> }
> func_grep_q ()
> {
>     $GREP "$@" >/dev/null 2>&1
>
Why redirecting also the stderr of grep and cmp to /dev/null?  This will
hide potential problems due to e.g., incorrect regular expressions or
non-existent files ...

> # func_strpad STR WIDTH CHAR
> # --------------------------
> # Trim STR, or pad with CHAR to force a total length of WIDTH.
> func_strpad ()
> {
>     $debug_cmd
> 
>     my_width=`expr "$2" - 1`
>     func_strpad_result=`echo "$1" |$SED '...
>
Missing space between '|' and '$SED' (cosmetic only).

Other similar blunders throughout the script.
 
> # func_version
> # ------------
> # Echo version message to standard output and exit.
> func_version ()
> {
>     $debug_cmd
> 
>     cat <<EOF
> $progname $scriptversion
> EOF
>
printf '%s\n' "$progname $scriptversion" ?

Also, would it be worth to try to catch potential write errors here, as
you do in 'func_help' and 'func_usage' (below)?

>     $SED -n '/(C)/!b go
>         :more
>         /\./!{
>           N
>           s|\n# | |
>           b more
>         }
>         :go
>         /^# Written by /,/# warranty; / {
>           s|^# ||
>           s|^# *$||
>           s|\((C)\)[ 0-9,-]*[ ,-]\([1-9][0-9]* \)|\1 \2|
>           p
>         }
>         /^# Written by / {
>           s|^# ||
>           p
>         }' < "$progpath"
> 
>     exit $?
> }
>
Is this complexity really warranted by a function whose purpose is
simply to print a version message?
 
> # func_usage [NOEXIT]
> # -------------------
> # Echo short help message to standard output and exit.
> func_usage ()
> {
>     $debug_cmd
> 
>     echo "$progpath [OPTION]..."
>     echo ""
>     echo "$usage_message"
> 
>     my_ret=$?
>     test -n "$1" || {
>       echo "Run \`$progname --help | more' for full usage"
>       exit $my_ret
>     }
> }
>
What's the point of '$my_ret' here?  Is only for catching potential
write errors?  If yes, I'd do this instead:

  st=0
  echo "$progpath [OPTION]..." \
    && echo "" \
    && echo "$usage_message" \
    || st=$?
  test -z "$1" && return $st
  echo "Run \`$progname --help | more' for full usage" || my_ret=$?
  exit $st

> 
> # func_help [NOEXIT]
> # ------------------
> # Echo long help message to standard output and exit,
> # unless 'noexit' is passed as argument.
> func_help ()
> {
>     $debug_cmd
>     func_usage noexit
>     echo "$long_help_message"
>     my_ret=$?
>     test -n "$1" || exit $my_ret
> }
>
And similarly here:

  st=0
  func_usage noexit \
    && echo "$long_help_message"
    || st=$?
  test -n "$1" || exit $st

Better again, I would rewrite both the 'func_usage' and 'func_help'
subroutines not to call exit explicity (let the callers do that
if they want).

We could then simplify the implementations like this:

  func_usage ()
  {
    $debug_cmd
    echo "$progpath [OPTION]..." \
      && echo "" \
      && echo "$usage_message"
  }

  func_help ()
  {
    $debug_cmd
    func_usage \
      && echo "$long_help_message"
  }

A definite improvement IMHO.

> # func_quote_for_eval ARG...
> # --------------------------
> # Aesthetically quote ARGs to be evaled later.
> # This function returns two values: FUNC_QUOTE_FOR_EVAL_RESULT
> # is double-quoted, suitable for a subsequent eval, whereas
> # FUNC_QUOTE_FOR_EVAL_UNQUOTED_RESULT has merely all characters
> # which are still active within double quotes backslashified.
> sed_quote_subst='s|\([`"$\\]\)|\\\1|g'
> func_quote_for_eval ()
> {
>     $debug_cmd
> 
>     func_quote_for_eval_result=
> 
>     while test $# -gt 0; do
>       case $1 in
>         *[\\\`\"\$]*)
>           my_unquoted_arg=`printf "$1" | $SED "$sed_quote_subst"` ;;
>
Unportable use of sed, in case $1 is not newline-terminated.  Also,
potential problems with printf, in case $1 contains `\' characters:
  $ a='\t'; printf x"$a"x
  $ x       x
Use this instead:
  printf '%s\n' "$1"

There might be similar problems in other part of the script?  I have not
checked in great detail ...  You might want to take a better look.

>         *)
>           my_unquoted_arg="$1" ;;
>       esac

 
> # func_show_eval CMD [FAIL_EXP]
> # -----------------------------
> # Unless opt_silent is true, then output CMD.  Then, if opt_dryrun is
> # not true, evaluate CMD.  If the evaluation of CMD fails, and FAIL_EXP
> # is given, then evaluate it.
> func_show_eval ()
> {
>     $debug_cmd
> 
>     my_cmd="$1"
>     my_fail_exp="${2-:}"
> 
>     ${opt_silent-false} || {
>       func_quote_for_eval $my_cmd
>       eval func_truncate_cmd $func_quote_for_eval_result
>       func_echo "running: $func_truncate_cmd_result"
>     }
> 
>     if ${opt_dry_run-false}; then :; else
>       eval "$my_cmd"
>       my_status=$?
>       if test "$my_status" -eq 0;
>
Useless use of quites around $my_status.    

>     then :; else
>         eval "(exit $my_status); $my_fail_exp"
>       fi
>     fi
> }
> 
 
> func_sort_ver ()  # sort -V is not generally available
> {
>     $debug_cmd
> 
>     ver1="$1"
>     ver2="$2"
> 
>     # split on '.' and compare each component
>
Missing capitalization and final full stop from this comments.
I have noticed similar issues in other comments too, below and
throughout the script.

> # func_get_version APP
> # --------------------
> # echo the version number (if any) of APP, which is looked up along your
> # PATH.
> func_get_version ()
> {
>     $debug_cmd
> 
>     app=$1
> 
>     { $app --version || $app --version </dev/null; } >/dev/null 2>&1 \
>       || return 1
>
What's the point of the second `$app --version'?

>     $app --version 2>&1 |
>     $SED -n '# extract version within line
>              ...'
>
Comments in sed scripts are not portable, sigh :-(

Still, instead of obfuscating all the script for the sake of some ancient
sed implementation, you might want to check at the beginning of the script
that $SED handles comments correctly, and abort if it is not the case
(pointing the user to GNU sed maybe?).

> # func_check_versions APP1 VER1 URL1 ...[APPN VERN URLN]
> # ------------------------------------------------------
> func_check_versions ()
> {
>    [SNIP]
>       # Fail if --version didn't work.
>       if test ! "$my_instver"; then
>
test -z "$my_instver" is clearer and safer IMO.

>         func_error "Error: \`$my_app' not found"
>         func_check_versions_result=false
> 
>       # Fail if a new version than what we have is required.
>       elif test ! "$my_reqver" = "-"; then
>
test "$my_reqver" != "-" is more idiomatic (even if the above should
be very portable as well, as far as I know).

> # func_cleanup_gnulib
> # -------------------
> # Recursively delete everything below the path in the global variable
> # GNULIB_PATH.
> func_cleanup_gnulib ()
> {
>     $debug_cmd
> 
>     status=$?
>
In zsh, `$status' is a reserved read-only variable (equivalent to `$?').
Use `$st' or `$exit_status' instead.

> # func_symlink_to_dir SRCDIR FILENAME [DEST_FILENAME]
> # ---------------------------------------------------
> # Make a symlink from FILENAME in SRCDIR to the current directory,
> # optionally calling the symlink DEST_FILENAME.
> func_symlink_to_dir ()
> {
>     $debug_cmd
>  [SNIP]
>
The body of this function is IMVHO complex enough to deserve some more
comments.
 
> func_update_po_files ()
> {
>     $debug_cmd
>
> [SNIP]
>
>       if ! test -f "$cksum_file"
>
Special `!' command is unportable to at least Solaris 10 /bin/sh:

  $ /bin/sh -c '! test -f /none'
 /bin/sh: !: not found

Possibly other similar usages throughout the script (I haven't checked
in detail).

> dirname='s|/[^/]*$||'
> basename='s|^.*/||'
>
Maybe rename these to `$sed_dirname_cmd' and `$sed_basename_cmd', for
clarity?  (cosmetic-only issue).

> nl='
> '
> 
> # Work around backward compatibility issue on IRIX 6.5. On IRIX 6.4+, sh
> # is ksh but when the shell is invoked as "sh" and the current value of
> # the _XPG environment variable is not equal to 1 (one), the special
> # positional parameter $0, within a function call, is the name of the
> # function.
>
Something similar holds for Zsh if I'm not mistaken (it would be worth
to mention that in this comments IMO).

> progpath="$0"
> 
> # The name of this program:
s/:$/./ ?

> progname=`echo "$progpath" |$SED "$basename"`
> 
> ## -------------------------------------------------- ##
> ## Source package customisations in `bootstrap.conf'. ##
> ## -------------------------------------------------- ##
> 
> # Override the default configuration, if necessary.
> # Make sure that bootstrap.conf is sourced from the current directory
> # if we were invoked as "sh bootstrap".
> case "$0" in
>
Useless use of quote.

>   */*) test -r "$0.conf" && . "$0.conf" ;;
>   *) test -r "$0.conf" && . ./"$0.conf" ;;
> esac
> 
> 
> ## ------------------------------- ##
> ## Actually perform the bootstrap. ##
> ## ------------------------------- ##
> 
> func_bootstrap ${1+"$@"}
> 
> # The End.
> exit $exit_status
> 

Finally, a couple of more general observations:

  1. The bootstrap script is now complex enough to warrant the
     introduction of a testsuite.

  2. IMHO, since bootstrap is a maintainer tool anyway, we should
     assume that a POSIX shell is used to run it, thus allowing
     various simplifications and optimizations.  If we want to be
     helpful towards developer having an inferior default shell,
     we could run a sanity check on the shell itself early in the
     script, warning the user if his shell isn't able to handle
     the required constructs and usages.

HTH, and thanks for all your work on this!

Regards,
  Stefano
[Prev in Thread]
Current Thread
[Next in Thread]
Re: I fixed this once already [Was Re: [PATCH] bootstrap: obey --no-git.], Gary V. Vaughan, 2011/09/07
- a saner bootstrap script [Was Re: I fixed this once already], Gary V. Vaughan, 2011/09/22
  - Re: a saner bootstrap script [Was Re: I fixed this once already], Paul Eggert, 2011/09/22
    - Re: a saner bootstrap script, Gary V. Vaughan, 2011/09/22
    - Re: a saner bootstrap script, Paul Eggert, 2011/09/22
  - Re: a saner bootstrap script [Was Re: I fixed this once already], Stefano Lattarini <=
    - Re: a saner bootstrap script, Gary V. Vaughan, 2011/09/26
    - Re: a saner bootstrap script, Stefano Lattarini, 2011/09/26
    - Re: a saner bootstrap script, Gary V. Vaughan, 2011/09/26
    - Re: a saner bootstrap script, Stefano Lattarini, 2011/09/26
  - Re: a saner bootstrap script [Was Re: I fixed this once already], Stefano Lattarini, 2011/09/25
    - Re: a saner bootstrap script, Gary V. Vaughan, 2011/09/26
    - Re: a saner bootstrap script, Stefano Lattarini, 2011/09/26
    - Re: a saner bootstrap script, Gary V. Vaughan, 2011/09/26
    - Re: a saner bootstrap script, Stefano Lattarini, 2011/09/26
- a saner bootstrap script [Was Re: I fixed this once already], Gary V . Vaughan, 2011/09/22
Prev by Date: Re: [POLL] C99, declaration after statement
Next by Date: Re: [POLL] C99, declaration after statement
Previous by thread: Re: a saner bootstrap script
Next by thread: Re: a saner bootstrap script
Index(es):
- Date
- Thread