Re: [gnulib-tool-py] func_modules_transitive

gnulib-tool-py

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [gnulib-tool-py] func_modules_transitive_closure

From:	Bruno Haible
Subject:	Re: [gnulib-tool-py] func_modules_transitive_closure
Date:	Fri, 29 Jun 2012 21:28:29 +0200
User-agent:	KMail/4.7.4 (Linux/3.1.10-1.9-desktop; KDE/4.7.4; x86_64; ; )

Hi Dmitriy,

> I'm a bit confused with func_modules_transitive_closure function,
> especially with its conditions part. Previous experience has shown that it
> would be better not to write the exact copy of the code, but do what is
> supposed to be done. Could you, please, tell what you expect in this
> section?

OK. Let me explain it in three steps, with increasing complexity:

1) Ignoring --with[out]-*tests options and conditional dependencies.
2) Ignoring conditional dependencies.
3) The full variant.

1) The input is a set L of modules, the output is the smallest set C
of modules that contains the input set and that is closed under
dependencies. "Closed under dependencies" means:
if m ∈ C and o ∈ dependencies(m), then o ∈ C.

In the original code, lists are used for representing the sets. In
Python, you can use hashed sets (unordered) if it fits better than lists.

The basic algorithm is to start with the set L, pick an element m
and one of its dependencies o, look whether o is already in the set, and
if not, add o to the set. Then repeat, repeat, repeat...

This algorithm is modified to cope with two problems:
  - You need to know when to stop.
  - Adding an element to a set over which you are currently iterating is
    normally a forbidden operation (or may produce undefined behaviour).
    Namely, the "add" operation may reorganize the storage of the set in
    such a way that the iterator does an out-of-bounds access or misses
    an element or similar.

The solution to these problems is to proceed in "rounds": At each moment,
you have three lists:
  - the list of modules whose dependencies have already been looked up,
    accumulated from previous rounds (called 'handledmodules' in the code),
  - the list being traversed (called 'inmodules' in the code),
  - the list of new dependencies being accumulated (called 'outmodules'
    in the code).
Once a round is done, the contents of 'inmodules' is put into
'handledmodules', the 'outmodules' becomes the 'inmodules' (modulo those
elements that have already been handled), and the 'outmodules' are
emptied.

You can convince yourself that with this algorithm,
  - the resulting set is closed under dependencies,
  - no module has been added other than because it was a direct or indirect
    dependency,
  - for every module, the dependencies list was processed only once (hence
    it has optimal performance).

2) Now, a variation that respects the --with[out]-*tests options.

It is the same algorithm, except that the computation of the dependencies
  a) includes the test module,
     If tests are included then we consider that m has a dependency on m-tests
     (if m-tests exists).
  b) depends on the options and on the status of the module. This code does it:

            # Determine whether to include the dependency or tests module.
            inc=true
            for word in `func_get_status $dep`; do
              case "$word" in
                obsolete)
                  test -n "$incobsolete" \
                    || inc=false
                  ;;
                c++-test)
                  test -z "$excl_cxx_tests" \
                    || inc=false
                  test -n "$fmtc_inc_all_tests" || test -n "$inc_cxx_tests" \
                    || inc=false
                  ;;
                longrunning-test)
                  test -z "$excl_longrunning_tests" \
                    || inc=false
                  test -n "$fmtc_inc_all_tests" || test -n 
"$inc_longrunning_tests" \
                    || inc=false
                  ;;
                privileged-test)
                  test -z "$excl_privileged_tests" \
                    || inc=false
                  test -n "$fmtc_inc_all_tests" || test -n 
"$inc_privileged_tests" \
                    || inc=false
                  ;;
                unportable-test)
                  test -z "$excl_unportable_tests" \
                    || inc=false
                  test -n "$fmtc_inc_all_tests" || test -n 
"$inc_unportable_tests" \
                    || inc=false
                  ;;
                *-test)
                  test -n "$fmtc_inc_all_tests" \
                    || inc=false
                  ;;
              esac
            done
            if $inc ...

      I hope this piece of code is clear: We start with the assumption that
      we have to include the dependency. But if the dependency is marked
      'obsolete' and obsolete tests are to be not included, then we don't
      include it. If the dependency is marked 'c++-test' and C++ tests
      are not to be included, then we don't include it. And so on.

      Note that this filtering applies only to the dependencies, not to
      modules that were listed among the set of input modules. If a module
      is explicitly listed as an element of the input set, it is always
      included, regardless whether it is obsolete or not, regardless whether
      it is a test or not, regardless of options. (The purpose is that the
      user can override some of the filtering.)

3) The extended variant also considers conditional dependencies. Same
algorithm, but it updates conddep_dependers, conddep_condition to keep
track of the condition under which a module is considered to be present.

Remember these conditions are strings in shell script syntax (they will
be executed as part of a 'configure' run).

              if test "$cond_dependencies" = true; then
                escaped_dep=`echo "$dep" | sed -e "$sed_escape_dependency"`
                sed_extract_condition1='/^ *'"$escaped_dep"' *$/{
                  s/^.*$/true/p
                }'

Among the dependencies, a module name that stands alone in a line is
unconditional.

                sed_extract_condition2='/^ *'"$escaped_dep"' *\[.*\] *$/{
                  s/^ *'"$escaped_dep"' *\[\(.*\)\] *$/\1/p
                }'

Otherwise the condition is noted in brackets. (See modules/openat for an
example.)

                condition=`func_get_dependencies $module | sed -n -e 
"$sed_extract_condition1" -e "$sed_extract_condition2"`
                if test "$condition" = true; then
                  condition=
                fi
                if test -n "$condition"; then
                  func_conddep_add_module "$module" "$dep" "$condition"
                else
                  if $conditional; then
                    func_conddep_add_module "$module" "$dep" true
                  else
                    func_uncond_add_module "$dep"
                  fi
                fi

This code adds the condition to the tables, so that the code that emits
the .m4 file can extract it later.

              fi


Bruno

[Prev in Thread]

Current Thread

[Next in Thread]

[gnulib-tool-py] func_modules_transitive_closure, Dmitriy Selyutin, 2012/06/29
- Re: [gnulib-tool-py] func_modules_transitive_closure, Bruno Haible <=
  - Re: [gnulib-tool-py] func_modules_transitive_closure, Dmitriy Selyutin, 2012/06/30

Prev by Date: Re: [gnulib-tool-py] Calculating md5sum
Next by Date: [gnulib-tool-py] Dependencies contain comments and variables
Previous by thread: [gnulib-tool-py] func_modules_transitive_closure
Next by thread: Re: [gnulib-tool-py] func_modules_transitive_closure
Index(es):
- Date
- Thread