gnulib-tool-py
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [gnulib-tool-py] GLModule, GLFileSystem, modcache and func_cache_loo


From: Bruno Haible
Subject: Re: [gnulib-tool-py] GLModule, GLFileSystem, modcache and func_cache_lookup_module
Date: Mon, 25 Jun 2012 20:01:10 +0200
User-agent: KMail/4.7.4 (Linux/3.1.10-1.9-desktop; KDE/4.7.4; x86_64; ; )

Hi Dmitriy,

> Sorry for being annoying with questions, but I really need your help.

No need to be sorry about it. Your questions are a sign that you are
progressing :-)

> It is hard to understand what happens func_cache_lookup_module.
> What does mean $have_associative? I've found such lines of the code:
> 
> if (declare -A x && { x[f/2]='foo'; x[f/3]='bar'; eval test
> '${x[f/2]}' = foo; }) 2>/dev/null; then
>   # Zsh 4 and Bash 4 have associative arrays.
>   have_associative=true
> else
>   # For other shells, use 'eval' with computed shell variable names.
>   have_associative=false
> fi
> 
> As I can understand, it is the same as Python's dict object, which has
> pair of key and value for each member of the array.

Yes, the shell code is building up a hash table. In Python, a 'dict'
object comes closest. A simple mapping from keys to values.

In some cases, namely when the keys are in Python represented through
objects, the hash table can be replaced by an extra field in each of the
key objects.

> If I am right,
> this implements something like this: if module was already searched,
> then add it to the list of already searched modules. If not, try to
> find it again. However, it is still hard to understand what is cached:
> module name?

To understand what the code does, simply ignore the code parts where
$have_associative is false - since this is an ugly workaround for the
case where $have_associative is true.

There are 5 'declare -A' statements in total, meaning 5 hash tables:

1) modcache_cached
   The key is a module name. The value is always 'yes' (for those keys
   that are present in the table).

   So, this data structure is a hashed set. Semantically the same as an
   unordered list of module names, but with an O(1) lookup.

   In Python there are two ways of implementing it:
     - A 'dict' with True/False values.
     - A field of boolean type (True/False) in the GLModule class.

2) conddep_isuncond
     The key is a module name. The value is always 'true' (for those keys
     that are present in the table).
     So, it's a hashed set as well. Again, there are two ways to do it in
     Python:
       - A 'dict' with True/False values.
       - A field of boolean type (True/False) in the GLModule class.

   conddep_dependers
     The key is a module name. The value is a space-separated list of
     module names.
     In Python, the key becomes a GLModule instance, and the value an
     (ordered) list of GLModule instances.
     Again, there are two ways to do it in Python:
       - A 'dict' whose values are lists of GLModule instances.
       - A field in the GLModule class, whose type is list of GLModule.

   conddep_condition
     The key is a pair of module names. The value is a string (a piece
     of shell script code, that will be emitted into a generated .m4 file).
     In Python, the key becomes a 2-tuple of GLModule objects, I think.
     For this one, the 'dict' is the best choice, I would say.

3) to_remove
     This occurs inside the 'import' command.
     The key is a module name. The value is always 'yes' (for those keys
     that are present in the table).
     So, it's again a hashed set.
     But here, you can not simply add a field to the GLModule class (it
     would be ugly design to add a field to GLModule that is only used
     by the GLImport class). So the best choice here is a 'dict' with
     GLModule keys and True/False values.

You will be amazed how pretty and simple this code becomes in Python,
compared to shell.

> If we just need to store the names of the already searched modules (or
> to store their GLModule instances, which is seems to be better), I
> suggest such implementation in the GLModuleSystem class:
> 
> 1. We add new attribute self.args['cache'] = list()
> 2. Each time when we run GLFileSystem.find(module) successfully, we
> add the new module to list. But before we check whether it was not
> searched already, and if yes, do nothing.

If you choose the 'dict' representation, then yes it will be a field of
the GLModuleSystem class. But beware: a dict! Not a list! Because lookup
in a list of 2000 elements is slow.

But in the cases of modcache_cached, conddep_isuncond, conddep_dependers
you can even get away without a 'dict', by use of a field in each GLModule
instance.

Bruno




reply via email to

[Prev in Thread] Current Thread [Next in Thread]