gnulib-tool-py
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [gnulib-tool-py] The root of the problem


From: Bruno Haible
Subject: Re: [gnulib-tool-py] The root of the problem
Date: Tue, 26 Jun 2012 03:23:52 +0200
User-agent: KMail/4.7.4 (Linux/3.1.10-1.9-desktop; KDE/4.7.4; x86_64; ; )

Hi Dmitriy,

[CCing the mailing list because this is a generally important topic.]

> after researching this code at the 9th or 10th time, I think I've found the
> root of the problem.
> The root of the problem is that I can't understand why we need to have
> modcache variable.

A cache is a kind of of optimization. Caches and, more generally,
optimizations can introduce subtle bugs. It is good software design
to make it possible to turn off such optimizations. Experience shows
that sometimes there are bugs in the optimization code, and even more
often the optimization has unintended side effects. For caches in
particular, it is a common mistake to forget about a dependency of
the cache.

I once had a bug in some code with 3 caches. With all the complicated
cache code enabled, debugging this drove me crazy, because when a value
in one of the caches was changed, it notified the other 2 caches that
they should change as well. And in this huge code of conditional
recomputations and notifications there was a bug. But when I turned off
one or more of these 3 caches, I could quickly see that the bug was not
present any more; so this gave me a clue where to look for the bug.

> We create GLImport instance (or other instance which has to work with
> GLModule instances).
> When we create new module instance, why we should cache it only if modcache
> is True?
> Why we can't just create cache always?

Because the simpler, dumber code without cache is easier to debug.

> 0. Add new attribute for GLModuleSystem:
>   self.cache, which is used to store dict of the added modules and their
> dependencies.
> 
> 1. Add new method to GLModule:
>   self.getDependencies():
>     Get dependencies of the module. BEWARE: it returns just strings, not
> GLModule instances,
>     because we do not want to create GLModule for each module every time!
> [see below]
>     >>> GLModule('module').getDependencies() # Just an example
>     [GLModule('dep1'), GLModule('dep2')]
> 
> 2. Add new method to GLModuleSystem:
>   self.addModule(module):
>     1). Run GLModuleSystem.find(module) method and get GLModule instance:
>       >>> module = GLModuleSystem.find('module')
>     2). Run GLModule.getDependencies() to get list of the dependencies.
>       >>> deps = moduleobj.getDependencies()

Sounds all right.

>     3) Append this all to self.cache dictionary as you can see here:
>       >>> result = {module: ['dep1', 'dep2']} # see [2] to understand why
> list of strings
>       >>> self.cache.update(result)

When the option --no-cache-modules is in effect, you just skip this step.

> 3. Add new method to GLModuleSystem:
>   removeModule(module):
>     1). Run GLModuleSystem.find(module) method and get GLModule instance:
>       >>> module = GLModuleSystem.find('module')
>     2). Remove module if it is in keys of self.cache.
>       >>> if module in self.cache:
>       >>>   self.cache.pop(module)
>       >>> else:
>       >>>   raise(KeyError('no such module'))

"Remove module"? I'm not aware of any part of gnulib-tool that needs this.

> 4. Add new method to GLModuleSystem:
>   getModules():
>     >>> result = [module for module in self.cache]
>     >>> return(result)

The old gnulib-tool does not have a function that works like this.
Who would like to use this method? Remember that self.cache if just
an optimization; if the optimization is turned off your proposed
getModules() would return an empty list.

> 5. Add new method to GLModuleSystem:
>   getDependencies():
>     >>> result = list()
>     >>> for module in self.cache:
>     >>>   if self.cache[module] not in result:
>     >>>     result += [self.cache[module]] # Here we get list of strings!!
> [see [1]]
>     >>> result = [self.find(module) for module in result]
>     >>> return(result)
>   As you see it has check if dependency was already used by another module.
>   Also you can understand why we do not return GLModule instances by running
>   GLModule.getDependencies().

It looks like you try to exploit the cache in too many places.
For me, there is one single place where to use the cache: in
GLModuleSystem.find(name). All other code should call GLModuleSystem.find
when it wants to convert a module name to a GLModule object.

For getDependencies, it is possible that you need to have two methods:
one that returns a list of strings, and one which returns a list of
GLModule objects.

> 6. Add new method to GLModuleSystem:
>   getAllModules():
>     >>> result = self.getModules() + self.getDependencies()
>     >>> return(result)

This implementation of getAllModule() would not return all modules,
only those that have been looked for so far.

func_all_modules needs to go over the file system and search file files
and directories under modules/.

> What do you think about such kind of system? Of course, it assumes that
> modcache is always True

It is better to design things in such a way that
  1. the cache can be turned off,
  2. there is only one place (or at least very few places) where you need
     to test the boolean value of modcache.

Bruno




reply via email to

[Prev in Thread] Current Thread [Next in Thread]