[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: gnulib-tool.py: lists vs. sets
From: |
Bruno Haible |
Subject: |
Re: gnulib-tool.py: lists vs. sets |
Date: |
Sat, 06 Apr 2024 15:41:43 +0200 |
Hi Collin,
> I was looking at simplifying some list comprehensions yesterday and
> noticed that using sets + removing unnecessary sorted() calls in
> GLModuleTable reduces the time of ./import-tests/test-all.sh from 4.8
> seconds to 4.5 seconds consistently. A truly life changing amount of
> time.
>
> You can probably notice this by looking in
> GLModuleTable.transitive_closure and
> GLModuleTable.transitive_closure_separately. Most of the time we are
> checking if a module is in a list, O(N). It makes more sense to use a
> set, O(1).
>
> The only issue with that is it becomes annoying to deal with the dummy
> module. Is there a reason why that one doesn't get sorted?
Two comments:
* gnulib-tool.py is between 2x and 300x faster than gnulib-tool.sh, depending
on the arguments. This means, the goal of speedup has been fully achieved.
The other goals (clear code, maintainability, etc.) are still present.
This means, if you have a 7% speed improvement and it does not impede
the other goals, nor changes the results, then apply it. But when you start
seeing that something else is in its way, then leave it.
* Lists have a lookup time of O(N) on average, sets of O(1) or O(log N),
depending on implementation. But lists preserve the order.
A common trick to have both advantages — in the case that there are no
removals — is to keep a list _and_ a set in two variables (or two data
fields of the same class). Do the insertions on both. Do the lookups on
the set.
This technique can also be generalized to support removals, but that is
a bit more tricky.
Bruno
- gnulib-tool.py: Fix 'consider-using-with' pylint warnings., Collin Funk, 2024/04/05
- gnulib-tool.py: Use 'Any' instead of type unions in GLConfig., Collin Funk, 2024/04/05
- Re: gnulib-tool.py: Fix 'consider-using-with' pylint warnings., Bruno Haible, 2024/04/05
- Re: gnulib-tool.py: Fix 'consider-using-with' pylint warnings., Collin Funk, 2024/04/05
- Re: gnulib-tool.py: lists vs. sets,
Bruno Haible <=
- Re: gnulib-tool.py: lists vs. sets, Collin Funk, 2024/04/06
- Re: gnulib-tool.py: lists vs. sets, Bruno Haible, 2024/04/07
- Re: gnulib-tool.py: lists vs. sets, Collin Funk, 2024/04/07
- Re: Python list micro-benchmarks, Bruno Haible, 2024/04/07
- Re: Python list micro-benchmarks, Collin Funk, 2024/04/07
- Re: Python list micro-benchmarks, Paul Eggert, 2024/04/07
- Re: Python list micro-benchmarks, Collin Funk, 2024/04/07
- Re: Python list micro-benchmarks, Bruno Haible, 2024/04/08
- Re: Python list micro-benchmarks, Collin Funk, 2024/04/08
- Re: Python list micro-benchmarks, Bruno Haible, 2024/04/08