Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043

lilypond-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043

From:	dak
Subject:	Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden)
Date:	Sat, 11 Apr 2020 04:15:45 -0700

On 2020/04/11 05:37:39, hanwenn wrote:
> >  In addition, I don't think that it is used to a degree where it
would
> significantly affect LilyPond's performance.
> 
> It is not yet. 
> 
> My plan is to plugin this into Grob and Prob and see if there is a
measurable
> speed improvement.

You cannot just "plug it in" since it doesn't integrate seamlessly into
the system of context-managed grob properties (override/revert).  There
is also lily/break-substitution.cc to deal with.

> If there is none, it's likely that your double indexing
> scheme will also not bring much.

I prefer to judge the viability of my approaches based on my own code,
but thanks for worrying.

>
https://codereview.appspot.com/559790043/diff/563830043/input/regression/scheme-unit-test.ly
> File input/regression/scheme-unit-test.ly (right):
> 
>
https://codereview.appspot.com/559790043/diff/563830043/input/regression/scheme-unit-test.ly#newcode6
> input/regression/scheme-unit-test.ly:6: #(ly:test-scheme-hash-table)
> On 2020/04/10 21:43:28, dak wrote:
> > This seems like a rather opaque way of testing functionality.
> 
> how about criticism that is constructive? What do you suggest instead?

Have the actual functionality tested spelled out in Scheme rather than
calling an opaque C++ block?  It's what we do elsewhere.
> 
>
https://codereview.appspot.com/559790043/diff/563830043/lily/include/scm-hash.hh
> File lily/include/scm-hash.hh (right):
> 
>
https://codereview.appspot.com/559790043/diff/563830043/lily/include/scm-hash.hh#newcode36
> lily/include/scm-hash.hh:36: Entry *table_;
> On 2020/04/10 21:43:28, dak wrote:
> > Any reason this is not a C++ array?  We don't use C style arrays
elsewhere.
> 
> so we can skip the marking for GUILEV2.

So how about making this a regular SCM vector?  That way you can also
skip the individual scm_gc_mark calls in Guilev1 and get everything
initialised to SCM_UNDEFINED.  Another possibility would be to use C++
STL code with a custom allocator.

>
https://codereview.appspot.com/559790043/diff/563830043/lily/scm-hash.cc
> File lily/scm-hash.cc (right):
> 
>
https://codereview.appspot.com/559790043/diff/563830043/lily/scm-hash.cc#newcode59
> lily/scm-hash.cc:59: if (old_table[i].key != NULL)
> On 2020/04/10 21:43:29, dak wrote:
> > NULL is not a valid SCM value, nor is it guaranteed not to overlap
with valid
> > SCM values that may be stored here.  Better use SCM_UNDEFINED here. 
Also SCM
> > values should not be compared numerically but by using scm_is_eq . 
That
> allows
> > making sure that there are no integer/SCM confusion (there is a
#define one
> can
> > set for that purpose) which are a typical source for problems.
> 
> this would create overhead because scm_gc_calloc allocates data filled
with
> zeroes rather than SCM_xxx. I added some more comment about the 
assumptions to
> the header.

So?  SCM values should not be compared numerically.  If you want to rely
on 0 initialization (which seems rather overblown considering the
overall scheme of things) and not want to break the type system, you
could still do if (SCM_UNPACK (old_table[i].key) != 0) (note that SCM
values are not guaranteed to be equivalent to a pointer type so
comparison to NULL is also a bad idea).

> 
>
https://codereview.appspot.com/559790043/diff/563830043/lily/scm-hash.cc#newcode73
> lily/scm-hash.cc:73: uintptr_t x = (uintptr_t) (key);
> On 2020/04/10 21:43:29, dak wrote:
> > For using the numerical value of an SCM, there is SCM_UNPACK. 
Please don't
> use
> > C style casts.  They always succeed, even in cases where they are a
very bad
> > idea.
> 
> Done.
> 
>
https://codereview.appspot.com/559790043/diff/563830043/lily/scm-hash.cc#newcode81
> lily/scm-hash.cc:81: {
> On 2020/04/10 21:43:28, dak wrote:
> > This only works for specific cases (uintptr_t == 4 or 8).  Instead
of creating
> > our own implementation, how about submitting a patch to Guile
upstream?  That
> > way there is better review for Guile-specific problems and we don't
have the
> > maintenance cost in our own code.
> 
> no. we'd only get the benefits when we move to GUILE 3.2, whenever
that is
> released.

Guile development works differently.  Stuff that isn't written by Andy
Wingo is placed directly into the "stable" releases if it is considered
suitable.  Otherwise it is getting ignored.  At any rate, there is no
point in parallel development.
> 
>
https://codereview.appspot.com/559790043/diff/563830043/lily/scm-hash.cc#newcode95
> lily/scm-hash.cc:95: if (table_[i].key != NULL)
> On 2020/04/10 21:43:28, dak wrote:
> > NULL is not a valid SCM value, and SCM values should be compared
using
> scm_is_eq
> > rather than !=.  Using SCM_UNDEFINED is the proper way of doing
things.
> 
> notice that we're not passing the NULL to GUILE anywhere.
> 
>
https://codereview.appspot.com/559790043/diff/563830043/lily/scm-hash.cc#newcode111
> lily/scm-hash.cc:111: *idx = int (hash (key) % cap_);
> On 2020/04/10 21:43:29, dak wrote:
> > If the capacity is going to be one less than a power of 2, it would
appear to
> > make sense to use a power of 2 instead and use a mask rather than a
modulo
> > operation.  In particular since the hash functions look like they
are designed
> > in a manner where the individual bits are well-distributed.
> 
> no, that would constrain the table sizes, and therefore the space/time
tradeoff
> we make.

Huh?  The current implementation uses a constrained set of table sizes. 
I was just asking for the constraint to be larger by 1.
> 
>
https://codereview.appspot.com/559790043/diff/563830043/lily/scm-hash.cc#newcode161
> lily/scm-hash.cc:161: if (j <= gap)
> On 2020/04/10 21:43:28, dak wrote:
> > This stuff is completely inscrutable and without any comment.  It
would appear
> > that it may depend on the capacity being one less than a power of 2
(see
> above)
> > but without any analysis and any comment or justification, that is
quite
> > unclear.
> 
> I added a wikipedia link. (This is standard undergrad CS stuff.)

We don't require a CS degree for being allowed to work on LilyPond code,
and you'll find that undergrad CS teaching material does include code
comments.
 
>
https://codereview.appspot.com/559790043/diff/563830043/lily/scm-hash.cc#newcode170
> lily/scm-hash.cc:170: while (next % cap_ != h);
> On 2020/04/10 21:43:29, dak wrote:
> > Fall-through without any assertion making sure that stuff worked
right?
> 
> do you have a suggestion about what to assert?

If we are not supposed to reach there in the first place, false?  Or put
out a programming error in the hope that things are not incurably
broken?

https://codereview.appspot.com/559790043/

[Prev in Thread]

Current Thread

[Next in Thread]

Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden), dak, 2020/04/10
- Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden), hanwenn, 2020/04/11
- Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden), jonas . hahnfeld, 2020/04/11
  - Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden), Han-Wen Nienhuys, 2020/04/11
- Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden), dak <=
- Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden), dak, 2020/04/11
- Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden), hanwenn, 2020/04/12
- Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden), jonas . hahnfeld, 2020/04/14
  - Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden), Han-Wen Nienhuys, 2020/04/14

Prev by Date: Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden)
Next by Date: Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden)
Previous by thread: Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden)
Next by thread: Re: Reimplement Scheme_hash_table using linear probing. (issue 559790043 by address@hidden)
Index(es):
- Date
- Thread