Re: GUILE 2/3 and string encoding cost

lilypond-devel

From:	David Kastrup
Subject:	Re: GUILE 2/3 and string encoding cost
Date:	Wed, 22 Jan 2020 12:01:53 +0100
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Han-Wen Nienhuys <address@hidden> writes:

> I looked a bit through the GUILE source code to see what is going on.
>
> I believe our current hypothesis (LilyPond's slowdown is caused by
> expensive Unicode transcoding into 32-bit strings) is incorrect.
>
> If you look into the source code, you can see that the UTF-8 -> SCM
> conversion checks if there are any code points over 255:
>
>
> https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
>
> If there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
> string as a normal byte array. This code walks the string twice, but that
> is very cheap due to CPU cache locality, so it should be
> essentially equivalent to whatever GUILE 1.8 was doing.

GUILE 1.8 did not walk the string even once.
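
To make the two-pass scheme described in the quoted paragraph concrete, here is a minimal C sketch. It is not GUILE's actual code: the names sketch_string, decode_utf8 and sketch_from_utf8 are hypothetical, the input is assumed to be well-formed UTF-8, and error handling is reduced to the bare minimum. The first pass decodes the input only to count characters and to see whether any code point exceeds 255; the second pass fills either a one-byte-per-character (Latin1) buffer or a 32-bit-per-character buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical string representation, for illustration only. */
typedef struct {
    int narrow;     /* 1: one byte per character (Latin1); 0: 32 bits per character */
    size_t length;  /* number of characters, not bytes */
    void *chars;    /* uint8_t[] if narrow, uint32_t[] otherwise */
} sketch_string;

/* Decode one UTF-8 sequence starting at s[*i] and advance *i past it.
   Assumes well-formed input; only guards against running past n. */
static uint32_t decode_utf8(const unsigned char *s, size_t n, size_t *i)
{
    unsigned char c = s[(*i)++];
    uint32_t cp;
    int extra;

    if (c < 0x80)
        return c;
    if ((c & 0xE0) == 0xC0)      { extra = 1; cp = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
    else                         { extra = 3; cp = c & 0x07; }

    while (extra-- > 0 && *i < n)
        cp = (cp << 6) | (s[(*i)++] & 0x3F);
    return cp;
}

static sketch_string sketch_from_utf8(const char *utf8, size_t nbytes)
{
    const unsigned char *s = (const unsigned char *)utf8;
    sketch_string out = { 1, 0, NULL };
    size_t i, k, width;

    assert(utf8 != NULL);

    /* Pass 1: count characters and check for any code point above 255. */
    for (i = 0; i < nbytes; ) {
        if (decode_utf8(s, nbytes, &i) > 0xFF)
            out.narrow = 0;
        out.length++;
    }

    /* Pass 2: decode again into the narrow or wide buffer. */
    width = out.narrow ? sizeof(uint8_t) : sizeof(uint32_t);
    out.chars = malloc((out.length ? out.length : 1) * width);
    if (out.chars == NULL) {
        out.length = 0;
        return out;
    }
    for (i = 0, k = 0; i < nbytes; k++) {
        uint32_t cp = decode_utf8(s, nbytes, &i);
        if (out.narrow)
            ((uint8_t *)out.chars)[k] = (uint8_t)cp;
        else
            ((uint32_t *)out.chars)[k] = cp;
    }
    return out;
}
```

In this sketch, the five bytes "caf\xc3\xa9" yield a narrow string of four characters, while a single code point above 255 anywhere in the input switches the whole string to the 32-bit representation. Either way the input is decoded twice, which is the cost being discussed above.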

> LilyPond internally doesn't use any Unicode strings, as all our
> identifiers are pure ASCII, as well as internal strings (e.g. font
> glyph names). This means that files that do not use Unicode characters
> at all should have the same overhead for strings as GUILE 1.8.

We already use the latin1 calls for LilyPond internals.

> Even so, if the input file does use UTF-8, there should be little
> overhead, because the number of texts that we process is always
> small. LilyPond is not a text processor.
>
> So, what hard data do we have on GUILE 2/3 slowness, and what does
> that data say?

That data says "humongous slowdown".  As far as I know, there is not
much more than speculation about what causes it.

-- 
David Kastrup
