Re: GUILE 2/3 and string encoding cost
From: Carl Sorensen
Subject: Re: GUILE 2/3 and string encoding cost
Date: Wed, 22 Jan 2020 20:28:41 +0000
User-agent: Microsoft-MacOutlook/10.10.10.191111
On 1/22/20, 1:21 PM, "lilypond-devel on behalf of David Kastrup"
<lilypond-devel-bounces+c_sorensen=address@hidden on behalf of address@hidden>
wrote:
Han-Wen Nienhuys <address@hidden> writes:
> On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <address@hidden> wrote:
>
>> Han-Wen Nienhuys <address@hidden> writes:
>>
>> > I looked a bit through the GUILE source code to see what is going on.
>> >
>> > I believe our current hypothesis (LilyPond's slowdown is caused by
>> > expensive unicode transcoding into 32-bit strings) is incorrect.
>> >
>> > If you look into the source code, you can see that the UTF-8 -> SCM
>> > conversion checks if there are any code points over 255
>> >
>> > https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
>> >
>> > if there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
>> > string as a normal byte array. This code walks the string twice, but that
>> > is very cheap due to CPU cache locality, so it should be
>> > essentially equivalent to whatever GUILE 1.8 was doing.
>>
>> GUILE 1.8 did not walk the string even once
>>
>
> GUILE 1.8 walks it once when you do memcpy.
Ok, but that's sort of a cheap walk.
>> > Even so, if the input file does use UTF-8, there should be little
>> > overhead, because the number of texts that we process is always
>> > small. LilyPond is not a text processor.
>> >
>> > So, what hard data do we have on GUILE 2/3 slowness, and what does
>> > that data say?
>>
>> That data says "humongous slowdown". There is not much more than
>> speculation what this is caused by as far as I know.
>>
>>
> Do we have a standardized test file for benchmarking performance?
input/regression/mozart-hrn-3.ly possibly, but it's not particularly
large.
We don't have a standardized test file, but we do have some representative
results from a couple of (unknown but described) files:
https://lists.gnu.org/archive/html/lilypond-devel/2018-10/msg00054.html
Perhaps we could get those files to become standards (along with some other,
shorter-compiling files).
Carl
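
For readers following the quoted discussion, the narrow/wide decision Han-Wen describes can be sketched roughly as below. This is only an illustration of the idea (scan once for code points over 255, then store the string either as one byte per character or as 32 bits per character). It is not GUILE's actual C code, and the function name encode_scheme_string is hypothetical.

```python
def encode_scheme_string(utf8_bytes):
    """Illustrative two-pass narrow/wide choice (hypothetical helper,
    not GUILE's implementation).

    Returns ("narrow", data) with one Latin-1 byte per character, or
    ("wide", data) with four bytes (UTF-32, little-endian) per character.
    """
    text = utf8_bytes.decode("utf-8")

    # Pass 1: check whether any code point is above 255.
    narrow = all(ord(ch) <= 0xFF for ch in text)

    # Pass 2: copy the characters into the chosen representation.
    if narrow:
        return "narrow", text.encode("latin-1")
    return "wide", text.encode("utf-32-le")


if __name__ == "__main__":
    for sample in ("Menuet", "caf\u00e9", "Dvo\u0159\u00e1k"):
        kind, data = encode_scheme_string(sample.encode("utf-8"))
        print(sample, kind, len(data))
```

Under this sketch only strings containing a code point above 255 take the 32-bit path, which is the case the original hypothesis was concerned about.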