Re: GUILE 2/3 and string encoding cost

lilypond-devel


From:	David Kastrup
Subject:	Re: GUILE 2/3 and string encoding cost
Date:	Fri, 24 Jan 2020 10:51:01 +0100
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Han-Wen Nienhuys <address@hidden> writes:

> On Thu, Jan 23, 2020 at 10:39 PM David Kastrup <address@hidden> wrote:
>
>>
>> > on mozart-hrn-3, over 3 runs, we get
>> >
>> > 2.0.14  - avg 2.1s
>> > 1.8.8 - avg 0.31s
>> >
>> > so the new GC is about 5-10x slower than the old one. With GUILE 1.8,
>> > garbage collection typically accounts for 10% of the runtime, so all
>> > things being equal, the Boehm GC would cause a 1.5-2.0x slowdown in
>> > the total.
>> >
>> > It would be good to see how the JITting of code impacts Scheme
>> > execution.
>>
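
The estimate quoted above can be checked with a small Amdahl-style
calculation. This is only a sketch: it assumes that GC is exactly 10% of
the 1.8 runtime and that nothing else changes between versions, and the
function name is made up for illustration.

```python
def total_slowdown(gc_fraction, gc_slowdown):
    """Overall slowdown when only the GC share of the runtime gets slower."""
    return (1.0 - gc_fraction) + gc_fraction * gc_slowdown

# GC time ratio from the mozart-hrn-3 figures quoted above (2.0.14 vs 1.8.8).
measured_ratio = 2.1 / 0.31

# Assumption: GC takes 10% of the runtime under GUILE 1.8.
for factor in (5.0, measured_ratio, 10.0):
    print(f"GC {factor:.1f}x slower -> total {total_slowdown(0.10, factor):.2f}x")
```

Under these assumptions the total comes out at roughly 1.4x to 1.9x,
which is in the same ballpark as the quoted 1.5-2.0x.
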
>> Boehm GC can work in a background thread I think.  And Guile-v2
>> applications typically just let all their data be treated as pointers
>> rather than using a smob-marking algorithm like we do, and it is
>> conceivable that Boehm GC's individual mark function does not scale.
>>
>
> Do you mean our mechanism to call user-defined mark functions? I doubt
> that there are obvious scalability problems in BGC's mark
> function.

We do everything through user-defined mark functions and BGC might not
be efficient in that regard.

>> However, considering everything a pointer for a 32bit application
>> that can eat a significant fraction of the total address space is a
>> nightmare: there would be just too much memory pinned down due to
>> conservative garbage collection.
>>
> GUILE 1.8 already scanned the stack conservatively, so large scores
> would probably never work on 32 bits.

We keep very little on the stack.  And GC is explicitly called between
command line files at a particularly low point in the stack.  We have a
warning in 1.8 for leftovers of types that are not expected at that
point in time, and while it triggers in random regtests, the frequency
is pretty low.

> Was this a concern in the past?  How do score sizes (in pages)
> translate to memory usage (in megabytes)?

Memory use for scores of several hundred pages can come close to eating
the 32bit address space.

> I think it is reasonable for us to start assuming people run lilypond
> on 64-bit machines.

Sure.  That would make the assumptions of the BGC a better deal in
principle.
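
A back-of-the-envelope sketch of why the address width matters for
conservative scanning: the chance that an arbitrary non-pointer word
happens to look like a pointer into the heap. The heap size is a
hypothetical figure, and the uniform distribution of word values is an
assumption that real data does not satisfy.

```python
def stray_hit_probability(heap_bytes, address_bits):
    """Chance that a uniformly random machine word lands inside the heap.

    Assumption: word values are uniformly distributed over the address range.
    """
    return heap_bytes / 2 ** address_bits

GIB = 2 ** 30
heap = 2 * GIB  # hypothetical heap size for a very large score

p32 = stray_hit_probability(heap, 32)
p64 = stray_hit_probability(heap, 64)
print(f"32-bit: {p32:.3f}")
print(f"64-bit: {p64:.3e}")
```

With the same heap, a stray word is far less likely to pin memory down
on 64 bits than on 32 bits.
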

>> On a 64bit application, this would be somewhat more tenable, but we'd
>> need to override operator new for smobs.
>>
>> Or do we?  Maybe the heap is collected by default, and we need to switch
>> that off?
>>
>>
> What do you mean by "heap is collected"?

"Collected" is probably the wrong expression.  Marked and swept.  The
proposed behavior by Guile developers is not to bother with individual
mark hooks and just let the whole heap be marked and swept.
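
To illustrate the difference between the two approaches, here is a toy
model, not Guile or BGC code. Every name in it is made up: each object
carries raw words, a "mark hook" knows which slots are real pointers, and
the conservative variant treats every word as a potential pointer.

```python
class Obj:
    """Toy heap cell: 'words' holds raw integers, some of which are addresses."""
    def __init__(self, words, pointer_slots):
        self.words = words
        self.pointer_slots = pointer_slots  # slots a mark hook knows are pointers

def mark(heap, roots, precise):
    """Return the set of heap addresses reachable from roots."""
    marked, stack = set(), list(roots)
    while stack:
        addr = stack.pop()
        if addr in marked or addr not in heap:
            continue  # already visited, or not a heap address at all
        marked.add(addr)
        obj = heap[addr]
        if precise:
            # Mark-hook style: follow only the slots declared as pointers.
            stack.extend(obj.words[i] for i in obj.pointer_slots)
        else:
            # Conservative style: every word might be a pointer.
            stack.extend(obj.words)
    return marked

heap = {
    0x10: Obj([0x20, 0x30], pointer_slots=[0]),  # second word is a plain integer
    0x20: Obj([7], pointer_slots=[]),
    0x30: Obj([0], pointer_slots=[]),            # unreachable garbage
}

precise_live = mark(heap, [0x10], precise=True)
conservative_live = mark(heap, [0x10], precise=False)
print(sorted(hex(a) for a in precise_live))
print(sorted(hex(a) for a in conservative_live))
```

In this toy, the conservative pass keeps the object at 0x30 alive only
because an integer in another object happens to equal its address.
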

-- 
David Kastrup
