libc upgrade vs. incompatible locales

From: Ludovic Courtès
Subject: libc upgrade vs. incompatible locales
Date: Sun, 30 Aug 2015 21:46:11 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

address@hidden (Ludovic Courtès) writes:

> (The branch is called ‘wip-’ because the glibc upgrade happens to cause
> troubles: since it has new locale category elements, the locale data is
> incompatible with what older libcs expect, which means the bootstrap
> binaries fail with an assertion failure when trying to load the new
> locale data, like:
>   xz: loadlocale.c:130: _nl_intern_locale_data: Assertion `cnt < (sizeof 
> (_nl_value_type_LC_COLLATE) / sizeof (_nl_value_type_LC_COLLATE[0]))' failed.

I thought spelling out the details of why this is annoying might help
find a solution, so here we go.

The binary format for locales depends on the libc version.  Over the
last few releases it happened to remain compatible, but that of 2.22
differs from that of 2.21 (a new element was added to locale categories,
according to the ChangeLog.)
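
To make the failure mode concrete, here is a toy model in Python of a
reader compiled with a fixed number of known elements that is handed
data written with one more.  The on-disk layout, the function names and
the element counts are all made up for illustration; this is not
glibc's real locale format, only the shape of the check that trips.

```python
import struct

# Hypothetical toy format (NOT glibc's real layout): a 4-byte
# little-endian element count followed by that many 4-byte values.

def write_category(values):
    """Serialize a list of unsigned integers in the toy format."""
    return struct.pack("<I", len(values)) + struct.pack(
        "<%dI" % len(values), *values)

def load_category(blob, known_elements):
    """Load a category as a reader that only knows `known_elements`.

    Like the failing assertion quoted above, the reader refuses any
    element index beyond the table it was compiled with.
    """
    (count,) = struct.unpack_from("<I", blob, 0)
    values = []
    for cnt in range(count):
        if cnt >= known_elements:
            raise AssertionError("cnt < %d failed" % known_elements)
        values.append(struct.unpack_from("<I", blob, 4 + 4 * cnt)[0])
    return values

old_data = write_category([1, 2, 3])      # written by the "old" libc
new_data = write_category([1, 2, 3, 4])   # the "new" libc adds an element
```

In this model an old reader (three known elements) loads `old_data`
fine but aborts on `new_data`, while a new reader loads both.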

During bootstrapping, at some point we build ‘guile-final’ against the
latest libc (2.22.)  In gnu-build-system.scm we heavily use
‘regexp-exec’ (via ‘substitute*’), which calls C code, and thus uses

If we run in the “C” locale, we can only pass to ‘regexp-exec’ purely
ASCII strings.  However, it turns out that, occasionally, strings read
from files (in ‘patch-shebangs’ etc.) are not ASCII, but rather UTF-8
(see commit 87c8b92.)  Thus, calls to ‘regexp-exec’ with these strings
lead to a “failed to convert to locale encoding” error.
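
As a rough analogy only (this is Python, not what Guile does
internally), the same class of error shows up whenever a string has to
be converted to the locale's encoding before being handed to C code.
The helper name and the sample line below are hypothetical.

```python
def to_locale_encoding(text, locale_encoding):
    """Encode `text` as a C-level API expecting `locale_encoding` would
    need it; raises UnicodeEncodeError if a character does not fit."""
    return text.encode(locale_encoding)

# A line such as one might read from a source file, containing
# non-ASCII characters.
line = "# Author: Ludovic Courtès"

# In a UTF-8 locale the conversion succeeds.
utf8_bytes = to_locale_encoding(line, "utf-8")

# In the "C" locale only ASCII is representable, so it fails.
try:
    to_locale_encoding(line, "ascii")
    c_locale_ok = True
except UnicodeEncodeError:
    c_locale_ok = False
```

Here `c_locale_ok` ends up false: the string is fine as data, but it
cannot cross into an ASCII-only locale encoding.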

So ‘guile-final’ needs to run in a UTF-8 locale (the bootstrap Guile
doesn’t have that problem thanks to the hacky

However, if we set LOCPATH to point to the libc 2.22 locales, we satisfy
‘guile-final’, but we break all the bootstrap binaries, which were built
with an older libc; specifically, these binaries terminate with the
assertion failure above.  (If you’re still reading, I thank you for your

So we have some sort of an “interesting” chicken-and-egg problem.

We could side-step the issue by using the pure-Scheme SRFI-115 instead
of ‘regexp-exec’.  That may work to some extent, but we cannot get rid
of ‘substitute*’ entirely overnight, so it’s not clear whether this
would be enough.

Apart from that, I can only think of dirty hacks.

What do people think?

