Re: [groff] Regularize (sub)section cross references.

From: Ingo Schwarze
Subject: Re: [groff] Regularize (sub)section cross references.
Date: Tue, 18 Dec 2018 01:13:20 +0100

Hi Tadziu,

Tadziu Hoffmann wrote on Mon, Dec 17, 2018 at 11:45:06PM +0100:
> Ingo:
>> Branden:

>>> I think it would be better to extend groff to expose the
>>> underlying locale-aware C case-transformation functions,
>>> and _not_ try maintaining our own mappings.

>> Indeed.  I don't think maintaining our own mappings is viable.
>> It just won't work, there are too many characters in Unicode.

> I'm against this whole locale thing.
> It needlessly complicates groff

That looks like an argument worth considering.
Adding a new request is certainly a notable complication.

Using towupper(3) in the source code, though, wouldn't appear to
introduce additional complication.  The file src/libs/libgroff/font.cpp
is already using wcwidth(3) from <wchar.h>.
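
To illustrate how small such a call is, here is a minimal sketch.
The helper name upcase_wide is hypothetical and not an existing groff
function; only towupper(3) from <wctype.h> is a real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper, not part of groff: upper-case a wide string
 * in place.  towupper(3) consults the LC_CTYPE category of the
 * current locale, so no mapping table is maintained here.
 */
void upcase_wide(wchar_t *s)
{
    assert(s != NULL);
    for (; *s != L'\0'; s++)
        *s = (wchar_t)towupper((wint_t)*s);
}
```

For characters outside ASCII, the result depends on the locale in
effect; a caller would presumably need setlocale(LC_CTYPE, "") or
similar beforehand, which this sketch leaves out.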

> and will probably fail when reading foreign manual pages
> in a "C" locale.

Even now, that doesn't work very well.  Reading a French manual
with -Tascii strips all accents.  Reading a Russian manual with
-Tascii strips all Cyrillic letters, and along with them, almost
all information.

Changing case with towupper(3) wouldn't make that any worse.
Right now, you have capital e accent aigu in the title, which -Tascii
maps to an E without the accent.  With the proposed change, you would
have small e accent aigu, stripped of the accent by -Tascii and
mapped to E by towupper(3).  The output is the same E either way.

> I'd rather have the authors of foreign
> language manual pages simply add the conversion string
> for that language at the top of the document source.

For every single document?  That would likely result in many
conflicting versions, all with different bugs.  Also, writing manual
pages ought to be easy for authors, and designing such a conversion
string properly is not easy.  It isn't that hard for a properly
organized pack of developers, but it is for a random individual
manual page author.  Even having to remember that one must add it
is not easy.

But we are getting into details here...


