Re: [groff] Regularize (sub)section cross references.


From: G. Branden Robinson
Subject: Re: [groff] Regularize (sub)section cross references.
Date: Mon, 17 Dec 2018 12:55:23 -0500
User-agent: NeoMutt/20180716

At 2018-12-18T04:42:36+1100, John Gardner wrote:
> > The biggest problem I know of is that the uppercasing transform of
> > German sharp S "ß" goes to "SS"
> 
> Pretty damn sure that's nothing compared to the Turkish dotless I
> <https://en.wikipedia.org/wiki/Dotted_and_dotless_I#In_computing>.
> 
> Then again, I'm sure they're used to seeing computers screw up the tittle
> by now... :-)

I'm aware of it.  :)  But I still regard it as a lesser problem because
at least it doesn't change the length of the string in glyphs or
codepoints.

(
Bytes?  In UTF-8, yup, it sure would:

U+0069 LATIN SMALL LETTER I
UTF-8: 69 UTF-16BE: 0069 Decimal: &#105; Octal: \0151
i (I)
Uppercase: 0049 [EXCEPT IN TURKISH -- GBR]
Category: Ll (Letter, Lowercase)
Unicode block: 0000..007F; Basic Latin
Bidi: L (Left-to-Right)

U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE
UTF-8: c4 b0 UTF-16BE: 0130 Decimal: &#304; Octal: \0460
İ (i)
Lowercase: 0069
Category: Lu (Letter, Uppercase)
Unicode block: 0100..017F; Latin Extended-A
Bidi: L (Left-to-Right)
Decomposition: 0049 0307
)
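
To make those counts concrete, here is a minimal Python sketch.  It
assumes Python 3, whose str.upper() applies the Unicode full case
mappings but is not locale-aware, so the Turkish mapping is written out
by hand; the names utf8_len and TURKISH_UPPER are made up for
illustration.

```python
def utf8_len(s: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))


# German sharp s: the full uppercase mapping yields two code points.
sharp_s = "\u00df"               # ß
sharp_s_upper = sharp_s.upper()  # "SS"

# Turkish: lowercase i uppercases to U+0130.  str.upper() does not know
# about locales, so this one-entry table is a hypothetical stand-in for
# a locale-aware mapping.
TURKISH_UPPER = {"i": "\u0130"}
small_i = "i"
dotted_capital_i = TURKISH_UPPER[small_i]

# (before, after) lengths for each case.
lengths = {
    "sharp_s_codepoints": (len(sharp_s), len(sharp_s_upper)),
    "turkish_i_codepoints": (len(small_i), len(dotted_capital_i)),
    "turkish_i_utf8_bytes": (utf8_len(small_i), utf8_len(dotted_capital_i)),
}
```

The sharp s goes from one code point to two; the Turkish i stays at one
code point but grows from one UTF-8 byte to two (c4 b0).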

A lot of knowledge is embedded in tolower() and toupper() these days.
Back in the '70s and '80s they were just syntactic sugar for adding and
subtracting 32.
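
A rough sketch of that old ASCII-only behavior, in Python rather than
C.  The function names are hypothetical, and the range check is an
assumption added so that non-letters pass through unchanged; it is not
a reproduction of any particular historical libc.

```python
ASCII_CASE_OFFSET = ord("a") - ord("A")  # 32


def ascii_toupper(ch: str) -> str:
    """Uppercase a single ASCII letter by subtracting 32 from its code."""
    if ord("a") <= ord(ch) <= ord("z"):
        return chr(ord(ch) - ASCII_CASE_OFFSET)
    return ch  # anything outside a-z is returned untouched


def ascii_tolower(ch: str) -> str:
    """Lowercase a single ASCII letter by adding 32 to its code."""
    if ord("A") <= ord(ch) <= ord("Z"):
        return chr(ord(ch) + ASCII_CASE_OFFSET)
    return ch
```

Anything outside A-Z and a-z, including "ß" and "İ", falls straight
through, which is exactly the knowledge the arithmetic version lacks.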

Life is more interesting now.

Regards,
Branden

