bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek
From: Eli Zaretskii
Subject: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek
Date: Wed, 09 Dec 2020 17:46:10 +0200
> From: Mattias Engdegård <mattiase@acm.org>
> Date: Wed, 9 Dec 2020 15:37:19 +0100
> Cc: Lars Ingebrigtsen <larsi@gnus.org>, Aidan Kehoe <kehoea@parhasard.net>,
> 11309-done@debbugs.gnu.org
>
> ß is a lower case letter, so lowercasep(ß)=false is wrong. As a consequence,
> matching ß with [:lower:] and [:upper:] doesn't work correctly: ß should be
> matched by [:lower:] when case-fold-search is nil, and by both [:lower:] and
> [:upper:] when case-fold-search is non-nil.
>
> The problem stems from the fact that uppercasep and lowercasep don't use the
> Unicode case information directly (which perhaps they should) but derive the
> case indirectly from the upcase and downcase tables, and there is no way to
> state that a char is lower case but cannot be upcased or downcased. (Below
> I'm going to use the notation T[C] for the table T indexed by character C.)
>
> Currently, characters missing from or self-mapping in the upcase and downcase
> tables are considered to be caseless. For instance, upcase[*]=downcase[*]=*
> and upcase[中]=downcase[中]=nil. However, we also have upcase[ß]=downcase[ß]=ß,
> causing the incorrect lowercasep result.
>
> The solution that I ended up applying was the simplest possible: set
> upcase[ß]=ẞ (U+1E9E). The special-uppercase properties ensure that (upcase
> "ß") => "SS", and now all tests pass.
>
> (An acceptable alternative would have been to set upcase[ß]=nil and adapt
> lowercasep accordingly. I tried that and it works flawlessly, but involves
> slightly more changes.)
>
> And that concludes the resolution of this bug.
Thanks.
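
For readers following the thread, the table-derived case test described in the quoted message can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not Emacs's actual code: the helper `lookup`, the dictionaries, and the exact predicate definitions are hypothetical stand-ins for the upcase and downcase tables, with a missing key modelling a nil entry.

```python
# Toy model of case derived from upcase/downcase tables.
# Assumption: this mirrors the rule described in the message, not Emacs's real code.

def lookup(table, ch):
    """Return table[ch], treating a missing (nil) entry as self-mapping."""
    mapped = table.get(ch)
    return ch if mapped is None else mapped

def uppercasep(ch, upcase, downcase):
    # A character counts as upper case if downcasing changes it.
    return lookup(downcase, ch) != ch

def lowercasep(ch, upcase, downcase):
    # A character counts as lower case if it is not upper case
    # and upcasing changes it.
    return not uppercasep(ch, upcase, downcase) and lookup(upcase, ch) != ch

# Hypothetical miniature tables. "中" is deliberately absent (nil entry).
downcase = {"a": "a", "A": "a", "*": "*", "ß": "ß", "ẞ": "ß"}
upcase_before = {"a": "A", "A": "A", "*": "*", "ß": "ß"}  # ß maps to itself
upcase_after = dict(upcase_before, **{"ß": "ẞ"})          # ß maps to U+1E9E

for name, table in (("before", upcase_before), ("after", upcase_after)):
    print(name, "lowercasep(ß) =", lowercasep("ß", table, downcase))
```

In this model a self-mapping or missing entry makes a character look caseless, which is why ß is misclassified until its upcase entry points at a different character.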