Re: sort-lines including non ASCII
From: Eli Zaretskii
Subject: Re: sort-lines including non ASCII
Date: Thu, 07 Jul 2016 18:20:54 +0300
> From: Uwe Brauer <address@hidden>
> Date: Thu, 07 Jul 2016 07:41:03 +0000
>
> > Because you are thinking Spanish, I presume. Emacs by default is not
> > sensitive to the current locale or language, when it compares strings,
> > and instead does that in binary order of the characters' Unicode
> > codepoints. The advantage is that the order comes out the same in any
> > locale.
>
> Hm I just made an experiment with Hebrew, with and without niqqud and
> indeed
> בית
> אבא
> אוויר
> Is sorted correctly and also
> אוויר
> בית
> אַבָא
> So the niqqud does not influence the sorting, but the accent in Spanish
> does. Most likely Unicode is the culprit here, but it is
> counterintuitive.
Unicode has nothing to do with this. The difference between אַ and Á
is that the former is always 2 characters, while the latter is usually
only one. That's why sort-lines produces what looks like correct
results with Hebrew. To see the problem there, you need to sort אבא
with אַבָא and אתבשא, for example. Or something similar.
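The point about character counts can be checked outside Emacs. The following is a minimal Python sketch, not anything from Emacs itself: Python's default string comparison also goes by code point, so it behaves like the binary order described above. The sample words and the exact code points used for the pointed Hebrew word (alef + patah, bet + qamats, alef) are assumptions chosen for illustration.

```python
import unicodedata


def codepoints(s):
    """Return the code points of s as U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in s]


# Precomposed A-acute: a single code point, above every ASCII letter.
A_ACUTE = "\u00c1"
# Alef followed by the combining point patah: two code points.
ALEF_PATAH = "\u05d0\u05b7"

# Python compares strings code point by code point, so the accented
# word lands after "Zorro" because U+00C1 is greater than U+005A.
spanish = sorted(["Zorro", "\u00c1rbol", "Arbol"])

# The pointed word still begins with plain alef (U+05D0), so it stays
# among the alef words; the point only matters as the second character.
hebrew = sorted([
    "\u05d1\u05d9\u05ea",              # bet, yod, tav
    "\u05d0\u05d5\u05d5\u05d9\u05e8",  # alef, vav, vav, yod, resh
    "\u05d0\u05b7\u05d1\u05b8\u05d0",  # alef+patah, bet+qamats, alef (assumed)
])

# "Usually only one": the same letter can also be written decomposed,
# as A followed by the combining acute accent U+0301.
decomposed = unicodedata.normalize("NFD", A_ACUTE)

print(codepoints(A_ACUTE))
print(codepoints(ALEF_PATAH))
print(codepoints(decomposed))
print(spanish)
print([codepoints(word) for word in hebrew])
```

Because the Hebrew points have lower code points than the Hebrew letters, a pointed alef followed by any letter sorts before an unpointed alef followed by any letter, whichever letters those are. That is where an unexpected order would show up once the list has several alef words, some pointed and some not.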