Re: sort-lines including non ASCII

From: Eli Zaretskii
Subject: Re: sort-lines including non ASCII
Date: Thu, 07 Jul 2016 18:20:54 +0300

> From: Uwe Brauer <address@hidden>
> Date: Thu, 07 Jul 2016 07:41:03 +0000
> > Because you are thinking Spanish, I presume.  Emacs by default is not
> > sensitive to the current locale or language, when it compares strings,
> > and instead does that in binary order of the characters' Unicode
> > codepoints.  The advantage is that the order comes out the same in any
> > locale.
> Hm I just made an experiment with Hebrew, with and without niqqud and
> indeed 

> בית
> אבא
> אוויר

> Is sorted correctly and also

> אוויר
> בית
> אַבָא

> So the niqqud does not influence the sorting but the accent in Spanish
> does. Most likely Unicode is the culprit here, but it is
> counterintuitive.

Unicode has nothing to do with this.  The difference between אַ and Á
is that the former is always 2 characters, while the latter is usually
only one.  That's why sort-lines produces what looks like correct
results with Hebrew.  To see the problem there, you need to sort אבא
with אַבָא and אתבשא, for example.  Or something similar.
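
To make the point concrete, here is a small sketch in Python rather than
Emacs Lisp.  It assumes that Python's built-in string comparison, which
goes code point by code point, is a fair stand-in for the binary ordering
described above; the variable names and the Spanish sample words are
made up for the illustration and are not from the original experiment.

```python
import unicodedata

# Á as it is usually stored: a single precomposed code point.
A_ACUTE = "\u00C1"
# אַ as typed from the basic Hebrew block: alef plus a combining patah.
ALEF_PATAH = "\u05D0\u05B7"


def codepoints(s):
    """Return the code points of s as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in s]


print(codepoints(A_ACUTE))     # ['U+00C1']
print(codepoints(ALEF_PATAH))  # ['U+05D0', 'U+05B7']

# Decomposing Á shows the two-character form it "usually" does not take.
print(codepoints(unicodedata.normalize("NFD", A_ACUTE)))  # ['U+0041', 'U+0301']

# Assumption: sorted() on str compares code points, like the binary order
# discussed in the thread.  Á (U+00C1) is above Z (U+005A), so it goes last.
spanish = sorted(["Zorro", "\u00C1rbol", "Casa"])
print(spanish)

PLAIN = "\u05D0\u05D1\u05D0"                # אבא
POINTED = "\u05D0\u05B7\u05D1\u05B8\u05D0"  # אַבָא
OTHER = "\u05D0\u05EA\u05D1\u05E9\u05D0"    # אתבשא

# The second character decides: patah (U+05B7) < bet (U+05D1) < tav (U+05EA).
hebrew = sorted([PLAIN, POINTED, OTHER])
print([codepoints(w) for w in hebrew])
```

In this sketch the pointed spelling sorts ahead of the plain one, because
the points occupy lower code points than the letters.  The niqqud does
take part in the comparison; it just does not show up unless two lines
share the same first letter.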
