emacs-devel

Re: case-insensitive string comparison


From: Bruno Haible
Subject: Re: case-insensitive string comparison
Date: Mon, 25 Jul 2022 21:37:16 +0200

Sam Steingold asked:
> > (string-collate-equalp "a" "A" current-locale-environment t)
> > ==> nil
> > current-locale-environment
> > ==> "en_US.UTF-8"
> 
> So, how do we do case-insensitive string comparison in Emacs?
> 
> It is okay to add a `string-equal-ignore-case' based on `compare-strings'?
> (even though it does not recognize "SS" and "ß" as equal)
> 
> Or should we first implement something like casefold in Python?
> https://docs.python.org/3/library/stdtypes.html#str.casefold

The Unicode Standard's algorithm for case-insensitive string comparison
is indeed much better thought-out than anything that you could come
up with within a month.
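
To make the difference concrete, here is a minimal sketch in Python, the
language of the casefold link quoted above. It is only an illustration, not
a proposal for the Emacs implementation. The helper names `caseless_equal`
and `canonical_caseless_equal` are made up for this example; `str.casefold`
and `unicodedata.normalize` are the standard-library calls doing the work.
The second helper follows what I understand to be the Unicode Standard's
"canonical caseless matching" (normalize to NFD, case-fold, normalize
again), which is an assumption about which variant one would want.

```python
import unicodedata


def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after full Unicode case folding (hypothetical helper)."""
    return a.casefold() == b.casefold()


def canonical_caseless_equal(a: str, b: str) -> bool:
    """Like caseless_equal, but also treats canonically equivalent
    sequences (precomposed vs. combining characters) as equal."""
    def key(s: str) -> str:
        # NFD, then case-fold, then NFD again, since folding can
        # produce text that is no longer normalized.
        folded = unicodedata.normalize("NFD", s).casefold()
        return unicodedata.normalize("NFD", folded)
    return key(a) == key(b)


if __name__ == "__main__":
    # Simple lowercasing leaves "ß" alone, so it does not match "SS".
    print("SS".lower() == "ß".lower())
    # Full case folding maps both to "ss".
    print(caseless_equal("SS", "ß"))
    # Precomposed "É" vs. "e" followed by COMBINING ACUTE ACCENT.
    print(canonical_caseless_equal("\u00c9", "e\u0301"))
```

The point of the sketch is that plain lowercasing, which is roughly what a
`compare-strings'-based helper would do, misses cases like "SS" vs. "ß",
while full case folding catches them.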

You are pointing to the Python implementation. But there's also an
implementation in GNU libunistring [1] and one in ICU4C <unicode/ustring.h>
[2]. Emacs could surely use one of these.

The implementation from GNU libunistring is also available through Gnulib,
as a set of modules [3]. The most relevant modules are
  unicase/u8-casecmp
  unicase/u8-casecoll
  unicase/u8-casefold
  unicase/u8-casemap
  unicase/u8-casexfrm
  unicase/u8-ct-casefold
  unicase/u8-ct-tolower
  unicase/u8-ct-totitle
  unicase/u8-ct-toupper

Bruno

[1] https://www.gnu.org/software/libunistring/manual/html_node/Case-insensitive-comparison.html
[2] https://unicode-org.github.io/icu/userguide/transforms/casemappings.html
[3] https://www.gnu.org/software/gnulib/MODULES.html