strstr, strcase, strcasestr, and i18n

From: Bruno Haible
Subject: strstr, strcase, strcasestr, and i18n
Date: Fri, 2 Feb 2007 05:17:22 +0100
User-agent: KMail/1.5.4

I wrote:
> I think it's time for me to report a glibc bug on strstr and strcasestr, 
> then...

Paul Eggert wrote:
> But now that you mention it, why is there a c-strstr module, or a
> fancy strstr replacement that looks at multibyte characters?

The situation is indeed a bit messy.

<ctype.h>, strtod, and strtold are locale dependent, but sometimes one
needs the locale independent functionality. That is why we added c-ctype,
c-strtod, and c-strtold.
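
To illustrate what "locale independent" means for the <ctype.h> case, here
is a minimal sketch. It assumes an ASCII-compatible execution character
set, and the names ascii_isalpha and ascii_tolower are hypothetical; they
are not the identifiers used by the c-ctype module.

```c
#include <stdbool.h>

/* Hypothetical helpers, assuming an ASCII-compatible character set.
   Unlike isalpha() and tolower() from <ctype.h>, the result never
   depends on the current LC_CTYPE setting. */

/* True only for the 52 ASCII letters. */
static bool ascii_isalpha(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* Maps 'A'..'Z' to 'a'..'z' and leaves every other value unchanged. */
static int ascii_tolower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}
```

A byte such as 0xC9 is a letter in an ISO-8859-1 locale as far as isalpha()
is concerned, but these helpers always treat it as a non-letter and leave
it untouched.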

I thought this could easily be extended to more str* functions, but the
situation is not so simple. The problematic modules are:

  - strstr: This function's behaviour is not clearly defined. POSIX says
    that it compares a "string" with a "sequence of bytes", which a priori
    is nonsense, since the elements of strings are characters.

  - strcase (strcasecmp, strncasecmp): Here POSIX talks about two strings,
    but doesn't mention LC_CTYPE explicitly. Rather it says the results are
    "unspecified" in real locales. Also, strncasecmp does not make sense for
    multibyte locales, since its length limit counts bytes and can cut a
    multibyte character in the middle.

  - strcasestr: This function is not specified by POSIX. All known legacy
    implementations do not care about multibyte locales.
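
The byte-versus-character problem in the strstr item can be sketched as
follows. The example assumes Shift_JIS, where the katakana letter SO is
encoded as the two bytes 0x83 0x5C and 0x5C on its own is the backslash.
A plain byte-wise strstr therefore reports a backslash at the second byte
of that character. The function sjis_strstr is a hypothetical,
hard-coded illustration of a search that only accepts matches on character
boundaries; it is not the gnulib replacement, and the lead-byte ranges
0x81..0x9F and 0xE0..0xFC are an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: Shift_JIS lead bytes are 0x81..0x9F and 0xE0..0xFC. */
static int is_sjis_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Hypothetical helper: like strstr, but a match is only reported when it
   starts at the beginning of a character, never at a trail byte. */
static const char *sjis_strstr(const char *haystack, const char *needle)
{
    size_t nlen = strlen(needle);
    const char *p = haystack;

    for (;;) {
        if (strncmp(p, needle, nlen) == 0)
            return p;               /* match at a character boundary */
        if (*p == '\0')
            return NULL;            /* end of haystack, no match */
        /* Advance by one whole character: two bytes after a lead byte
           (if a trail byte is present), one byte otherwise. */
        if (is_sjis_lead((unsigned char) *p) && p[1] != '\0')
            p += 2;
        else
            p += 1;
    }
}
```

With the haystack "\x83\x5c" and the needle "\\", strstr returns a pointer
to the second byte, whereas sjis_strstr returns NULL because the haystack
contains no backslash character.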

It was tempting to establish a clear API nomenclature: c-str* for the C
locale emulation, str* for the internationalized functions. But if you're
right about strstr, then we should find new names for the internationalized
versions of these functions.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]