bug-libunistring

Re: [bug-libunistring] case folding output size?


From: Bruno Haible
Subject: Re: [bug-libunistring] case folding output size?
Date: Wed, 28 Apr 2010 02:46:06 +0200
User-agent: KMail/1.9.9

Hi,

Aleksander Morgado wrote:
> Small questions regarding casefolding in UTF-8:
> 
> — Function: uint8_t * u8_casefold (const uint8_t *s, size_t n, const
> char *iso639_language, uninorm_t nf, uint8_t *resultbuf, size_t
> *lengthp)
> 
> What if the resultbuf passed doesn't have enough space for the
> case-folded and normalized string?

This is documented at the end of the doc section "Conventions":
  <http://www.gnu.org/software/libunistring/manual/html_node/Conventions.html>
  "Functions returning a string result take a (resultbuf, lengthp)
   argument pair. If resultbuf is not NULL and the result fits into *lengthp
   units, it is put in resultbuf, and resultbuf is returned. Otherwise, a
   freshly allocated string is returned. In both cases, *lengthp is set to
   the length (number of units) of the returned string. In case of error,
   NULL is returned and errno is set."
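
A minimal sketch of how a caller might handle that convention. The function
demo_ascii_toupper below is a hypothetical stand-in written only for this
illustration (it is not part of libunistring); it mimics the documented
(resultbuf, lengthp) behaviour so that the caller-side pattern can be shown
without linking against the library. The helper demo_result_was_allocated is
likewise hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in, NOT a libunistring function: uppercases ASCII
   letters and follows the (resultbuf, lengthp) convention quoted above.  */
uint8_t *
demo_ascii_toupper (const uint8_t *s, size_t n,
                    uint8_t *resultbuf, size_t *lengthp)
{
  uint8_t *result;

  assert (lengthp != NULL);
  if (resultbuf != NULL && n <= *lengthp)
    result = resultbuf;                /* result fits: use caller's buffer */
  else
    {
      result = malloc (n > 0 ? n : 1); /* otherwise: freshly allocated */
      if (result == NULL)
        {
          errno = ENOMEM;              /* error: NULL and errno set */
          return NULL;
        }
    }
  for (size_t i = 0; i < n; i++)
    result[i] = (s[i] >= 'a' && s[i] <= 'z')
                ? (uint8_t) (s[i] - 'a' + 'A') : s[i];
  *lengthp = n;                        /* set in both cases */
  return result;
}

/* Caller-side pattern.  Returns 0 if the result landed in the stack
   buffer, 1 if it was heap-allocated, -1 on error.  */
int
demo_result_was_allocated (const uint8_t *s, size_t n, size_t capacity)
{
  uint8_t buf[16];
  size_t length = capacity;            /* in: capacity, out: result length */
  uint8_t *result;
  int allocated;

  assert (capacity <= sizeof buf);
  result = demo_ascii_toupper (s, n, buf, &length);
  if (result == NULL)
    return -1;
  allocated = (result != buf);
  if (allocated)
    free (result);                     /* free only what was allocated */
  return allocated;
}
```

The point for the caller is the comparison of the return value with the
buffer that was passed in: the result must be freed only when the two differ.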

> And, if NFC normalization is desired in the output, would it be safe to say
> that the output length will be less than or equal to the input length?

No, it is not. The file tests/test-u8-casefold.c has a couple of examples that
show a case-folded string can be longer than the original string.
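
To make the growth concrete, here is a small sketch with two hard-coded
mappings. They are taken from the full case folding entries in Unicode's
CaseFolding.txt and are not necessarily the cases used in the test file; the
names fold_samples and fold_growth are made up for this illustration. Neither
folded sequence has a precomposed form, so NFC should leave both unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Input and full case folding, both as UTF-8 byte strings.  */
struct fold_sample
{
  const char *input;
  const char *folded;
};

static const struct fold_sample fold_samples[] = {
  /* U+0149 -> U+02BC U+006E: 2 bytes become 3.  */
  { "\xC5\x89", "\xCA\xBC" "n" },
  /* U+0130 -> U+0069 U+0307: 2 bytes become 3.  */
  { "\xC4\xB0", "i" "\xCC\x87" },
};

/* Number of UTF-8 units gained by folding sample number i.  */
long
fold_growth (size_t i)
{
  assert (i < sizeof fold_samples / sizeof fold_samples[0]);
  return (long) strlen (fold_samples[i].folded)
         - (long) strlen (fold_samples[i].input);
}
```

Each sample grows by one unit, so a result buffer sized to the input length
is not guaranteed to be large enough.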

In summary, these Unicode-aware string manipulations have such complex details
that the classical assumptions all fail.

Bruno
