Re: [bug-libunistring] case folding output size?
From: Bruno Haible
Date: Wed, 28 Apr 2010 02:46:06 +0200
User-agent: KMail/1.9.9

Hi,
Aleksander Morgado wrote:
> Small questions regarding casefolding in UTF-8:
>
> — Function: uint8_t * u8_casefold (const uint8_t *s, size_t n, const
> char *iso639_language, uninorm_t nf, uint8_t *resultbuf, size_t
> *lengthp)
>
> What if the resultbuf passed doesn't have enough space for the
> case-folded and normalized string?
This is documented at the end of the doc section "Conventions":
<http://www.gnu.org/software/libunistring/manual/html_node/Conventions.html>
"Functions returning a string result take a (resultbuf, lengthp)
argument pair. If resultbuf is not NULL and the result fits into *lengthp
units, it is put in resultbuf, and resultbuf is returned. Otherwise, a
freshly allocated string is returned. In both cases, *lengthp is set to
the length (number of units) of the returned string. In case of error,
NULL is returned and errno is set."
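
A minimal sketch of what this convention means for the caller. It deliberately
does not call libunistring: toy_expand and demo are hypothetical stand-ins
written only for this illustration, and toy_expand merely rewrites U+0149 as
U+02BC U+006E so that the output can outgrow the input. The caller-side part
(compare the returned pointer with the buffer you passed, free it only if it
differs) is the pattern the quoted paragraph implies.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in, NOT a libunistring function.  Copies s, replacing
   each U+0149 (bytes C5 89) by U+02BC U+006E (bytes CA BC 6E), and follows
   the (resultbuf, lengthp) convention quoted above.  */
static uint8_t *
toy_expand (const uint8_t *s, size_t n, uint8_t *resultbuf, size_t *lengthp)
{
  /* First pass: compute how many bytes the result needs.  */
  size_t needed = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (i + 1 < n && s[i] == 0xC5 && s[i + 1] == 0x89)
        {
          needed += 3;
          i++;
        }
      else
        needed += 1;
    }

  /* Use the caller's buffer only if it is non-NULL and large enough.  */
  uint8_t *out;
  if (resultbuf != NULL && needed <= *lengthp)
    out = resultbuf;
  else
    {
      out = malloc (needed > 0 ? needed : 1);
      if (out == NULL)
        {
          errno = ENOMEM;
          return NULL;
        }
    }

  /* Second pass: write the result.  */
  size_t j = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (i + 1 < n && s[i] == 0xC5 && s[i + 1] == 0x89)
        {
          out[j++] = 0xCA;
          out[j++] = 0xBC;
          out[j++] = 0x6E;
          i++;
        }
      else
        out[j++] = s[i];
    }
  assert (j == needed);

  /* In both cases, report the length of the returned string.  */
  *lengthp = needed;
  return out;
}

/* Caller side.  Returns 1 if the result fit into the 4-byte stack buffer,
   0 if a freshly allocated string was returned, -1 on error.  */
static int
demo (const uint8_t *s, size_t n, size_t *out_len)
{
  uint8_t buf[4];
  size_t len = sizeof buf;
  uint8_t *res = toy_expand (s, n, buf, &len);
  if (res == NULL)
    return -1;
  *out_len = len;
  int in_buf = (res == buf);
  if (!in_buf)
    free (res);   /* only free what the callee allocated */
  return in_buf;
}
```

The point of the sketch is that a too-small resultbuf is not an error: the
caller still gets a complete result, just in allocated memory that it has to
free.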
> And, if NFC normalization desired in the output, would it be safe to say
> that the output length will be less or equal than the input length?
No, it is not safe to assume that. The file tests/test-u8-casefold.c has a
couple of examples showing that a case-folded string can be longer than the
original string.
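
To make the growth concrete, here is a small sketch that only counts UTF-8
bytes; it does not call libunistring. The two mappings are the full case
foldings of U+0149 and U+0130 as listed in Unicode's CaseFolding.txt, and
they are not claimed to be the cases used in the test file. The helper names
utf8_len and utf8_total are made up for this illustration. Neither folded
sequence has a canonical precomposed form, so NFC should leave both at three
bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes UTF-8 needs for one code point
   (assumes a valid Unicode scalar value).  */
static size_t
utf8_len (uint32_t cp)
{
  if (cp < 0x80)
    return 1;
  if (cp < 0x800)
    return 2;
  if (cp < 0x10000)
    return 3;
  return 4;
}

/* Total UTF-8 length of a sequence of code points.  */
static size_t
utf8_total (const uint32_t *cps, size_t count)
{
  size_t total = 0;
  for (size_t i = 0; i < count; i++)
    total += utf8_len (cps[i]);
  return total;
}

/* U+0149 folds to U+02BC U+006E: 2 bytes become 3.  */
static const uint32_t orig_0149[] = { 0x0149 };
static const uint32_t fold_0149[] = { 0x02BC, 0x006E };

/* U+0130 folds to U+0069 U+0307: 2 bytes become 3.  */
static const uint32_t orig_0130[] = { 0x0130 };
static const uint32_t fold_0130[] = { 0x0069, 0x0307 };
```

Comparing utf8_total (orig_0149, 1) with utf8_total (fold_0149, 2), and
likewise for U+0130, shows the output exceeding the input, so a buffer sized
to the input length is not guaranteed to be enough.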
In summary, these Unicode-aware string manipulations have such complex details
that the classical assumptions all fail.
Bruno