Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.
From: John Kearney
Subject: Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.
Date: Tue, 21 Feb 2012 14:43:20 +0100
User-agent: Mozilla/5.0 (X11; Linux i686; rv:10.0) Gecko/20120129 Thunderbird/10.0
On 02/21/2012 01:34 PM, Eric Blake wrote:
> On 02/20/2012 07:42 PM, Chet Ramey wrote:
>> On 2/18/12 5:39 AM, John Kearney wrote:
>>
>>> Bash Version: 4.2 Patch Level: 10 Release Status: release
>>>
>>> Description: The current u32toutf8 only encodes values below
>>> 0xffff correctly. wchar_t can be of ambiguous size; better, in my
>>> opinion, to use unsigned long, or uint32_t, or something clearer.
>>
>> Thanks for the patch. It's good to have a complete
>> implementation, though as a practical matter you won't see UTF-8
>> characters longer than four bytes. I agree with you about the
>> unsigned 32-bit int type; wchar_t is signed, even if it's 32
>> bits, on several systems I use.
>
> Not only can wchar_t be either signed or unsigned, you also
> have to worry about platforms where it is only 16 bits, such as
> cygwin; on the other hand, wint_t is always 32 bits, but you still
> have the issue that it can be either signed or unsigned.
>
Signed/unsigned isn't really the problem anyway: UTF-8 only encodes
up to 0x7fff ffff and UTF-16 only encodes up to 0x0010 ffff, so neither
needs the top bit of a 32-bit value.
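To make the UTF-8 range concrete, here is a minimal sketch of an encoder covering the original 1 to 6 byte scheme, which tops out at 0x7fff ffff. The name utf8_encode_sketch and its signature are assumptions made for illustration; this is not the u32toutf8 code from bash's unicode.c or from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT bash's u32toutf8: encode code point c into
   buf (room for at least 7 bytes) using the original 1-6 byte UTF-8
   scheme.  Returns the number of bytes written, or 0 if c is larger
   than 0x7fffffff and therefore not encodable. */
size_t
utf8_encode_sketch (uint32_t c, unsigned char *buf)
{
  size_t len, i;
  unsigned char lead;

  if (c < 0x80)             { len = 1; lead = 0x00; }
  else if (c < 0x800)       { len = 2; lead = 0xc0; }
  else if (c < 0x10000)     { len = 3; lead = 0xe0; }
  else if (c < 0x200000)    { len = 4; lead = 0xf0; }
  else if (c < 0x4000000)   { len = 5; lead = 0xf8; }
  else if (c <= 0x7fffffff) { len = 6; lead = 0xfc; }
  else
    return 0;

  /* Fill the continuation bytes from the end, 6 payload bits each. */
  for (i = len - 1; i > 0; i--)
    {
      buf[i] = (unsigned char) (0x80 | (c & 0x3f));
      c >>= 6;
    }

  /* Whatever bits remain go into the lead byte. */
  buf[0] = (unsigned char) (lead | c);
  buf[len] = '\0';
  return len;
}
```

With an unsigned 32-bit argument the only value range that has to be rejected is the one with the top bit set, which is why the signedness of the input type matters less than its width.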
In my latest version I've pretty much removed all references to wchar_t
in unicode.c. It was unnecessary.
However, I would be interested in something like utf16_t or uint16_t;
I'm currently using unsigned short, which is inelegant but works.
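As a rough illustration of the 16-bit case, the sketch below uses uint16_t from <stdint.h> (C99) for the UTF-16 code units. The function name utf16_encode_sketch is hypothetical and the code is not taken from the patch; it only shows how values above 0xffff split into a surrogate pair and why 0x0010 ffff is the ceiling.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode code point c as UTF-16 into out (room
   for 2 units).  Returns the number of 16-bit units written, or 0 if
   c is a surrogate value or lies above 0x10ffff. */
size_t
utf16_encode_sketch (uint32_t c, uint16_t *out)
{
  if (c >= 0xd800 && c <= 0xdfff)
    return 0;                   /* reserved for surrogates */
  if (c < 0x10000)
    {
      out[0] = (uint16_t) c;    /* fits in a single unit */
      return 1;
    }
  if (c > 0x10ffff)
    return 0;                   /* beyond what UTF-16 can represent */

  /* Split the remaining 20 bits into high and low surrogates. */
  c -= 0x10000;
  out[0] = (uint16_t) (0xd800 | (c >> 10));
  out[1] = (uint16_t) (0xdc00 | (c & 0x3ff));
  return 2;
}
```

On platforms where uint16_t is unavailable, unsigned short is guaranteed to be at least 16 bits wide, so it remains a workable fallback for holding the units.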