RE: unicode support in flex
From: Hans Aberg
Subject: RE: unicode support in flex
Date: Thu, 24 Jan 2002 19:10:07 +0100

At 14:19 +0000 2002/01/24, Mark Weaver wrote:
>OK, I posted recently about having integrated the unicode patch into a
>re-entrant flex.
...
>1) Flex currently uses tables on the stack that are the size of the number
>of characters in the largest supported character set. With the unicode
>patch, this goes up to 64K characters from 256, giving you a much larger
>stack requirement (it certainly exceeds 2Mb, and I'm guessing about 4Mb).
>This causes a default Win32 compile...
...
>3) As to if this is the correct method of supporting unicode, well possibly
>not. It works fine for 16-bit unicode character sets, but wouldn't work for
>the (rarely used) 32-bit character set. This isn't in common use however
>(cf xml.apache project which supports only UTF-16).

Which platform are you using? From discussions in comp.std.c++, it seems
that mainly the backward MS compilers use UTF-16; Linux was said to use
UTF-32, as I recall.

Many important characters, including mathematical and technical symbols,
lie outside the 2^16 range. Using variable-width characters internally in
C/C++ programs would be too complicated (and time-consuming); it is better
to convert to a single-width encoding before internal processing. The next
natural alignment above 16 bits is then 32 bits, i.e. UTF-32.
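
As a rough illustration of the "convert to single width first" step, here
is a minimal sketch in C that decodes UTF-16 code units into UTF-32 code
points. The function name utf16_to_utf32 and the choice to replace unpaired
surrogates with U+FFFD are assumptions made for this example; they are not
taken from flex or from the unicode patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode n UTF-16 code units from `in` into UTF-32
 * code points in `out`. `out` must have room for at least n entries.
 * Unpaired surrogates become U+FFFD (an assumption of this sketch).
 * Returns the number of code points written. */
size_t utf16_to_utf32(const uint16_t *in, size_t n, uint32_t *out)
{
    size_t i = 0, count = 0;

    assert(in != NULL && out != NULL);
    while (i < n) {
        uint32_t u = in[i++];

        if (u >= 0xD800 && u <= 0xDBFF && i < n &&
            in[i] >= 0xDC00 && in[i] <= 0xDFFF) {
            /* High surrogate followed by low surrogate: combine them. */
            u = 0x10000 + ((u - 0xD800) << 10) + (uint32_t)(in[i++] - 0xDC00);
        } else if (u >= 0xD800 && u <= 0xDFFF) {
            /* Surrogate without a valid partner. */
            u = 0xFFFD;
        }
        out[count++] = u;
    }
    return count;
}
```

After this step every character occupies exactly one 32-bit element, so the
scanner's inner loop can index by character without looking ahead.
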

So for proper Unicode use in C/C++ programs, this points towards UTF-32.
Some table compression technique must then be used, since tables indexed
directly by character would become far too large.

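
One possible compression technique, sketched here only as an illustration,
is a two-stage table: the code point range is cut into 256-entry blocks,
identical blocks are stored once, and a small index maps each block number
to its stored copy. All names below (two_stage_table, two_stage_build,
two_stage_lookup, demo_class) and the limit of 64 distinct blocks are
hypothetical; this is not how flex or the unicode patch stores its tables.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BITS   8
#define BLOCK_SIZE   (1u << BLOCK_BITS)
#define MAX_CP       0x110000u                 /* one past U+10FFFF */
#define NUM_BLOCKS   (MAX_CP >> BLOCK_BITS)    /* 4352 blocks */
#define MAX_DISTINCT 64u                       /* assumed capacity */

typedef struct {
    uint8_t  index[NUM_BLOCKS];                /* stage 1: block -> slot */
    uint8_t  blocks[MAX_DISTINCT][BLOCK_SIZE]; /* stage 2: shared blocks */
    unsigned nblocks;                          /* slots in use */
} two_stage_table;

/* Example classifier (an assumption): 1 = ASCII letter, 2 = ASCII digit,
 * 3 = Mathematical Alphanumeric Symbols block, 0 = everything else. */
uint8_t demo_class(uint32_t cp)
{
    if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z')) return 1;
    if (cp >= '0' && cp <= '9') return 2;
    if (cp >= 0x1D400 && cp <= 0x1D7FF) return 3;
    return 0;
}

/* Fill the table from a classifier, sharing identical blocks.
 * Returns 0 on success, -1 if more than MAX_DISTINCT blocks are needed. */
int two_stage_build(two_stage_table *t, uint8_t (*classify)(uint32_t))
{
    uint8_t block[BLOCK_SIZE];
    uint32_t b, i;
    unsigned k;

    assert(t != NULL && classify != NULL);
    t->nblocks = 0;
    for (b = 0; b < NUM_BLOCKS; b++) {
        for (i = 0; i < BLOCK_SIZE; i++)
            block[i] = classify((b << BLOCK_BITS) | i);
        /* Linear search for an identical block already stored. */
        for (k = 0; k < t->nblocks; k++)
            if (memcmp(t->blocks[k], block, BLOCK_SIZE) == 0)
                break;
        if (k == t->nblocks) {
            if (t->nblocks == MAX_DISTINCT)
                return -1;
            memcpy(t->blocks[t->nblocks++], block, BLOCK_SIZE);
        }
        t->index[b] = (uint8_t)k;
    }
    return 0;
}

/* Two array reads per character, regardless of the code point. */
uint8_t two_stage_lookup(const two_stage_table *t, uint32_t cp)
{
    assert(t != NULL && cp < MAX_CP);
    return t->blocks[t->index[cp >> BLOCK_BITS]][cp & (BLOCK_SIZE - 1)];
}
```

With this layout the structure occupies roughly 20 KB, against about 1.1 MB
for a flat one-byte-per-code-point table, because most blocks are identical
and are stored only once.
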
Hans Aberg