help-flex

RE: unicode support in flex


From: Hans Aberg
Subject: RE: unicode support in flex
Date: Thu, 24 Jan 2002 19:10:07 +0100

At 14:19 +0000 2002/01/24, Mark Weaver wrote:
>OK, I posted recently about having integrated the unicode patch into a
>re-entrant flex.
...
>1) Flex currently uses tables on the stack that are the size of the number
>of characters in the largest supported character set.  With the unicode
>patch, this goes up to 64K characters from 256, giving you a much larger
>stack requirement (it certainly exceeds 2Mb, and I'm guessing about 4Mb).
>This causes a default Win32 compile...
...
>3) As to if this is the correct method of supporting unicode, well possibly
>not.  It works fine for 16-bit unicode character sets, but wouldn't work for
>the (rarely used) 32-bit character set.  This isn't in common use however
>(cf xml.apache project which supports only UTF-16).

Which platform are you using? From discussions in comp.std.c++, it seems
that it is mainly backwards MS compilers that are using UTF-16. Linux was
said to use UTF-32, I recall.

Many important characters, including mathematical and technical symbols,
lie outside the 2^16 range. Using variable-width characters internally in
C/C++ programs is too complicated (and time consuming); it is better to
convert to a fixed-width encoding before internal processing. The next
natural width above 16 bits is then 32 bits, that is, UTF-32.
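
To illustrate the "convert to single width before internal processing"
step, here is a minimal sketch in C that decodes UTF-16 code units into
UTF-32 code points. The function name utf16_to_utf32 is made up for this
example, and replacing unpaired surrogates with U+FFFD is an assumption on
my part, not something flex or the unicode patch prescribes.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode n UTF-16 code units from `in` into UTF-32 code points in `out`.
 * `out` must have room for at least n entries (the output is never longer
 * than the input). Returns the number of code points written.
 * Assumption: an unpaired surrogate is replaced with U+FFFD. */
size_t utf16_to_utf32(const uint16_t *in, size_t n, uint32_t *out)
{
    size_t i = 0, k = 0;

    while (i < n) {
        uint32_t u = in[i++];

        if (u >= 0xD800 && u <= 0xDBFF &&
            i < n && in[i] >= 0xDC00 && in[i] <= 0xDFFF) {
            /* High surrogate followed by a low surrogate: combine them
             * into one code point above U+FFFF. */
            u = 0x10000 + ((u - 0xD800) << 10) + (in[i++] - 0xDC00u);
        } else if (u >= 0xD800 && u <= 0xDFFF) {
            /* Surrogate without a valid partner. */
            u = 0xFFFD;
        }
        out[k++] = u;
    }
    return k;
}
```

For example, the pair 0xD835 0xDC00 decodes to the single code point
U+1D400, one of the math characters outside the 2^16 range. After this
step every character occupies exactly one array element.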

So for proper Unicode use in C/C++ programs, this points towards UTF-32.
Tables with one entry per character are then no longer feasible, so some
table compression technique must be used.
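
One possible compression technique, sketched here in C, is a two-stage
table: the code point range is cut into blocks of 256 entries, identical
blocks are stored only once, and a small index maps each block number to
its stored copy. This is only an illustration of the idea, not what flex
or the unicode patch does. All names (two_stage, two_stage_build,
two_stage_lookup, demo_classify) and the limit MAX_DISTINCT are
hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_BITS   8
#define BLOCK_SIZE   (1u << BLOCK_BITS)      /* 256 entries per block   */
#define MAX_CP       0x110000u               /* one past U+10FFFF       */
#define NUM_BLOCKS   (MAX_CP >> BLOCK_BITS)  /* 4352 blocks             */
#define MAX_DISTINCT 64                      /* assumed cap, for sketch */

typedef struct {
    uint16_t index[NUM_BLOCKS];                /* stage 1: block -> slot */
    uint8_t  blocks[MAX_DISTINCT][BLOCK_SIZE]; /* stage 2: shared blocks */
    unsigned n_blocks;                         /* slots actually in use  */
} two_stage;

/* Hypothetical classifier: 1 = ASCII letter, 2 = ASCII digit,
 * 3 = U+1D400..U+1D7FF (mathematical alphanumerics), 0 = other. */
uint8_t demo_classify(uint32_t cp)
{
    if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z')) return 1;
    if (cp >= '0' && cp <= '9') return 2;
    if (cp >= 0x1D400 && cp <= 0x1D7FF) return 3;
    return 0;
}

/* Fill the table from a classifier, sharing identical blocks.
 * Returns 0 on success, -1 if more than MAX_DISTINCT blocks are needed. */
int two_stage_build(two_stage *t, uint8_t (*classify)(uint32_t))
{
    uint8_t tmp[BLOCK_SIZE];

    t->n_blocks = 0;
    for (uint32_t b = 0; b < NUM_BLOCKS; b++) {
        unsigned j;

        for (uint32_t i = 0; i < BLOCK_SIZE; i++)
            tmp[i] = classify((b << BLOCK_BITS) | i);

        /* Reuse an already stored block with the same contents. */
        for (j = 0; j < t->n_blocks; j++)
            if (memcmp(t->blocks[j], tmp, BLOCK_SIZE) == 0)
                break;

        if (j == t->n_blocks) {
            if (t->n_blocks == MAX_DISTINCT)
                return -1;
            memcpy(t->blocks[t->n_blocks++], tmp, BLOCK_SIZE);
        }
        t->index[b] = (uint16_t)j;
    }
    return 0;
}

/* Two array accesses per lookup; out-of-range input maps to class 0. */
uint8_t two_stage_lookup(const two_stage *t, uint32_t cp)
{
    if (cp >= MAX_CP)
        return 0;
    return t->blocks[t->index[cp >> BLOCK_BITS]][cp & (BLOCK_SIZE - 1)];
}
```

With the demo classifier only three distinct blocks are stored (the ASCII
block, an all-zero block, and an all-3 block), so the table takes a few
kilobytes instead of one byte for each of the 0x110000 code points. The
structure is still too large for comfort on the stack, so it is better
declared static or allocated on the heap.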

  Hans Aberg




