RE: unicode support in flex

help-flex

From:	Hans Aberg
Subject:	RE: unicode support in flex
Date:	Thu, 24 Jan 2002 19:10:07 +0100

At 14:19 +0000 2002/01/24, Mark Weaver wrote:
>OK, I posted recently about having integrated the unicode patch into a
>re-entrant flex.
...
>1) Flex currently uses tables on the stack that are the size of the number
>of characters in the largest supported character set.  With the unicode
>patch, this goes up to 64K characters from 256, giving you a much larger
>stack requirement (it certainly exceeds 2Mb, and I'm guessing about 4Mb).
>This causes a default Win32 compile...
...
>3) As to if this is the correct method of supporting unicode, well possibly
>not.  It works fine for 16-bit unicode character sets, but wouldn't work for
>the (rarely used) 32-bit character set.  This isn't in common use however
>(cf xml.apache project which supports only UTF-16).

Which platform are you using? From discussions in comp.std.c++, it seems
that it is mainly the backwards MS compilers that use UTF-16; Linux was
said to use UTF-32, I recall.

Many important characters, including mathematical and technical symbols,
lie outside the 2^16 range. It would be too complicated (and time-consuming)
to handle variable-width characters internally in C/C++ programs; it is
better to convert to a fixed width before internal processing. The next
natural width after 16 bits is then 32 bits, i.e. UTF-32.
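
To illustrate the "convert first, process afterwards" step, here is a
minimal sketch in C that decodes UTF-16 input into fixed-width UTF-32
before any scanning takes place. It is not taken from flex or from the
unicode patch; the function name utf16_to_utf32 and the choice to replace
unpaired surrogates with U+FFFD are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHAR 0xFFFDu  /* U+FFFD, used here for malformed input */

/* Hypothetical helper: decode n UTF-16 code units into UTF-32 code points.
   'out' must have room for at least n entries (the output is never longer
   than the input). Returns the number of code points written. */
size_t utf16_to_utf32(const uint16_t *in, size_t n, uint32_t *out)
{
    size_t i = 0, count = 0;

    while (i < n) {
        uint32_t u = in[i++];

        if (u >= 0xD800 && u <= 0xDBFF && i < n &&
            in[i] >= 0xDC00 && in[i] <= 0xDFFF) {
            /* High surrogate followed by low surrogate: combine them into
               one code point in the range U+10000..U+10FFFF. */
            u = 0x10000 + ((u - 0xD800) << 10) + (in[i++] - 0xDC00);
        } else if (u >= 0xD800 && u <= 0xDFFF) {
            /* Unpaired surrogate: substitute the replacement character. */
            u = REPLACEMENT_CHAR;
        }
        out[count++] = u;
    }
    return count;
}
```

After this step every character occupies exactly one array element, so
the rest of the program can index and compare characters without knowing
anything about surrogate pairs.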

So for proper Unicode use in C/C++ programs, this points towards UTF-32.
Since tables indexed directly by character would then be far too large,
some table compression technique must be used.
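
One common compression technique for this kind of table is a two-stage
lookup: split the code point range into fixed-size blocks and store each
distinct block only once. The sketch below, in C, shows the idea for a
table of one byte per code point. It is only an illustration, not flex's
own table scheme; the names two_stage_table, tst_build, tst_lookup and
tst_free, and the block size of 256, are assumptions made for the example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CP      0x110000u              /* code points are below this  */
#define BLOCK_BITS  8
#define BLOCK_SIZE  (1u << BLOCK_BITS)     /* 256 entries per block       */
#define NUM_BLOCKS  (MAX_CP >> BLOCK_BITS) /* 4352 blocks in total        */

typedef struct {
    uint16_t index[NUM_BLOCKS]; /* stage 1: block number -> unique block  */
    uint8_t *blocks;            /* stage 2: unique blocks, stored once    */
    size_t   num_unique;        /* how many unique blocks are in use      */
} two_stage_table;

/* Build the compressed table from a flat array of MAX_CP bytes.
   Returns 0 on success, -1 if memory allocation fails. */
int tst_build(two_stage_table *t, const uint8_t *flat)
{
    t->blocks = malloc((size_t)NUM_BLOCKS * BLOCK_SIZE); /* worst case */
    if (t->blocks == NULL)
        return -1;
    t->num_unique = 0;

    for (size_t b = 0; b < NUM_BLOCKS; b++) {
        const uint8_t *src = flat + b * BLOCK_SIZE;
        size_t k;

        /* Look for an identical block that has already been stored. */
        for (k = 0; k < t->num_unique; k++)
            if (memcmp(t->blocks + k * BLOCK_SIZE, src, BLOCK_SIZE) == 0)
                break;

        if (k == t->num_unique) {          /* not seen before: store it */
            memcpy(t->blocks + k * BLOCK_SIZE, src, BLOCK_SIZE);
            t->num_unique++;
        }
        t->index[b] = (uint16_t)k;
    }
    return 0;
}

/* Look up the byte stored for code point cp; out-of-range input gives 0. */
uint8_t tst_lookup(const two_stage_table *t, uint32_t cp)
{
    if (cp >= MAX_CP)
        return 0;
    return t->blocks[(size_t)t->index[cp >> BLOCK_BITS] * BLOCK_SIZE
                     + (cp & (BLOCK_SIZE - 1))];
}

void tst_free(two_stage_table *t)
{
    free(t->blocks);
    t->blocks = NULL;
    t->num_unique = 0;
}
```

The flat array is needed only while the table is being built. A lookup
costs two array accesses, and the memory used is the stage 1 index plus
one copy of each distinct block, which is small whenever most of the code
space shares the same values.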

  Hans Aberg
