Re: UTF-8 Flex

From: Hans Aberg
Subject: Re: UTF-8 Flex
Date: Tue, 11 Jan 2005 01:14:49 +0100
User-agent: Microsoft-Outlook-Express-Macintosh-Edition/5.0.6

Thinking a bit more about it: in a UTF-8 mode, one may want to parse UTF-8
input, but that input may well be mixed with other data. So the design might
be something like this.

Keep the \x.. escape and the "." character range as they are, i.e. operating
on single bytes. To this, add a \u........ escape for indicating Unicode
numbers, as well as UTF-8 analogues of ".". These, as well as the extended
character ranges, should exclude prohibited UTF-8 ranges. In addition, the
lexer should let one add error messages for overlong and prohibited UTF-8
sequences. So some construct should be added to catch these.
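
To make the idea concrete, here is a rough sketch of what such rules could
expand to at the byte level. It is written in Python with byte regular
expressions, not in Flex syntax, and it is not a proposal for the actual Flex
implementation. The names UTF8_CHAR, OVERLONG, PROHIBITED and scan are
hypothetical and chosen only for this illustration. The byte ranges follow the
usual table of well-formed UTF-8 sequences (no overlong forms, no surrogates,
nothing above U+10FFFF).

```python
import re

# UTF-8 analogue of ".": exactly one well-formed UTF-8 sequence.
# (Unlike Flex's ".", this sketch does not exclude the newline byte.)
UTF8_CHAR = re.compile(
    rb"[\x00-\x7F]"                          # 1 byte:  U+0000..U+007F
    rb"|[\xC2-\xDF][\x80-\xBF]"              # 2 bytes: U+0080..U+07FF
    rb"|\xE0[\xA0-\xBF][\x80-\xBF]"          # 3 bytes, overlong forms excluded
    rb"|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}"   # 3 bytes, general case
    rb"|\xED[\x80-\x9F][\x80-\xBF]"          # 3 bytes, surrogates excluded
    rb"|\xF0[\x90-\xBF][\x80-\xBF]{2}"       # 4 bytes, overlong forms excluded
    rb"|[\xF1-\xF3][\x80-\xBF]{3}"           # 4 bytes, general case
    rb"|\xF4[\x80-\x8F][\x80-\xBF]{2}"       # 4 bytes, up to U+10FFFF
)

# Overlong encodings: a longer sequence than the code point needs.
OVERLONG = re.compile(
    rb"[\xC0\xC1][\x80-\xBF]"
    rb"|\xE0[\x80-\x9F][\x80-\xBF]"
    rb"|\xF0[\x80-\x8F][\x80-\xBF]{2}"
)

# Prohibited code points: surrogates and values above U+10FFFF.
PROHIBITED = re.compile(
    rb"\xED[\xA0-\xBF][\x80-\xBF]"
    rb"|\xF4[\x90-\xBF][\x80-\xBF]{2}"
    rb"|[\xF5-\xF7][\x80-\xBF]{3}"
)

# Rule order matters, as in a lexer: the first matching rule wins.
RULES = [("char", UTF8_CHAR), ("overlong", OVERLONG), ("prohibited", PROHIBITED)]


def scan(data: bytes):
    """Split data into (kind, bytes) tokens; unmatched bytes become 'stray'."""
    tokens = []
    pos = 0
    while pos < len(data):
        for kind, pattern in RULES:
            match = pattern.match(data, pos)
            if match:
                tokens.append((kind, match.group()))
                pos = match.end()
                break
        else:
            # Not UTF-8 at all: this is the "mixed with other data" case,
            # which a plain \x.. or "." rule could still pick up.
            tokens.append(("stray", data[pos:pos + 1]))
            pos += 1
    return tokens
```

A lexer built this way could attach a specific error message to the
"overlong" and "prohibited" tokens, while "stray" bytes remain available to
ordinary byte-oriented rules.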

  Hans Aberg
