Re: UTF-8 Flex

From: Hans Aberg
Subject: Re: UTF-8 Flex
Date: Tue, 11 Jan 2005 01:14:49 +0100
Thinking a bit more about it: in a UTF-8 mode, one may want to parse UTF-8
input, but that input may well be mixed with other data. So the design might
be something like this.

Keep the \x.. escape and the "." character range as they are. To these, add
a \u........ escape for indicating Unicode numbers, as well as UTF-8
analogues of ".". These, as well as extended character ranges, should exclude
prohibited UTF-8 ranges. In addition, the lexer should let one add error
messages for overlong and prohibited UTF-8 sequences, so some construct
should be added to catch these.
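
To make the distinction concrete, here is a minimal sketch in Python (not
Flex code) of the checks such a construct would have to perform. The function
name classify_utf8 and the status strings are hypothetical, chosen only for
illustration. "Prohibited" is taken here to mean surrogates (U+D800..U+DFFF)
and values above U+10FFFF, which is an assumption about what the excluded
ranges would be.

```python
def classify_utf8(data: bytes, pos: int = 0):
    """Classify the byte sequence starting at data[pos].

    Returns (status, length, code_point), where status is one of
    "ok", "overlong", "prohibited" or "malformed" (hypothetical labels).
    """
    b0 = data[pos]
    if b0 < 0x80:
        return ("ok", 1, b0)  # plain ASCII byte

    # The lead byte fixes the sequence length and the smallest code point
    # that legitimately needs that many bytes.
    if 0xC0 <= b0 <= 0xDF:
        n, cp, min_cp = 2, b0 & 0x1F, 0x80
    elif 0xE0 <= b0 <= 0xEF:
        n, cp, min_cp = 3, b0 & 0x0F, 0x800
    elif 0xF0 <= b0 <= 0xF7:
        n, cp, min_cp = 4, b0 & 0x07, 0x10000
    else:
        # Stray continuation byte or 0xF8..0xFF: not UTF-8 at all, so a
        # scanner for mixed data could fall back to treating it as one byte.
        return ("malformed", 1, None)

    tail = data[pos + 1 : pos + n]
    if len(tail) != n - 1 or any((b & 0xC0) != 0x80 for b in tail):
        return ("malformed", 1, None)  # truncated or bad continuation byte

    for b in tail:
        cp = (cp << 6) | (b & 0x3F)

    if cp < min_cp:
        return ("overlong", n, cp)  # encoded with more bytes than needed
    if 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
        return ("prohibited", n, cp)  # assumed prohibited ranges, see above
    return ("ok", n, cp)
```

For example, classify_utf8(b"\xc0\x80") reports an overlong encoding of
U+0000, and classify_utf8(b"\xed\xa0\x80") reports the prohibited surrogate
U+D800. A lexer could attach a separate error message to each status.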

  Hans Aberg

