Re: Character error not reported

From: Hans Åberg
Subject: Re: Character error not reported
Date: Tue, 18 Jun 2019 18:09:08 +0200

> On 17 Jun 2019, at 18:06, Akim Demaille <address@hidden> wrote:
> Hi Hans,


>> On 17 Jun 2019, at 15:12, Hans Åberg <address@hidden> wrote:
>> When the input contains a byte with the high bit set that is not used in
>> the grammar, the parser generated by Bison 3.4.1 does not report an error;
>> it reports one only if the high bit is not set.
> This is hard to believe.  I suspect your problem is elsewhere.
>> This occurs if one sets a Flex default rule
>> . { return yytext[0]; }
>> and the lexer finds a stray UTF-8 byte.
> I would say that here, you return a char (yytext[0]) with "a high bit set", 
> on an architecture where char is signed, so you are actually returning a 
> negative int (when the 8th bit is set).  And for Bison, any negative token 
> number stands for end-of-file.

Indeed, likely the case.
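
To make the conversion concrete, here is a minimal C sketch of what happens to such a byte. It assumes a platform where plain char is signed (this is implementation-defined) and models that with an explicit signed char. The helper names are hypothetical and not part of Flex or Bison; 0xC3 is just an example, the lead byte of "Å" in UTF-8.

```c
#include <assert.h>

/* Hypothetical helpers, not part of Flex or Bison. `signed char` is used
   explicitly to model a platform where plain `char` is signed. */

/* Models `return yytext[0];`: the byte is sign-extended to int. */
int token_sign_extended(signed char byte) { return byte; }

/* Models `return (unsigned char) yytext[0];`: the value stays in 0..255. */
int token_zero_extended(signed char byte) { return (unsigned char) byte; }

/* -61 is the signed char that shares its bit pattern with 0xC3 (195). */
void check_examples(void) {
    assert(token_sign_extended(-61) == -61);  /* negative: treated as end of input */
    assert(token_zero_extended(-61) == 195);  /* a character code in the 0..255 range */
    assert(token_sign_extended('a') == 'a');  /* ASCII bytes are unaffected */
}
```

With the cast, the parser sees 195, an ordinary character code that it can reject if the grammar does not use it, rather than a negative value that it takes for end of input.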

> You should actually write:
> . { return (unsigned char) yytext[0]; }

As 8-bit character tokens are not useful with UTF-8, I have replaced the
default rule with one that returns a dedicated error token. In the grammar:

  %token token_error "token error"

and in the lexer:

  . { return my_parser::token::token_error; }

Please let me know if there is a better way to generate a parser error.
