Re: [PATCH 0/3] yacc: compute the best type for the state number

From: Paul Eggert
Subject: Re: [PATCH 0/3] yacc: compute the best type for the state number
Date: Tue, 1 Oct 2019 15:43:43 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0

On 10/1/19 11:40 AM, Kaz Kylheku wrote:
> That said, parser states are more of an enumeration. They are identifiers.
> We shouldn't be doing any math on them of this sort.

That may be true in theory, but not in practice. For example, the GLR skeleton's yyLRgotoState has this line:

  int yyr = yypgoto[yysym - YYNTOKENS] + yystate;

which is doing arithmetic on state numbers. If we use signed arithmetic for this sort of thing, we can catch state-number arithmetic overflow automatically at runtime with GCC. Currently we can't so easily catch such errors.
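To illustrate the signed-versus-unsigned point, here is a minimal sketch. It is not taken from the Bison skeletons, and the helper names goto_index, goto_index_unsigned and overflow_demo are hypothetical. The message does not say which GCC feature is meant; options such as -ftrapv or -fsanitize=signed-integer-overflow trap on signed overflow at run time, and the sketch makes the same check explicit with the GCC/Clang extension __builtin_add_overflow. The unsigned version has nothing to detect, because unsigned arithmetic is defined to wrap.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from the GLR skeleton): add a goto-table
   offset to a state number in signed arithmetic.  Returns true and
   stores the sum on success, false if the int sum would overflow.
   __builtin_add_overflow is a GCC/Clang extension.  */
static bool
goto_index (int offset, int state, int *result)
{
  return !__builtin_add_overflow (offset, state, result);
}

/* The same sum in unsigned arithmetic: wraparound is well defined,
   so there is no overflow for the compiler or a sanitizer to flag.  */
static unsigned int
goto_index_unsigned (unsigned int offset, unsigned int state)
{
  return offset + state;
}

/* Small self-check showing both behaviors.  */
static void
overflow_demo (void)
{
  int r;
  assert (goto_index (3, 4, &r) && r == 7);
  assert (!goto_index (INT_MAX, 1, &r));            /* overflow detected */
  assert (goto_index_unsigned (UINT_MAX, 1u) == 0u); /* silent wrap */
}
```

Building the plain expression `offset + state` on int operands with one of the GCC options above would give a comparable diagnostic without the explicit helper.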

> The unsigned types provide a greater range for the possible states,
> without having to use negative values, so they are understandably an
> attractive tool for combating the problem of running out of states.

That is a practical issue for narrower-than-int types, so we may want to continue to use unsigned types in large arrays of narrower-than-int integers. This is reasonably safe since these integers promote to int and so avoid most of the unsigned-arithmetic problems. It could be the subject of another patch (a relatively small one, I think).
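A short sketch of why the narrow unsigned types are comparatively safe, assuming a platform where int can represent every unsigned char value (true of mainstream targets). The function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* Both unsigned char operands promote to int before the subtraction,
   so the result is an ordinary signed difference.  */
static int
narrow_diff (unsigned char a, unsigned char b)
{
  return a - b;
}

/* With unsigned int operands no promotion to int takes place: the
   difference wraps modulo UINT_MAX + 1 and is never negative.  */
static unsigned int
wide_diff (unsigned int a, unsigned int b)
{
  return a - b;
}

/* Small self-check contrasting the two cases.  */
static void
promotion_demo (void)
{
  assert (narrow_diff (1, 2) == -1);      /* computed in int */
  assert (wide_diff (1u, 2u) == UINT_MAX); /* wrapped around */
}
```

An array of unsigned char still saves space and covers 0..255, yet arithmetic on its elements behaves like int arithmetic.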

However, we should avoid unsigned types that are 'unsigned' or wider, as they have too many issues. I doubt whether there are practical uses of Bison with more than INT_MAX states; but if there are, we should use ptrdiff_t to count states, not int, because any application likely to exceed INT_MAX is also pretty likely to exceed UINT_MAX.

There is an area where Bison uses 'int' when it should use a wider type, presumably intmax_t. This is for line and column numbers, which come from input files and can exceed INT_MAX in some practical cases. Fixing that could be the subject of a different patch.

There are no doubt other uses of 'int' in Bison for indexes that should be changed to ptrdiff_t, for things like stack depth where Bison should not impose arbitrary limits. I think I should take a look at that next; this will probably entail improvements to the patch I proposed.
