bug-gnu-emacs

bug#24914: 24.5; isearch-regexp: wrong error message


From: Noam Postavsky
Subject: bug#24914: 24.5; isearch-regexp: wrong error message
Date: Sat, 09 Dec 2017 21:18:05 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.90 (gnu/linux)

Eli Zaretskii <address@hidden> writes:

>> I thought it would be easier to document the limit if it's fixed across
>> all machines.  Otherwise we would have to say something like "For both
>> forms, m and n, if specified, may be no larger than INT_MAX, which is
>> usually 2**31 - 1, but could be 2**63 - 1 depending on the compiler used
>> for building Emacs".
>
> Isn't int 32 bit wide everywhere?

I might have been mixing up int with long when I was thinking about
this; it seems only a few very obscure platforms have 64 bit ints.
According to [1], everywhere but "HAL Computer Systems port of Solaris
to the SPARC64" and "Classic UNICOS" has 32 bit ints.

[1]: https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models

> And anyway, since the bitmap is stored in an int, isn't INT_MAX TRT?

Unfortunately, all this discussion of int size seems to be academic.  I
took another look at the code, and there is another limit due to the
regexp opcode format: repetition counts are stored in two bytes.  We can
raise the limit to 2^16-1 though.

Here is the use of RE_DUP_MAX, which makes it seem like int-size is the
main limit:

    /* Get the next unsigned number in the uncompiled pattern.  */
    #define GET_INTERVAL_COUNT(num)                                     \
      ...
                if (RE_DUP_MAX / 10 - (RE_DUP_MAX % 10 < c - '0') < num) \
                  FREE_STACK_RETURN (REG_ESIZEBR);                      \


    static reg_errcode_t
    regex_compile (const_re_char *pattern, size_t size,
    {
      ...
                int lower_bound = 0, upper_bound = -1;
                [...]
                GET_INTERVAL_COUNT (lower_bound);
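
As an aside, the test in GET_INTERVAL_COUNT is the usual way to reject a
decimal number before `num * 10 + digit` can exceed a limit.  A minimal
standalone sketch follows; the names `SKETCH_DUP_MAX` and
`parse_interval_count` are made up for illustration and are not part of
regex.c, and the limit value is an assumption chosen to match two bytes.

```c
/* Assumed limit for this sketch only: the largest value that fits in
   two bytes.  The real RE_DUP_MAX is defined elsewhere in Emacs.  */
#define SKETCH_DUP_MAX 0xFFFF

/* Parse the decimal digits at the start of S.  Return the value, or -1
   if S does not start with a digit or the value would exceed
   SKETCH_DUP_MAX.  Hypothetical helper, not an Emacs function.  */
static int
parse_interval_count (const char *s)
{
  int num = 0;

  if (*s < '0' || *s > '9')
    return -1;

  for (; *s >= '0' && *s <= '9'; s++)
    {
      int digit = *s - '0';

      /* Same shape as the check in GET_INTERVAL_COUNT: bail out before
         num * 10 + digit is computed if it would pass the limit.  */
      if (SKETCH_DUP_MAX / 10 - (SKETCH_DUP_MAX % 10 < digit) < num)
        return -1;

      num = num * 10 + digit;
    }

  return num;
}
```

With this assumed limit, "65535" is accepted and "65536" is rejected,
and the multiplication itself never overflows an int.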

But then the count is passed to INSERT_JUMP2, which ends up storing it
in just two bytes:

                        INSERT_JUMP2 (succeed_n, laststart,
                                      b + 5 + nbytes,
                                      lower_bound);

    /* Like `STORE_JUMP2', but for inserting.  Assume `b' is the buffer end.  */
    #define INSERT_JUMP2(op, loc, to, arg) \
      insert_op2 (op, loc, (to) - (loc) - 3, arg, b)

    /* Like `insert_op1', but for two two-byte parameters ARG1 and ARG2.  */
                                      ^^^^^^^^
    static void
    insert_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2,
                unsigned char *end)
    {
      ...
      store_op2 (op, loc, arg1, arg2);
    }

    /* Like `store_op1', but for two two-byte parameters ARG1 and ARG2.  */
                                     ^^^^^^^^
    static void
    store_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2)
    {
      *loc = (unsigned char) op;
      STORE_NUMBER (loc + 1, arg1);
      STORE_NUMBER (loc + 3, arg2);
    }

    /* Store NUMBER in two contiguous bytes starting at DESTINATION.  */
                       ^^^^^^^^^^^^^^^^^^^^

    #define STORE_NUMBER(destination, number)                           \
      do {                                                              \
        (destination)[0] = (number) & 0377;                             \
        (destination)[1] = (number) >> 8;                               \
      } while (0)
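
To make the two-byte limit concrete, here is a minimal standalone sketch
of what happens when a count larger than 2^16-1 goes through a store
like the one above.  `store_number` mirrors the quoted macro;
`load_number` and `round_trip` are hypothetical helpers written only for
this illustration, and the reader assumes both bytes are treated as
unsigned (how regex.c actually reads the bytes back is not shown in this
message).

```c
/* Same operations as the STORE_NUMBER macro quoted above: low byte
   first, then the rest shifted right by 8.  The assignment to an
   unsigned char silently drops anything above the second byte.  */
static void
store_number (unsigned char *destination, int number)
{
  destination[0] = number & 0377;
  destination[1] = number >> 8;
}

/* Hypothetical reader for this sketch only: combine the two bytes,
   treating both as unsigned.  */
static int
load_number (const unsigned char *source)
{
  return source[0] | (source[1] << 8);
}

/* Store COUNT in two bytes and read it back.  */
static int
round_trip (int count)
{
  unsigned char buf[2];

  store_number (buf, count);
  return load_number (buf);
}
```

Under these assumptions, `round_trip (65535)` gives back 65535, while
`round_trip (65536)` gives 0 and `round_trip (70000)` gives 4464, so
anything above 2^16-1 is silently wrapped rather than rejected.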


Here is the updated patch:

Attachment: 0001-Raise-limit-of-regexp-repetition-Bug-24914.patch
Description: patch

