emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Questionable code in handling of wordend in the regexp engine in reg


From: Eli Zaretskii
Subject: Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c
Date: Mon, 04 Mar 2019 19:25:57 +0200

> Date: Sat, 2 Mar 2019 13:18:01 +0000
> Cc: address@hidden, address@hidden
> From: Alan Mackenzie <address@hidden>
> 
> >   eassert (NILP (BVAR (b, enable_multibyte_characters))
> >            || bytepos >= BUF_Z_BYTE (b)
> >        || CHAR_HEAD_P (BUF_FETCH_BYTE (b, bytepos)));
> 
> > IOW, this test is irrelevant in unibyte buffers.
> 
> Instead I moved the eassert to after the bit where it checks for unibyte
> buffers, giving this:
> 
> 
> 
> diff --git a/src/marker.c b/src/marker.c
> index b58051a8c2..0b2e1bf5c6 100644
> --- a/src/marker.c
> +++ b/src/marker.c
> @@ -332,6 +332,10 @@ buf_bytepos_to_charpos (struct buffer *b, ptrdiff_t 
> bytepos)
>    if (best_above == best_above_byte)
>      return bytepos;
>  
> +  /* Check bytepos is not in the middle of a character. */
> +  eassert (bytepos >= BUF_Z_BYTE (b)
> +           || CHAR_HEAD_P (BUF_FETCH_BYTE (b, bytepos)));
> +
>    best_below = BEG;
>    best_below_byte = BEG_BYTE;
>  
> 
> I now no longer see the failed easserts in make check.
> 
> So I'll commit this sometime (real life is a bit urgent right now).

I was forced to disable this assertion for now: I bootstrapped today a
clean checkout, and several jobs that run during the bootstrap
triggered the assertion.  It turns out there's one legitimate use case
when bytepos _can_ be in the middle of a multibyte sequence: when we
convert a buffer from unibyte to multibyte.  There are comments to
that effect in set_intervals_multibyte_1.

I see 2 possible ways to handle this: (1) remove the assertion for
good, or (2) change buf_bytepos_to_charpos to accept one more
argument, telling it whether to make this check, and then modify all
the callers except those in set_intervals_multibyte_1 to pass 'true'
as that argument.

Thoughts?



reply via email to

[Prev in Thread] Current Thread [Next in Thread]