[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Questionable code in handling of wordend in the regexp engine in reg
From: |
Eli Zaretskii |
Subject: |
Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c |
Date: |
Mon, 04 Mar 2019 19:25:57 +0200 |
> Date: Sat, 2 Mar 2019 13:18:01 +0000
> Cc: address@hidden, address@hidden
> From: Alan Mackenzie <address@hidden>
>
> > eassert (NILP (BVAR (b, enable_multibyte_characters))
> > || bytepos >= BUF_Z_BYTE (b)
> > || CHAR_HEAD_P (BUF_FETCH_BYTE (b, bytepos)));
>
> > IOW, this test is irrelevant in unibyte buffers.
>
> Instead I moved the eassert to after the bit where it checks for unibyte
> buffers, giving this:
>
>
>
> diff --git a/src/marker.c b/src/marker.c
> index b58051a8c2..0b2e1bf5c6 100644
> --- a/src/marker.c
> +++ b/src/marker.c
> @@ -332,6 +332,10 @@ buf_bytepos_to_charpos (struct buffer *b, ptrdiff_t
> bytepos)
> if (best_above == best_above_byte)
> return bytepos;
>
> + /* Check bytepos is not in the middle of a character. */
> + eassert (bytepos >= BUF_Z_BYTE (b)
> + || CHAR_HEAD_P (BUF_FETCH_BYTE (b, bytepos)));
> +
> best_below = BEG;
> best_below_byte = BEG_BYTE;
>
>
> I now no longer see the failed easserts in make check.
>
> So I'll commit this sometime (real life is a bit urgent right now).
I was forced to disable this assertion for now: I bootstrapped today a
clean checkout, and several jobs that run during the bootstrap
triggered the assertion. It turns out there's one legitimate use case
when bytepos _can_ be in the middle of a multibyte sequence: when we
convert a buffer from unibyte to multibyte. There are comments to
that effect in set_intervals_multibyte_1.
I see 2 possible ways to handle this: (1) remove the assertion for
good, or (2) change buf_bytepos_to_charpos to accept one more
argument, telling it whether to make this check, and then modify all
the callers except those in set_intervals_multibyte_1 to pass 'true'
as that argument.
Thoughts?
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, (continued)
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Alan Mackenzie, 2019/03/01
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Eli Zaretskii, 2019/03/01
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Alan Mackenzie, 2019/03/01
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Alan Mackenzie, 2019/03/01
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Eli Zaretskii, 2019/03/01
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Alan Mackenzie, 2019/03/02
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Eli Zaretskii, 2019/03/02
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Eli Zaretskii, 2019/03/02
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Alan Mackenzie, 2019/03/02
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Eli Zaretskii, 2019/03/02
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c,
Eli Zaretskii <=
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Alan Mackenzie, 2019/03/05
- Re: Questionable code in handling of wordend in the regexp engine in regex-emacs.c, Eli Zaretskii, 2019/03/05