[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#61269: 28.2; Sequence of spaces preceding tab in bidirectional line
From: |
Eli Zaretskii |
Subject: |
bug#61269: 28.2; Sequence of spaces preceding tab in bidirectional line |
Date: |
Sat, 04 Feb 2023 13:38:20 +0200 |
> From: Halim <mhalimln@outlook.com>
> Date: Sat, 04 Feb 2023 02:41:35 +0700
>
>
> In a left-to-right line emacs display a sequence of one or more
> spaces (U+0020), where the spaces precede a tab (U+0009) and they
> both appear between two right-to-left alphabet, to the left of the
> first (in typing order) rtl alphabet.
>
> The bug does not present when the rtl text is inside an rtl
> isolate.
>
> Let s represent space, t represet tab, l represent itself, r and
> m represent arabic alphabet. The following example have this format
> in typing order from left to right.
>
> Format:
> lsrssstm
>
> Example text:
> l ح م
>
> The expected display is 'lsrssstm', the actual is 'lssssrtm'.
> The spaces following 'r' in the format is displayed to the left
> of 'r' in the actual display. Using 'C-f' from 'r' moves the
> cursor to the left until it hits 't' where the cursor move to
> the right of 'r'.
>
> I have tried to view the file containing the buggy text in
> focuswriter and fribidi. They both display the same expected
> way.
>
> Extra Info
>
> The bug also present to ltr text on rtl line. I believe
> this is generic and is caused by this line
> '&& level != bidi_it->level_stack[0].level' (see below).
>
> The bug also present in emacs built from commit
> 'ac7ec87a7a0db887e4ae7fe9005aea517958b778' with
> --without-all. In this commit I make the following
> modification.
>
> ---------------
> $ git diff ac7ec87a7a0db887e4ae7fe9005aea517958b778
> diff --git a/src/bidi.c b/src/bidi.c
> index e012512..fe6e4d6 100644
> --- a/src/bidi.c
> +++ b/src/bidi.c
> @@ -3302,10 +3302,7 @@ bidi_level_of_next_char (struct bidi_it *bidi_it)
> if ((bidi_it->orig_type == NEUTRAL_WS
> || bidi_it->orig_type == WEAK_BN
> || bidi_isolate_fmt_char (bidi_it->orig_type))
> - && bidi_it->next_for_ws.charpos < bidi_it->charpos
> - /* If this character is already at base level, we don't need to
> - reset it, so avoid the potentially costly loop below. */
> - && level != bidi_it->level_stack[0].level)
> + && bidi_it->next_for_ws.charpos < bidi_it->charpos)
> {
> int ch;
> ptrdiff_t clen = bidi_it->ch_len;
> ---------------
>
> It fixes the bug.
Thanks.
You are right that the logic there was flawed. However, just removing
the base-level test is sub-optimal: that test was added to speed up
redisplay when the buffer has a lot of control characters (e.g.,
binary null bytes) that don't need to be reordered; see bug#22739.
So I have installed a slightly different change, reproduced below;
please see that it solves the problem, including (presumably) some
real-life problems you had in displaying RTL text with embedded TABs.
diff --git a/src/bidi.c b/src/bidi.c
index e012512..93875d2 100644
--- a/src/bidi.c
+++ b/src/bidi.c
@@ -3300,12 +3300,15 @@ bidi_level_of_next_char (struct bidi_it *bidi_it)
it belongs to a sequence of WS characters preceding a newline
or a TAB or a paragraph separator. */
if ((bidi_it->orig_type == NEUTRAL_WS
- || bidi_it->orig_type == WEAK_BN
+ || (bidi_it->orig_type == WEAK_BN
+ /* If this BN character is already at base level, we don't
+ need to consider resetting it, since I1 and I2 below
+ will not change the level, so avoid the potentially
+ costly loop below. */
+ && level != bidi_it->level_stack[0].level)
|| bidi_isolate_fmt_char (bidi_it->orig_type))
- && bidi_it->next_for_ws.charpos < bidi_it->charpos
- /* If this character is already at base level, we don't need to
- reset it, so avoid the potentially costly loop below. */
- && level != bidi_it->level_stack[0].level)
+ /* This means the informaition about WS resolution is not valid. */
+ && bidi_it->next_for_ws.charpos < bidi_it->charpos)
{
int ch;
ptrdiff_t clen = bidi_it->ch_len;
@@ -3340,7 +3343,7 @@ bidi_level_of_next_char (struct bidi_it *bidi_it)
|| bidi_it->orig_type == NEUTRAL_S
|| bidi_it->ch == '\n' || bidi_it->ch == BIDI_EOB
|| ((bidi_it->orig_type == NEUTRAL_WS
- || bidi_it->orig_type == WEAK_BN
+ || bidi_it->orig_type == WEAK_BN /* L1/Retaining */
|| bidi_isolate_fmt_char (bidi_it->orig_type)
|| bidi_explicit_dir_char (bidi_it->ch))
&& (bidi_it->next_for_ws.type == NEUTRAL_B