Re: bidi-display-reordering is now non-nil by default

From: Eli Zaretskii
Subject: Re: bidi-display-reordering is now non-nil by default
Date: Fri, 05 Aug 2011 09:40:41 +0300

> From: "Stephen J. Turnbull" <address@hidden>
> Cc: address@hidden,
>     address@hidden,
>     address@hidden
> Date: Fri, 05 Aug 2011 12:38:21 +0900
> Eli Zaretskii writes:
>  > They are not irrelevant.  What you suggest runs the risk of adding or
>  > removing LRM/RLM characters to/from a file against user
>  > expectations.
> Sure, but byte-level equality is not part of that; character-level
> equality is.

LRM is also a character, for this purpose, yes?

>  > Again, what if the user inserts another LRM?
> Insert another non-character "marker" in the buffer, using whichever
> non-character strategy we're using.

And now what happens if the user wants to search for that LRM
character she just inserted?

>  > > What's wrong with reparsing the buffer from the beginning, treating
>  > > each change of value of the direction property as insertion of the
>  > > appropriate direction mark?
>  > 
>  > Reparsing the whole buffer upon each insertion?  Is that the way to
>  > make redisplay fast and efficient?
> No, that's a proof that it's *possible*, where your words claim it's
> *im*possible.

Impossible or unacceptable -- is there really a difference in

> Making it fast is a SMOP.  You say it's beyond you, and
> that probably means it's beyond anybody competent enough in bidi to do
> the implementation.  But let's not discourage anyone from trying. ;-)

There's a saying that smart people learn from their experience, but
wise people learn from that of others.  If someone is wise and wants
to learn from my experience, please read the history and the diffs of
bug#9218.  It had to do with a flawed design of certain aspects of
bidi iteration whereby sometimes the display engine had to look from
some point in the buffer to its very end.  The result was a completely
unusable Emacs in the buffers that were hitting this design flaw
(e.g., Org Mode buffers of a few MB size).

>  > How do you indicate them, exactly?  Emacs has no features, except
>  > again text properties, to indicate something like that.  In any case,
>  > isn't it beginning to sound more and more complicated?
> Sure.  And the presence of non-graphic characters in the buffer is
> going to make other code more complicated.

Again, LRM is just a character, like ZWNJ and friends.  We need to
support such characters in files anyway.  And we already started, with
the glyphless-char-display feature.
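
To make "just a character" concrete, here is a small Python sketch (not
Emacs code; the helper name describe and the sample string are made up
for illustration).  It only queries the Unicode character database and
shows that a mark stored in the text can be found by a plain search:

```python
import unicodedata

LRM, RLM, ZWNJ = "\u200e", "\u200f", "\u200c"

def describe(ch):
    """Return (name, general category, bidi class) for one character."""
    return (unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.bidirectional(ch))

# A sample "buffer" with an explicit LRM stored in the text itself.
text = "abc" + LRM + "def"

# Because the mark is an ordinary character, an ordinary search finds it.
lrm_positions = [i for i, ch in enumerate(text) if ch == LRM]
```

All three are format characters (general category Cf); they differ only
in their bidi class.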

>  > > But if that doesn't work, I don't see how having explicit mark
>  > > characters in the buffer can work either.
>  > 
>  > Explicit marks work because the reordering algorithm does TRT with
>  > them, whether they are redundant or not.  It doesn't care.  By not
>  > caring it makes it very easy to preserve the byte stream and not risk
>  > changing it behind user's back.
> The algorithm will be the same, except that it needs to work with a
> "virtual" stream where some characters are not present in the buffer.
> This is no different from handling faces, which *could* be represented
> as characters in the buffer (and *are* in HTML, for example -- which
> of course has been deprecated in favor of CSS!  Hmm... :-).

Actually, I think dealing with "virtual" characters means at least
lots of changes in Emacs if not larger trouble.  Up to v24, Emacs
assumed that the correspondence between buffer text and text on
display is mostly 1:1.  Sure, display strings, invisible text,
variable fonts, and other display features break that to some extent,
but by and large, this was true.  Emacs 24 changes that some more due
to support of bidi.  But bidi support is _a_display_only_feature_, and
the current design sticks to that almost religiously.  Again, the need
to insert LRM/RLM etc. here and there violates the "display-only"
thing, but one could claim that this is unrelated to bidi display per
se: if we don't care about good looks in specially formatted buffers,
we can disregard this issue; the display will still be "correct" per
the UBA.

This assumption, of the basic 1:1 correspondence between buffer text
and the display, is very fundamental and affects many Emacs features
not directly related to display.  One such set of features is column
counting and the vertical scrolling and indentation features that are
based on it.  If you look under the hood, you will see that some/many
of the functions involved in the implementation of this walk buffer
text, not the display structures.  (Being dependent on display
structures means that the related features cannot work if the display
is not up to date, which is unacceptable.)  Any idea whose result is
"virtual" characters not in the buffer means a tremendous complication
in these features, for reasons that I hope are obvious.  In a
nutshell, a display-only feature will leak all over the code that
works with buffer text.  I won't argue whether this is impractical or
impossible, but I hope you will at least agree that it's undesirable.
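
A toy Python sketch of the concern (this is not how Emacs implements
column counting; column_at, column_at_with_virtual, virtual_marks and
virtual_width are all hypothetical names).  The first function needs
nothing but buffer text.  The second assumes "virtual" characters that
occupy display columns but are not in the buffer, and so needs a side
table that every such caller would have to consult:

```python
def column_at(buffer_text, pos, tab_width=8):
    """Column of POS, computed by walking buffer text from the line start."""
    line_start = buffer_text.rfind("\n", 0, pos) + 1
    col = 0
    for ch in buffer_text[line_start:pos]:
        if ch == "\t":
            col += tab_width - col % tab_width  # advance to next tab stop
        else:
            col += 1
    return col

def column_at_with_virtual(buffer_text, pos, virtual_marks,
                           virtual_width=1, tab_width=8):
    """Same walk, but with hypothetical virtual characters.

    virtual_marks is a set of buffer positions; a virtual character is
    assumed to be displayed just before the character at each position.
    """
    line_start = buffer_text.rfind("\n", 0, pos) + 1
    col = 0
    for i in range(line_start, pos):
        if i in virtual_marks:
            col += virtual_width  # width that buffer text knows nothing about
        ch = buffer_text[i]
        if ch == "\t":
            col += tab_width - col % tab_width
        else:
            col += 1
    return col
```

With an empty side table the two agree; with a single virtual character
on the line they no longer do, and the buffer text alone cannot tell you
why.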

>  > The _value_ doesn't matter.  It's the property symbol that cannot be
>  > the same in overlapping regions, unless the values are identical.
> Of course the value matters.  A 'direction property with a sequence
> value can encode the whole stack, up to 61 levels.

Then you'd need to change this value on every edit of the related

> Again, I wouldn't want to maintain that design (space-inefficiency
> and the question of consistency of neighboring regions are killers,
> I think), but there are surely lighter-weight, more efficient
> designs.

I doubt the "surely" part.

> IIUC, in XEmacs, this could easily be implemented with a zero-length
> extent with appropriate stickiness attributes.

I only know about Emacs stickiness.  With that, this idea will lead to
proliferation of characters with the "mark" value, as text around it
is added/deleted.  You will need to work hard to maintain that so that
there's only one place with that value.
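
A toy model of that proliferation, in Python rather than Lisp (the list
of (char, props) pairs, the property name MARK and the helper
insert_inheriting are all invented for this sketch; it only imitates
rear-sticky inheritance, where typed text picks up the properties of the
character before it):

```python
MARK = "mark"  # hypothetical property standing in for a direction mark

def insert_inheriting(buf, pos, ch):
    """Insert CH at POS, copying properties from the preceding character.

    This imitates rear-sticky inheritance: new text takes the properties
    of the character just before the insertion point.
    """
    inherited = dict(buf[pos - 1][1]) if pos > 0 else {}
    buf.insert(pos, (ch, inherited))

def marked_positions(buf):
    """Positions of all characters that carry the MARK property."""
    return [i for i, (_, props) in enumerate(buf) if props.get(MARK)]

# Buffer "abc" where only "b" carries the mark.
buf = [(c, {}) for c in "abc"]
buf[1] = ("b", {MARK: True})
before = marked_positions(buf)

# The user types "xyz" right after "b"; each new character inherits.
for offset, ch in enumerate("xyz"):
    insert_inheriting(buf, 2 + offset, ch)
after = marked_positions(buf)
```

One marked position before the edit becomes four after it, so some
extra code has to run on every edit to trim the property back to a
single place.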

> Thank you very much for taking the time out to explain your reasons
> for your design choices.  I have a much better grasp of the practical
> issues involved in implementing bidi in Emacsen now.

You are welcome.
