Re: bidi-display-reordering is now non-nil by default

From: Eli Zaretskii
Subject: Re: bidi-display-reordering is now non-nil by default
Date: Thu, 04 Aug 2011 20:43:13 +0300

> From: "Stephen J. Turnbull" <address@hidden>
> Cc: address@hidden,
>     address@hidden,
>     address@hidden
> Date: Fri, 05 Aug 2011 01:55:01 +0900
> Eli Zaretskii writes:
>  > Let's stick to the issue at hand: do you consider it a good idea to
>  > remove or add these characters in an otherwise unmodified buffer?
> Er, you're the one who keeps bringing up random irrelevancies like
> byte-level equality.

They are not irrelevant.  What you suggest runs the risk of adding or
removing LRM/RLM characters to/from a file against user expectations.

> And your question makes no sense.  In your implementation, they will
> be present as characters in the buffer, and they should neither be
> removed nor added per the Unicode standard.  In a text-property-based
> implementation, on input they will be converted to text properties on
> the characters controlled, and automatically converted back on
> output.  Once again, those characters will neither be removed nor
> added in the buffer.

Again, what if the user inserts another LRM?  In some positions, the
LRM does not change the directionality of the surrounding text, so
your text properties will be identical with or without it.  Then on
output to disk, this LRM will be lost.
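
To make that scenario concrete, here is a toy Python sketch.  It is not
Emacs code and not the Unicode Bidirectional Algorithm: it assumes a
deliberately simplified rule in which every character takes the
direction of the most recent strong character or mark.  The names
`to_properties' and `to_text' are hypothetical, chosen only for this
illustration.

```python
import unicodedata

LRM, RLM = "\u200e", "\u200f"

def strong_dir(ch):
    """Return 'ltr'/'rtl' for strong characters (including LRM/RLM), else None."""
    cls = unicodedata.bidirectional(ch)
    if cls == "L":
        return "ltr"
    if cls in ("R", "AL"):
        return "rtl"
    return None

def to_properties(text):
    """Input side: drop the marks, keep a per-character direction property.

    Toy rule (an assumption, far simpler than the real algorithm): each
    character gets the direction of the last strong character or mark seen.
    """
    chars, dirs = [], []
    current = "ltr"
    for ch in text:
        d = strong_dir(ch)
        if d:
            current = d
        if ch in (LRM, RLM):
            continue  # the mark itself is not stored in the buffer
        chars.append(ch)
        dirs.append(current)
    return "".join(chars), dirs

def to_text(chars, dirs):
    """Output side: re-create a mark wherever the property changes and the
    character itself does not already explain the change."""
    out, prev = [], "ltr"
    for ch, d in zip(chars, dirs):
        if d != prev and strong_dir(ch) != d:
            out.append(LRM if d == "ltr" else RLM)
        out.append(ch)
        prev = d
    return "".join(out)

with_lrm = "abc" + LRM + "def"      # LRM typed inside an LTR run: redundant
without_lrm = "abcdef"
meaningful = "abc" + RLM + "!"      # here the mark changes the result

print(to_properties(with_lrm) == to_properties(without_lrm))
print(to_text(*to_properties(with_lrm)) == with_lrm)
print(to_text(*to_properties(meaningful)) == meaningful)
```

Under these assumptions the mark that affects direction survives the
round trip, while the redundant LRM leaves no trace in the properties
and so cannot be written back.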

>  > > I see no reason why a text-property-based implementation should be
>  > > lossy.
>  > 
>  > Because the user could type directional controls, and there's no way
>  > for Emacs to know at all levels which one is to be treated in which
>  > way.
> What's wrong with reparsing the buffer from the beginning, treating
> each change of value of the direction property as insertion of the
> appropriate direction mark?

Reparsing the whole buffer upon each insertion?  Is that the way to
make redisplay fast and efficient?

> If there are redundant marks, of course they would have to be
> indicated in some way.

How do you indicate them, exactly?  Emacs has no features, except
again text properties, to indicate something like that.  In any case,
isn't it beginning to sound more and more complicated?

> But if that doesn't work, I don't see how having explicit mark
> characters in the buffer can work either.

Explicit marks work because the reordering algorithm does TRT with
them, whether they are redundant or not.  It doesn't care.  By not
caring it makes it very easy to preserve the byte stream and not risk
changing it behind the user's back.

These are exactly the considerations that convinced me long ago that
leaving the explicit marks is the only reasonably safe and
uncomplicated way of doing this.

>  > > I don't understand.  If `put-text-property' and friends don't get that
>  > > right already, more than bidi is in trouble, I should think.  What's
>  > > special about bidi?
>  > 
>  > What is special is the fact that bidi needs nested regions with
>  > different values for the same property.  Normally, if you put a
>  > property with a value on a portion of text that has another value for
>  > that property, the new value replaces the old one.
> Sure.  But this is Lisp.  There's nothing that says that you are
> limited to something as simple as 'ltr vs. 'rtl as the property value.
> You could have a rather complex property, eg, containing the level of
> the embedding as well as the resolved direction.

The _value_ doesn't matter.  It's the property symbol that cannot be
the same in overlapping regions, unless the values are identical.
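
The replacement behavior can be illustrated with a small Python model.
This is only a sketch under stated assumptions: the buffer is a list of
per-character property dictionaries, `put_text_property' here is a
hypothetical stand-in that mimics the overwrite semantics described
above, and the property name "bidi" is made up for the example.

```python
def put_text_property(props, start, end, name, value):
    """Set NAME to VALUE on [start, end); any older value is replaced."""
    for i in range(start, end):
        props[i][name] = value

def runs(props, name):
    """Collapse per-character values of NAME into (start, end, value) runs."""
    out = []
    for i, p in enumerate(props):
        v = p.get(name)
        if out and out[-1][2] == v:
            out[-1] = (out[-1][0], i + 1, v)  # extend the current run
        else:
            out.append((i, i + 1, v))
    return out

# Case 1: one outer RTL region with an LTR region nested inside it.
nested_props = [{} for _ in range(10)]
put_text_property(nested_props, 0, 10, "bidi", "rtl")
put_text_property(nested_props, 3, 6, "bidi", "ltr")
nested = runs(nested_props, "bidi")

# Case 2: three independent sibling regions, no nesting at all.
flat_props = [{} for _ in range(10)]
put_text_property(flat_props, 0, 3, "bidi", "rtl")
put_text_property(flat_props, 3, 6, "bidi", "ltr")
put_text_property(flat_props, 6, 10, "bidi", "rtl")
flat = runs(flat_props, "bidi")

print(nested)
print(nested == flat)
```

In this model both cases leave the same three runs behind, so nothing
records that the last run was once part of an enclosing region.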

> Or you could simply replace the directional marks with a string on
> the preceding non-mark character containing the mark characters that
> were present in the source.

And then move that string when text is inserted after the preceding
non-mark character, or that character is deleted, yes?  Sounds like

This horse was beaten to death years ago, and it is still dead,
believe me.  Perhaps someone brilliant could come up with some
elaborate scheme which would somehow solve all these difficulties and
plug all the holes, in theory.  But we have such a simple and natural
alternative that it simply makes no good engineering sense whatsoever
to try to go this way, even if one hopes (and I don't) they will find
a bulletproof solution.

Using explicit marks does have its drawbacks, but they are minor and
mostly just take some getting used to.  I'm the last one to disregard
usability considerations, but I'm quite sure users won't be anywhere
near as annoyed as Lars and you fear.

In fact, I already have an experiment under way: one of the Emacs
modes already inserts explicit directional characters to tidy the
display; let's see when the first user complaint about that will
