Re: bidi-display-reordering is now non-nil by default

From: Eli Zaretskii
Subject: Re: bidi-display-reordering is now non-nil by default
Date: Thu, 18 Aug 2011 11:21:03 +0300

> From: Chong Yidong <address@hidden>
> Date: Wed, 17 Aug 2011 18:32:46 -0400
> Cc: address@hidden, address@hidden
> Eli Zaretskii <address@hidden> writes:
> > I'm afraid making the reordering engine aware of all text properties
> > will considerably slow down redisplay, due to the need to check
> > character properties very frequently.  It also runs a high risk of
> > completely blending the reordering code with the display engine, which
> > will make them both very hard to maintain; currently, they are clearly
> > separated.
> No, the lookup would be done at the redisplay engine level, not the
> reordering engine level: add a new entry in it_props[] for handling a
> (say) `bidi-override' text property.  Emacs would process this during
> the step in redisplay where it handles other properties (like faces and
> invisibility), and record the information into the iterator.  The bidi
> code would take it from there.

This won't work, not with the way the reordering engine is currently
integrated with redisplay.  The reason is that above the reordering
level, the iteration through buffer text is non-linear.  Your
suggestion assumes that the redisplay iterator will bump into this new
text property _before_ it processes the text which follows it.  But
this assumption is false because of the non-linear scanning of the
buffer text.

Let me show an example to illustrate how the bidirectional display
handles text properties.  Suppose you have the following buffer text
(as usual, capital letters mean R2L characters):

   000000111110000
   abcde ABCDE xyz
         ^    ^

The number above each character shows the text properties of that
character; 0 means no properties, 1 means some specific property.
This example shows only one property, spanning only the R2L
characters; real-life examples can be much more complex.  The '^'
characters below show the "stop positions" computed by the iterator --
those are the buffer positions where the display engine should process
the text property by calling one or more handlers in the it_props[]
array, filling the iterator with the attributes necessary for
displaying the text until the next "stop position".

To move from the blank character between `e' and `A' to the next
character in visual order, the display iterator calls the reordering
engine.  When it does that, the first (leftmost) "stop position" was
not yet acted upon, because the current iterator position is smaller
than that stop.  When the call to the reordering engine returns, it
sets the iterator position at `E', since the ABCDE part should be
displayed as EDCBA on the screen.  Oops!  We just missed the "stop
position".  What happens next is that the redisplay engine realizes
that the stop position was missed, so it scans back to find the last "stop
position" preceding `E' (since there could be other text properties or
overlays in-between), and then handles it using the handlers in
it_props[]; see handle_stop_backwards for how this is done.  Then it
can deliver `E' with the right attributes, and continue delivering all
the successive characters, until it crosses some "stop position"
again, either going forward or backward.
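
The sequence above can be sketched as a toy model.  The Python below is
not Emacs code and does not reproduce the real iterator: every name in
it (stop_positions, visual_order, deliver) is made up for illustration,
and "reordering" is reduced to reversing each run of capital letters.
It only shows where a stop position is met head-on and where it is
jumped over and has to be found by scanning back.

```python
# Toy model of the behaviour described above.  Not Emacs code: every
# name here is hypothetical and exists only for illustration.

TEXT = "abcde ABCDE xyz"
PROPS = [0] * 6 + [1] * 5 + [0] * 4   # one property value per character


def stop_positions(props):
    """Positions where the property value changes, plus position 0."""
    return [i for i in range(len(props)) if i == 0 or props[i] != props[i - 1]]


def visual_order(text):
    """Crude stand-in for reordering: reverse each run of capitals."""
    order, i = [], 0
    while i < len(text):
        j = i
        while j < len(text) and text[j].isupper():
            j += 1
        if j > i:
            order.extend(range(j - 1, i - 1, -1))  # R2L run, last char first
            i = j
        else:
            order.append(i)
            i += 1
    return order


def deliver(text, props):
    """Walk the text in visual order, handling stops as they are met."""
    stops = stop_positions(props)
    handled = None          # stop whose properties are currently in effect
    shown, events = [], []
    for pos in visual_order(text):
        governing = max(s for s in stops if s <= pos)
        if governing != handled:
            # Landing exactly on a stop is the normal case; landing past
            # one means it was jumped over and is found by scanning back.
            how = "at stop" if pos == governing else "scanned back"
            events.append((pos, governing, how))
            handled = governing
        shown.append((text[pos], props[handled]))
    return shown, events


shown, events = deliver(TEXT, PROPS)
print("".join(ch for ch, _ in shown))
for event in events:
    print(event)
```

In this model the jump from the blank to `E' is the only event marked
"scanned back": the stop at `A' is handled only after the run ABCDE has
already been reordered, which is the ordering problem described above.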

This is why it won't work to control reordering with text properties:
by the time the redisplay engine realizes that there's another text
property to apply, a crucial part of reordering has already happened.
The bidi_it structure that is part of the iterator already has all the
information about reordering of "ABCDE", having scanned it all inside
a single call to bidi_move_to_visually_next.  That scan entirely
ignores all text properties except one: the `display' property, and
then only if its value will cause the covered text to be replaced by
something else, like an image or a string.

It would be possible, of course, to have the handler of the
`bidi-override' property toss all the reordering information,
reposition to before `A', and start anew.  But that's a terrible waste
of cycles, especially if the text covered by that property is not so
short.  The waste lies not only in throwing away information we
already gathered at some cost, but also in the fact that
repositioning the iterator to an arbitrary place means we need to
restart the bidi iteration from the beginning of the line in order to
have the correct state of the bidi iterator needed to continue from
that place; see get_visually_first_element for the details.
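
To make the waste concrete, here is a rough count under an assumed
cost model of one "visit" per character examined.  The function name
scan_cost and the cost model itself are assumptions made for this
sketch; real redisplay costs depend on much more than character counts.

```python
def scan_cost(run_start, run_end, restart_on_property):
    """Count character visits needed to reorder text[run_start:run_end].

    Toy cost model (an assumption): one visit per character examined.
    """
    # First pass: the reordering scan walks the whole run in one call.
    visits = run_end - run_start
    if restart_on_property:
        # The gathered state is discarded.  To rebuild it, iteration
        # restarts at the beginning of the line and walks up to the run...
        visits += run_start
        # ...and then scans the run again under the new rules.
        visits += run_end - run_start
    return visits


# "abcde ABCDE xyz": the R2L run occupies positions 6 through 10.
print(scan_cost(6, 11, False))  # prints 5
print(scan_cost(6, 11, True))   # prints 16
```

Even in this tiny line the restart more than triples the count, and
the extra cost grows with both the length of the run and its distance
from the beginning of the line.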

> >> Then it should be easy to exploit font-lock to give reasonably correct
> >> bidi segmentation, e.g. by treating font-lock-comment-face and
> >> font-lock-string-face boundaries as bidi segmentation boundaries.
> >
> > We should be very careful with reusing font-lock as a basis for
> > reordering, because the user has too many knobs to control font-lock.
> > For example, a few of the font-lock features speed up redisplay by
> > deferring fontification to a later time.  With font-lock, this just
> > displays text in the default face; with reordering, it will flash
> > incorrectly rendered text for a perceptible amount of time.  I'm not
> > sure it's a good idea.
> The fundamental issue is that correctly segmenting source code requires
> knowledge of the underlying syntax.  Sure, it's possible to come up with
> some hacks that "mostly" work, but font lock is already there, so we
> ought to try to use it first.

Font-lock just uses regexps and syntax tables.  Everything else in
font-lock is meant to avoid the annoyingly long delay it takes to
fully fontify a large buffer.  What I'm saying is that, apart from
using regexps and syntax tables, the considerations and trade-offs
that are valid for font-lock are not necessarily valid for
bidirectional display.

> For this reason, I'm not concerned about the deferred
> fontification issue: if you want Emacs to segment properly, you'd want
> it to do an amount of work equivalent to font-lock anyway.

Amount of work is the least of my concerns in this regard.  I'm
worried about the effect the temporarily incorrect display will have
on users.
