Subject: bug#27525: 25.1; Line wrapping of bidi paragraphs
Date: Fri, 21 Jul 2017 12:44:40 +0300
> From: Itai Berli <address@hidden>
> Date: Fri, 21 Jul 2017 09:19:25 +0300
> Now that I have downloaded the source code, I'd like to take a look at this problem first hand. I'm not a
> programmer, not even an amateur one, but I can sometimes make sense of the general gist of code when I
> read it, and I'd like to take a look at the part of code that's responsible for the present bug, maybe put a
> breakpoint here and there and give it a test run to get a feel of how it works, and why it misses the mark when
> it comes to line wrapping bidi paragraphs.
> Could you please give me some pointers: what files should I look into, what functions should I read, possibly
> even suggestions for where to put breakpoints and which variables to watch. I'm not asking for a
> comprehensive and detailed run down of this feature; just a starting point(s). Every tip and suggestion will be
The relevant files are bidi.c and xdisp.c. There's a long comment at
the beginning of xdisp.c, whose last parts deal with how the bidi
reordering is incorporated into the display engine, and a long comment
at the beginning of bidi.c that has more details about the reordering
Note that this is not an implementation bug, it's a consequence of how
the bidi reordering engine's integration with the rest of the display
code was designed: we reorder text for display _before_ making the
layout decisions. IOW, the layout layer of the display engine is fed
characters in _visual_ order, already reordered by bidi.c functions
which the layout layer calls when it needs another character. The
advantage of this design is that the display engine knows almost
nothing about reordering: it doesn't care about resolved levels
etc., because all of that has already been taken care of.
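As a toy illustration of this design (not Emacs code: the function names, the hard-coded levels, and the convention that uppercase letters stand for right-to-left characters are all assumptions made for this sketch), the layout step below pulls characters from an iterator that has already reordered the whole paragraph, so lines are broken in visual order:

```python
def visual_order_iter(text, levels):
    """Yield the characters of a paragraph in visual order.

    Hypothetical stand-in for the reordering iterator; it handles only
    levels 0 and 1 by reversing each maximal run of level-1 characters.
    """
    i, n = 0, len(text)
    while i < n:
        if levels[i] == 1:
            j = i
            while j < n and levels[j] == 1:
                j += 1
            # Emit the right-to-left run backwards.
            for k in range(j - 1, i - 1, -1):
                yield text[k]
            i = j
        else:
            yield text[i]
            i += 1


def layout(chars, width):
    """Toy layout layer: fill fixed-width screen lines with whatever it is fed."""
    lines, current = [], []
    for ch in chars:
        current.append(ch)
        if len(current) == width:
            lines.append("".join(current))
            current = []
    if current:
        lines.append("".join(current))
    return lines


# Logical order; ONE and TWO (and the space between them) are at level 1.
text = "ab ONE TWO cd"
levels = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

print(layout(visual_order_iter(text, levels), 7))  # ['ab OWT ', 'ENO cd']
```

The first screen line ends up holding TWO, which comes later in logical order than ONE. That is the wrapping behavior this bug report is about.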
To make line-wrapping do what the UBA describes, we would need to feed
the display engine with characters in logical order, but record with
each character its resolved bidi level, resulting from partial
processing by bidi.c. Then, when a line is completely laid out, we'd
need to reorder the glyphs prepared for that line according to UBA
rules L1, L2, and L4, using the resolved levels recorded by bidi.c
code. (L3 is tricky, because combining marks are applied when
producing glyphs, so it has to be solved by "some other method".)
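A minimal sketch of what such a per-line reordering step could look like, assuming each character of a laid-out line carries its resolved level. The function name and the sample levels are hypothetical; only the trailing-whitespace part of L1 and rule L2 are shown, and L3 and L4 are left out:

```python
def reorder_line(chars, levels, base_level=0):
    """Reorder one laid-out line from logical to visual order.

    chars  -- characters (stand-ins for glyphs) of one screen line, logical order
    levels -- resolved embedding level recorded for each character
    """
    levels = list(levels)
    # Rule L1 (only the part shown here): trailing whitespace on the line
    # reverts to the paragraph embedding level.
    i = len(chars) - 1
    while i >= 0 and chars[i].isspace():
        levels[i] = base_level
        i -= 1

    order = list(range(len(chars)))
    if not order:
        return ""
    highest = max(levels)
    odd_levels = [lvl for lvl in levels if lvl % 2 == 1]
    lowest_odd = min(odd_levels) if odd_levels else highest + 1
    # Rule L2: from the highest level down to the lowest odd level,
    # reverse every maximal run of characters at that level or higher.
    for level in range(highest, lowest_odd - 1, -1):
        start = 0
        while start < len(order):
            if levels[order[start]] >= level:
                end = start
                while end < len(order) and levels[order[end]] >= level:
                    end += 1
                order[start:end] = reversed(order[start:end])
                start = end
            else:
                start += 1
    return "".join(chars[k] for k in order)


# Break lines in logical order first, then reorder each line separately.
text = "ab ONE TWO cd"          # uppercase stands for right-to-left characters
levels = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
width = 7
lines = [reorder_line(text[i:i + width], levels[i:i + width])
         for i in range(0, len(text), width)]
print(lines)  # ['ab ENO ', 'OWT cd']
```

Here the first screen line holds ONE, the word that comes first in logical order, which is what the UBA asks for.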
The above means we need to redesign the interface between xdisp.c and
bidi.c, and then rewrite the current reordering function into
something that will work on the glyphs of a laid-out line.
That in itself is more or less straightforward refactoring of the
existing code, but unfortunately it isn't the scary part of the job.
The scary part is all the subtleties of the Emacs display engine and
the features it provides, when bidirectional text is involved. For
example, many places need to calculate layout metrics without
displaying anything. A typical example is vertical-motion when
line-move-visual is in effect -- it needs to determine what buffer
position is displayed one screen line up or down from a given
character. Another example is how we process a mouse click, which
starts by determining which buffer position (more accurately, which
offset of what object) is displayed at given pixel coordinates.
These places use functions that "simulate" display -- they perform all
the layout calculations, but don't create glyphs (because nothing
needs to be displayed). Since glyphs are not created, the "line" to
be displayed doesn't exist, and thus the reordering step will have
nothing to work on. Whoever works on fixing line-wrapping will
have to figure out how to solve this problem in a way that is
compatible with the 2nd sentence of the UBA's section 3.4. There are
many complications in this part of the display code, because
oftentimes Emacs ends the display "simulation" before reaching the end
of the line, and sometimes even starts it in the middle of a line.
All this needs to be figured out for a design in which reordering
needs to see a full screen line, and implemented in a way that
doesn't hurt performance in any significant way.
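To see why a partial "simulation" is a problem once reordering works on whole lines, consider this toy sketch. The function names are hypothetical and only rule L2 is modeled. It maps logical positions on one line to visual columns, and shows that the answer changes when only a prefix of the line has been scanned:

```python
def visual_columns(levels):
    """Return the visual column of each logical position on one line.

    Rule L2, simplified: for each level from the highest down to 1,
    reverse every maximal run of positions at that level or higher.
    """
    order = list(range(len(levels)))
    for level in range(max(levels, default=0), 0, -1):
        start = 0
        while start < len(order):
            end = start
            while end < len(order) and levels[order[end]] >= level:
                end += 1
            if end > start:
                order[start:end] = order[start:end][::-1]
                start = end
            else:
                start += 1
    columns = [0] * len(levels)
    for column, logical in enumerate(order):
        columns[logical] = column
    return columns


def logical_index_at(levels, column):
    """Toy hit test: which logical position is shown at this column?"""
    return visual_columns(levels).index(column)


# A 7-character line that is entirely right-to-left (level 1).
full_line = [1] * 7
print(visual_columns(full_line)[0])      # 6: first character is at the far right
print(visual_columns(full_line[:3])[0])  # 2: a 3-character prefix gives another answer
print(logical_index_at(full_line, 0))    # 6: column 0 shows the last character
```

The column of the first character is not known until the end of the line has been seen, so stopping the simulation early, or starting it mid-line, yields a different result.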
Then there are complications with invisible text: the 'invisible' text
property can start and/or end in the middle of a non-base embedding
level, and the question is how to produce the result that the user
expects, when some of the characters that affect reordering are
effectively hidden from the reordering code, because the invisible
text is simply skipped and never fed to the layout layer. (With the
current design, reordering is done before the text invisibility is
considered, so the result is quite naturally the expected one.)
Similar problems arise with display properties and overlays which hide
portions of buffer text, optionally replacing them with some other
text or image -- the reordering step will somehow need to avoid
reordering the text of a display string as if it were part of the
surrounding buffer text, because that's not what the user expects.
Another complication is where glyph production and layout decisions
are mixed with bidi level resolution. One such situation is how we
implement the display property of the form '(space :align-to HPOS)'
which is treated as a paragraph separator for the purposes of bidi
reordering (thus supporting display of tables with bidirectional
text). If we separate reordering from level resolution, this will
have to be rethought if not reimplemented.
And I'm quite sure there are other complications that I forget. This
is what took the lion's share of the work on making the display engine
bidi-aware (because the basic reordering engine which is now bidi.c
was written and debugged, as a stand-alone program, 15 years ago).
Whoever works on fixing the line-wrapping issue will have to do at
least part of that anew. I surely hope a motivated individual will
step forward for the job at some point, but they need to know what
they will face.