Re: [emacs-bidi] Re: improving bidi documents display

From: Martin J. Dürst
Subject: Re: [emacs-bidi] Re: improving bidi documents display
Date: Wed, 02 Mar 2011 11:09:42 +0900
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv: Gecko/20100722 Eudora/3.0.4

Hello Eli,

On 2011/02/28 6:15, Eli Zaretskii wrote:
From: Michael Welsh Duggan<address@hidden>
Cc: Eli Osherovich<address@hidden>,  address@hidden, address@hidden
Date: Sun, 27 Feb 2011 05:01:25 -0500

Memory:  HEBREW \foo{english}
Levels:  11111111222222222221
Display: {foo{english\ WERBEH

The key to a useful discussion of these matters is to decide up front
what we want to support and what we want the text to look like.

In this case, someone who knows about (La)TeX much more than I do
should first describe what TeX features would be useful when
typesetting bidirectional text.

With that knowledge in hand, we could then think whether the example
above is at all practical.  For example, most of the problems go away
if paragraphs have left-to-right direction; in that case the display
will be

    WERBEH \foo{english}

Maybe this is already good enough.

In some cases, it will be good enough. But if this is a word or two in a Hebrew paragraph, it will probably be awkward to read.
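
The reordering in the example above follows from the bidi classes of the characters involved. As a minimal sketch (using Python's standard `unicodedata` module rather than anything from Emacs; the helper name `bidi_classes` is made up for illustration), one can list the class of each character in a string:

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs for every character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Hebrew alef, a space, then a short TeX-like command.
sample = "\u05d0 \\foo{x}"
for ch, cls in bidi_classes(sample):
    print(repr(ch), cls)
```

The backslash and both braces come out as `ON` (other neutral), the Latin letters as `L`, and the Hebrew letter as `R`. Neutrals take their direction from their surroundings, which is why the TeX syntax characters end up separated from the command name in a right-to-left paragraph.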

One simple, naive way of handling this for the various TeXs is to
consider all backslashes and brace characters as L characters.  This can
be simulated by surrounding each run of these characters by LRE PDF
pairs.  However, unless TeX ignores these characters completely, these
formatting characters would have to be removed before being processed by

Again, someone who knows should tell if the bidi formatting codes need
to be removed before TeX'ing the file.
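
The LRE/PDF idea can be sketched outside Emacs as plain string processing. The Python below is only an illustration under stated assumptions: the function names `wrap_tex_syntax` and `strip_lre_pdf` are hypothetical, "TeX syntax characters" is taken to mean just backslash and braces, and the stripping step would also delete any LRE or PDF the author inserted deliberately.

```python
import re

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# A run of one or more backslash or brace characters.
TEX_SYNTAX_RUN = re.compile(r"[\\{}]+")

def wrap_tex_syntax(text):
    """Surround each run of backslashes/braces with an LRE ... PDF pair."""
    return TEX_SYNTAX_RUN.sub(lambda m: LRE + m.group(0) + PDF, text)

def strip_lre_pdf(text):
    """Remove every LRE and PDF character, e.g. before handing text to TeX."""
    return text.replace(LRE, "").replace(PDF, "")

source = "\\foo{english}"
wrapped = wrap_tex_syntax(source)
assert strip_lre_pdf(wrapped) == source  # round trip restores the input
```

Whether the stripping step is needed at all depends on how the TeX engine in use treats these formatting characters, which is the open question raised above.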

Another way of handling this would be to redefine the backslash and
brace characters as L characters, for purposes of the display engine.
Currently, I don't know if there is a way to do this in elisp.  bidi.c
seems to use a character table named bidi_type_table to hold this
information.  Currently this table is not exposed at the elisp layer, to
the best of my knowledge.  Maybe it would be possible to modify this
table in elisp, and possibly make it buffer local?

I didn't expose the table to Lisp on purpose: messing with
bidirectional properties of characters is asking for trouble.  At
best, you will get text that will look different in any other editor;
at worst, you could easily crash Emacs.

Getting text to look better than in another editor would be a good idea. Crashing Emacs would be bad, but that would reveal a bug, wouldn't it? Anyway, if at all, setting bidi properties of characters would have to be done on a buffer-by-buffer (or mode-by-mode) level, not once and for all for a running instance. Even then, it will only take care of very local phenomena (which may not work for multiple-level embeddings), and it will only work one way for the whole buffer (which may not work if there are paragraphs of varying directionality).

Another idea would be to allow a text property to override the character

Overlay, not text property.  The latter modifies the buffer, which is
not what you want in this case.

Just a factual question: What does it mean when you say that properties modify the buffer? For example, I'd expect that "modifies the buffer" means that these modifications get saved when the buffer gets saved, but there are lots of properties for which I have no idea how they would get saved when the text is saved as plain text (as is usual for Emacs).

This feels like a very elegant, emacs-ish way to do things, but
an uneducated glance at the bidi code makes me feel like it would be
difficult to get information about text properties into this layer.

You are looking at this from the wrong perspective.  The bidi reordering
engine doesn't need to access text properties or overlays; rather, the
display code should tell the reordering engine what to reorder.  The
reordering code already honors point-min and point-max, so all it
takes to do what you want is narrow the buffer to the portion of text
we want to reorder.  These portions could be marked by an overlay; the
display code already examines overlays as it goes about its job.

I think this would work for simple cases, but for more complex cases (e.g., several hierarchical levels of embeddings), it's impossible to set three different levels of point-min and point-max.
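
To illustrate why nested embeddings need more than one flat region, here is a minimal Python sketch of how explicit embedding levels stack up under LRE, RLE, and PDF. It follows the Unicode Bidirectional Algorithm rule that LRE moves to the next greater even level and RLE to the next greater odd level, but it is a simplification: overrides, isolates, and the maximum depth limit are ignored. The function name `explicit_levels` is hypothetical, and this is not the code in bidi.c.

```python
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

def explicit_levels(text, base_level=0):
    """Return (char, level) pairs, tracking nested LRE/RLE embeddings."""
    stack = []          # levels to restore on PDF
    level = base_level
    result = []
    for ch in text:
        if ch == LRE:
            stack.append(level)
            # next greater even level
            level = level + 2 if level % 2 == 0 else level + 1
        elif ch == RLE:
            stack.append(level)
            # next greater odd level
            level = level + 1 if level % 2 == 0 else level + 2
        elif ch == PDF:
            if stack:   # an unmatched PDF is ignored
                level = stack.pop()
        else:
            result.append((ch, level))
    return result

nested = "a" + LRE + "b" + RLE + "c" + PDF + "d" + PDF + "e"
print(explicit_levels(nested, base_level=1))
# [('a', 1), ('b', 2), ('c', 3), ('d', 2), ('e', 1)]
```

Each nesting depth needs its own saved level, so the structure is a stack, which a single pair of point-min and point-max bounds cannot express.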

Another idea would be to use display strings including the LRE and PDF
characters to replace existing backslashes and braces.

This is similar to what we are doing, although we leave the syntactic characters (backslashes and braces for LaTeX) displayed as is, and insert display strings with Bidi control characters before and after.

However, display
strings do not affect the bidi algorithm at this point.

I need a few rainy days to implement support for display strings.

Let's hope for some rain in your area :-).

However, it would be a mistake to base large portions of buffer
display on display strings, because they make redisplay too expensive.

What are 'large portions of buffer'? Our current implementation restricts its work to the portion of the buffer that is currently actually displayed. This means that if you have a 1MB file and a 100x100 character display, only 1% of the buffer actually has overlays (but it may have quite a few).
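
The 1% figure is simple arithmetic, sketched here in Python under the assumption that 1MB means 1,000,000 single-byte characters and that every display cell shows one buffer character (the helper name `visible_fraction` is made up for illustration):

```python
def visible_fraction(rows, cols, buffer_chars):
    """Fraction of the buffer covered by a rows x cols character window."""
    return min(rows * cols, buffer_chars) / buffer_chars

print(visible_fraction(100, 100, 1_000_000))  # 0.01
```

With multibyte text or short lines the visible share would be smaller still.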

Regards,    Martin.

#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:address@hidden
