Re: [emacs-bidi] Suboptimal display-reordering in minibuffer


From: Martin J. Dürst
Subject: Re: [emacs-bidi] Suboptimal display-reordering in minibuffer
Date: Fri, 02 Jul 2010 10:04:35 +0900
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.4pre) Gecko/20091214 Eudora/3.0b4

Hello Eli,

On 2010/07/02 2:45, Eli Zaretskii wrote:
Date: Thu, 01 Jul 2010 15:37:35 +0900
From: "Martin J. Dürst"<address@hidden>
CC: address@hidden, address@hidden

Hello Eli,

On 2010/07/01 12:14, Eli Zaretskii wrote:

You are suggesting to insert bidirectional format characters into the
buffer text in order to affect the display.  That's a no-no, IMO:

I agree that that's a no-no, for the reasons you give below. But that's
not what I was suggesting or thinking about.

Sorry.

No problem. I should have been clearer.

  What I was suggesting
(actually, the idea is originally from Kenichi Handa and/or Naoto
Takahashi) is that these bidirectional formatting characters go into the
text only 'virtually', e.g. in the before-string or after-string
properties of an overlay (see
http://www.gnu.org/software/emacs/elisp/html_node/Overlay-Properties.html#Overlay-Properties).
That way, in my understanding, they are not part of the buffer text,
and will not be saved when saving the file.
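
A minimal sketch of this idea, modeled in Python rather than Emacs Lisp. The `Overlay` class and `display_text` function are hypothetical stand-ins, not Emacs APIs, and the ordering of strings at a shared position is an assumption of the sketch. It only shows that the formatting characters reach the text seen by the display engine while the buffer text (what gets saved) stays unchanged.

```python
from dataclasses import dataclass

LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


@dataclass
class Overlay:
    """Hypothetical stand-in for an overlay covering [start, end)."""
    start: int
    end: int
    before_string: str = ""
    after_string: str = ""


def display_text(buffer_text, overlays):
    """Return the text as a display engine might see it (sketch only)."""
    pieces = []
    for pos in range(len(buffer_text) + 1):
        # Assumption: overlays ending here emit their after-string first,
        # then overlays starting here emit their before-string.
        for ov in overlays:
            if ov.end == pos:
                pieces.append(ov.after_string)
        for ov in overlays:
            if ov.start == pos:
                pieces.append(ov.before_string)
        if pos < len(buffer_text):
            pieces.append(buffer_text[pos])
    return "".join(pieces)


# A C-like assignment whose string constant contains four Hebrew letters.
buffer_text = 'x = "\u05E9\u05DC\u05D5\u05DD";'
overlay = Overlay(start=4, end=10, before_string=LRE, after_string=PDF)
shown = display_text(buffer_text, [overlay])
```

Here `shown` wraps the quoted string constant in LRE ... PDF, while `buffer_text` itself contains no formatting characters.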

Got it.

Of course, if the characters in the overlay properties before-string and
after-string are not currently taken into account when running the bidi
algorithm, then that approach may not work very easily.

You are right: they aren't taken into account.  I have yet to code
support for reordering text in display strings.  To add this feature,
I will need to solve quite a few problems.  Until I do, I won't know
whether what you suggest is even doable with a reasonable effort.

I also think that, even if doable, this is a somewhat hackish
solution.

One thing that we should think about is what people want to happen if there is actual displayable text in some of these strings. I don't have much of an idea where this is used, but I can imagine that at least in some usage scenarios, one might want the text added via an overlay to be rendered in exactly the same way as the text in the buffer. In that case, it's about user requirements, even if the solution might involve some hacks.

I think having a special text property that covers the text
that needs to be reordered is a cleaner solution.

It's definitely also a viable solution, although there also might be some tricky issues. Say you have a property defining an embedding from characters 10 to 30, and another such property from characters 20 to 40. What exactly is that supposed to mean?
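
To make the difficulty concrete, here is a small Python sketch (the function name `relation` is hypothetical) that classifies two property ranges. Bidi embeddings have to nest properly, so the "crossing" case is the one with no obvious meaning.

```python
def relation(a, b):
    """Classify two half-open ranges (start, end).

    Returns 'disjoint', 'nested' or 'crossing'.
    """
    # Put the range that starts first (or, on a tie, the longer one) first.
    (a_start, a_end), (b_start, b_end) = sorted(
        [a, b], key=lambda r: (r[0], -r[1])
    )
    if a_end <= b_start:
        return "disjoint"
    if b_end <= a_end:
        return "nested"
    return "crossing"


print(relation((10, 30), (20, 40)))  # the example above
print(relation((10, 40), (20, 30)))
```

The example from the paragraph above, 10 to 30 and 20 to 40, falls into the crossing case; 10 to 40 and 20 to 30 would nest cleanly.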

In any case, I think it's better to use the concepts already available in
the Unicode Bidi algorithm (override, embedding, marks) for improving
the display of XML, HTML, and other structured data and program source,
rather than to invent completely new concepts. Whether these concepts
then get transferred to the bidi algorithm via the (faked) insertion of
characters or via some other way (one could imagine having properties
such as LRO/RLO/LRE/RLE on overlays,...) may be a secondary issue.

I think the upcoming Unicode 6.0 is already headed in that direction.
See http://www.unicode.org/reports/tr9/proposed.html#HL1.  The text
above this explicitly says that these provisions are for XML, HTML,
and other structured text.

HL1 is indeed being reworked, but even without that rework, it already provides the necessary leeway for what we want to do.

And please note that if we find out that something in 4.3, Higher-Level Protocols, doesn't work for us, we can always ask for an addition or clarification/correction. For example, in the context of programming languages or HTML/XML, the sentence at the end of 4.3, "When text using a higher-level protocol is to be converted to Unicode plain text, for consistent appearance formatting codes should be inserted to ensure that the order matches that of the higher-level protocol.", may be extremely counterproductive. I have already written to the relevant Unicode mailing list.

So I think we will be fine doing it in Emacs.

1) it is easier for "application-level" emacs-lisp programmers who work
on updating modes to improve bidi display.
2) it is easier for the core implementer(s), i.e. you, because they have
to work with only one algorithm.

I don't intend to change the bidi reordering engine in any significant
way, to support these features.  All that's needed is a possibility to
tell it "restrict yourself to region between buffer positions P1 and
P2".  Actually, it just dawned on me that I can easily do that with
`narrow-to-region', since the reordering engine already honors that:
it never goes out of the accessible portion of text.

I'm not sure I understand, but if it means that the bidi algorithm is just applied piecewise, that won't be enough. It may be enough for some simple cases, such as C programs, where the main concern is to keep text within string constants together, and the rest is ASCII only and therefore goes LTR. On the other hand, with XML markup that has, e.g., element and attribute names in Hebrew, in our experience actual nestings (i.e. embeddings in terms of the bidi algorithm) are highly desirable.
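
As a rough illustration of what nesting buys, here is a simplified Python sketch of how explicit embeddings assign levels. It is loosely based on the explicit embedding rules of the Unicode Bidi algorithm but ignores overrides, the depth limit, and everything after the explicit-level stage, so it is not a conforming implementation; the function name `embedding_levels` is hypothetical.

```python
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def embedding_levels(text, base_level=0):
    """Return (char, level) pairs for the non-formatting characters."""
    stack = []          # levels to restore on PDF
    level = base_level
    result = []
    for ch in text:
        if ch == LRE:
            stack.append(level)
            # Next even level greater than the current one.
            level = level + 2 if level % 2 == 0 else level + 1
        elif ch == RLE:
            stack.append(level)
            # Next odd level greater than the current one.
            level = level + 1 if level % 2 == 0 else level + 2
        elif ch == PDF:
            if stack:   # an unmatched PDF is simply ignored
                level = stack.pop()
        else:
            result.append((ch, level))
    return result


sample = "a" + RLE + "b" + LRE + "c" + PDF + "d" + PDF + "e"
levels = [lvl for _, lvl in embedding_levels(sample)]
```

For `sample`, the levels come out as 0, 1, 2, 1, 0: the "c" sits in a left-to-right embedding inside a right-to-left one. Reordering independent flat pieces cannot express that inner level, which is the limitation described above.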


I think there are also other ways of attacking the problem. What about, for example, a property on characters that increases the embedding level in a certain way? Or a property that changes the bidi category of a character?
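
The second of these could look roughly like the following Python sketch, where a per-character override table plays the role of the property. The function `bidi_class` and the `overrides` mapping are hypothetical; only `unicodedata.bidirectional` is a real standard-library call.

```python
import unicodedata


def bidi_class(ch, overrides=None):
    """Return the bidi category of ch, unless an override is supplied."""
    if overrides and ch in overrides:
        return overrides[ch]
    return unicodedata.bidirectional(ch)


# Treat the markup delimiter '<' as strongly left-to-right
# instead of as a neutral character.
markup_overrides = {"<": "L"}
default_class = bidi_class("<")
overridden_class = bidi_class("<", markup_overrides)
```

With no override, `<` keeps its neutral category from the Unicode database; with the override it is reported as `L`, which would change how the surrounding text resolves.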

Regards,    Martin.


--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:address@hidden


