
[emacs-bidi] Re: RTL support


From: Gregg Reynolds
Subject: [emacs-bidi] Re: RTL support
Date: Tue, 22 Nov 2005 22:07:53 -0600
User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)

Benjamin Riefenstahl wrote:
Hi Gregg,

Hi Benny,

Thanks for your reasoned reply.  Comments below.

Gregg Reynolds writes:

1.  It was legacy, so Unicode had to support it.  Then they went
   berserk with it.


From my POV, there are very good reasons to consistently encode
characters in the order in which they are written.  You don't want
visual layout for any other operation except display.  You might think
that display is the most important operation on text, but for large
bits of most software it isn't.

Two things. One is, directionality is a design choice, not a reflection of some kind of objective reality. This is obvious if you stare at some RTL text and think for a while. However, the Unicode book claims that RTL languages are "inherently" bidirectional. This is hogwash.

Second, "the order in which [characters] are written" is not relevant to an encoding model. There is no necessary relationship between the IO model implemented by an application and the corresponding textual representation, which is application independent. Specifically, your editor can support data entry of digit strings as either LSD-first or MSD-first, or both. Neither data entry protocol has anything to do with the way the data is encoded in persistent storage. For that matter, the internal encoding of an editor is independent of the data exchange formats it imports and exports. Emacs is a great example of that.
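To make the point about data entry concrete, here is a minimal sketch in Python. It is an illustration only: the function name, the protocol labels, and the choice of MSD-first as the storage order are all assumptions made for the example, not anything Emacs or Unicode defines. It shows two entry protocols producing the same stored digit string.

```python
def enter_digits(keystrokes, protocol):
    """Return the stored digit string for keys typed under an entry protocol.

    "msd-first": keys arrive most significant digit first.
    "lsd-first": keys arrive least significant digit first.
    The storage order used here (MSD-first) is an arbitrary choice for
    this sketch; the point is only that it is fixed independently of
    the order in which the keys were typed.
    """
    if protocol == "msd-first":
        return "".join(keystrokes)
    if protocol == "lsd-first":
        return "".join(reversed(keystrokes))
    raise ValueError(f"unknown protocol: {protocol!r}")


# The same number typed under two different entry protocols.
stored_a = enter_digits(["1", "9", "8", "4"], "msd-first")
stored_b = enter_digits(["4", "8", "9", "1"], "lsd-first")

# Both protocols yield the identical stored representation.
assert stored_a == stored_b
```

Swapping the storage order to LSD-first would change the two branches but not the conclusion: the stored form is a design decision separate from the typing order.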

In other words "reasons to consistently encode characters in the order in which they are written" is essentially meaningless. (I say that as a statement of fact, not as a flame.)


You might think that RTL without bidi would be enough.  But once you
have RTL, it becomes the job of the Unicode standard to define how
mixed content is handled.  Mixed content is after all the driving
force for Unicode in the first place.  I also think that most users

Hmm. I think that's debatable. I think unification of diverse encoding schemes is the primary driver behind Unicode, but that's a digression. More important is that RTL has no necessary relationship to mixed content or bidi reordering. If you only ever write documents in Arabic (Hebrew, Persian, Pashto, whatever) then why do you need bidi? You don't; it's an unfortunate artifact of Western-driven standardization.

To be clear: monolingual Arabic text is not mixed content, whether it contains digit strings or not. So why should an Arabic user pay the Unicode tax of bidi support?

Don't get me wrong, I'm not saying the bidi algorithm is not useful or nice to have. But it's an add-on, not needed by the vast majority of RTL documents produced in the world. Yes, believe it or not, Arabs and other RTL users actually don't need English, any more than we English speakers need Arabic. To this day, scholarly writings about Arabic in English use transliteration. Arabic is quite capable of the same, even for acronyms like IBM or CIA.

It boils down to an economic argument. For Arabic, we need a) RTL layout (a purely graphical matter); and b) shaping. Both of these are (relatively) inexpensive to implement. Support for bidi reordering is a nice enhancement, but it's a) expensive; and b) unnecessary unless you write in two or more languages in the same doc.
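As a rough illustration of why RTL layout alone is cheap, here is a sketch in Python of laying out a single line from the right edge with no reordering step at all. The helper names and the fixed-width cell model are assumptions for the example; shaping, combining marks, and line breaking are deliberately left out. ASCII letters stand in for RTL characters so that the output is not itself reordered by whatever displays it.

```python
def layout_rtl(logical_text, width):
    """Assign each character a column, filling the line from the right edge.

    The first character in storage order goes in the rightmost column,
    the next one to its left, and so on. Nothing is reordered according
    to directional classes; this is pure placement.
    """
    if len(logical_text) > width:
        raise ValueError("text does not fit on one line")
    return [(width - 1 - i, ch) for i, ch in enumerate(logical_text)]


def render(cells, width):
    """Return the line as a string in left-to-right column order."""
    row = [" "] * width
    for col, ch in cells:
        row[col] = ch
    return "".join(row)


cells = layout_rtl("abc", 5)
line = render(cells, 5)
# 'a' lands in column 4, 'b' in column 3, 'c' in column 2.
```

The whole layout step is one index computation per character, which is the sense in which it is a graphical matter rather than a property of the stored text.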

Ask yourself a simple question. Software like Emacs has been around for what, 30 years? It gained support for e.g. Japanese, Korean, etc. years ago. But the 1 billion + people in the world who need RTL support are still waiting. Why is that? IMHO, it's at least partially because of the perceived but false association of RTL and bidi. (I can cite specific examples of vendors declining to support Arabic solely because of the expense of implementing bidi support.) The bidi algorithm is complex and generally yucky. Thought experiment: imagine a world in which nobody would implement English language software unless it had bidi support.

Sincerely,

-gregg



