Re: [emacs-bidi] Where do I start?
From: Eli Zaretskii
Subject: Re: [emacs-bidi] Where do I start?
Date: Wed, 07 Nov 2001 19:04:04 +0200
> From: Alex Schroeder <address@hidden>
> Date: Wed, 07 Nov 2001 16:04:19 +0100
>
> Eli Zaretskii <address@hidden> writes:
>
> > IIRC, Ehud's hebeng.el has a function for something like that, so
> > perhaps most (or even all) of the job is already done. But that's
> > part of the research.
>
> What I found is the following (and related functions).
>
> ELISP> (winvert-string
I'm not sure this is the one, but I'm sure Ehud will tell ;-)
> Some more magic would be required to add the right spaces. The simple
> solutions of replacing "\n" with "\n " or " \n" don't work because
> that results in either two or no spaces.
Sorry, I don't see the problem. If we forget that winvert-list
exists, do you see any special problems with handling a newline?
> Related question: The classification of characters takes place in the
> following lookup function:
>
> (defun get-bidi-type (char)
> "Return the bidi type of the given CHAR.
> It may be A, B, D, I, L, N, R, space or -.
> See help for `hebrew-english-bidi-type'."
[...]
> This should probably be extended.
This is not the UAX#9 classification. I think you should try to work
with what UAX#9 defines, and only add more classes if needed.

The data structure to hold this should probably be a char-table of
some kind, since a string that Ehud used is not an efficient storage
for large sparse arrays. It is good for a small contiguous set of
characters, such as Hebrew, but doesn't scale up well if you add
Arabic and other bidi scripts.
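A minimal sketch of such a char-table follows. It assumes a recent Emacs in which characters are Unicode code points, not the Mule codes discussed in this thread. The names `my-bidi-type-table' and `my-bidi-type' are hypothetical, the ranges cover only a handful of UAX#9 classes, and defaulting every unlisted character to L is a simplification.

```elisp
;; Hypothetical table: maps a character to a UAX#9 bidi class symbol.
;; Unlisted characters default to L, which is a simplification.
(defvar my-bidi-type-table
  (let ((table (make-char-table 'my-bidi-type 'L)))
    (set-char-table-range table '(#x05D0 . #x05EA) 'R)   ; Hebrew letters alef..tav
    (set-char-table-range table '(#x0627 . #x064A) 'AL)  ; Arabic letters alef..yeh
    (set-char-table-range table '(?0 . ?9) 'EN)          ; European digits
    (set-char-table-range table '(#x0660 . #x0669) 'AN)  ; Arabic-Indic digits
    (set-char-table-range table ?\s 'WS)                 ; space
    (set-char-table-range table ?\n 'B)                  ; paragraph separator
    table)
  "Sketch of a bidi class table covering a few UAX#9 classes.")

(defun my-bidi-type (char)
  "Return the bidi class symbol recorded for CHAR in the sketch table."
  (aref my-bidi-type-table char))
```

Because a char-table stores whole ranges compactly, adding another script is one more `set-char-table-range' call; no large string has to be kept in step with character codes.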
> Where can I find a list of mule characters?
Mule simply takes an iso8859-x set (x=8 for Hebrew, 6 for Arabic) and
adds a constant offset. So to get a map of Hebrew Mule characters,
you need a full iso8859-8 list. You can find one here:
http://www.qsm.co.il/Hebrew/ab.htm
This site includes Hebrew diacriticals and even directional format
codes (RLO etc.). To get the Mule codepoints, add the result of
(- (make-char 'hebrew-iso8859-8) 32) to the iso8859-8 code.
> Do you think that the classification provided by get-bidi-type goes
> far enough?
I think we should begin with what UAX#9 defines.
> What table did you use for your algorithm -- did you just
> fake it, ie. use a very small table for testing purposes?
The tables to hold this information have not been fully designed yet. For
now, I use a C switch statement, which is sufficient for testing
purposes.
I think the data structure to hold this, at least on the Lisp level,
will be some kind of char-table. (It's possible that, for
efficiency, I will write code to process the char-table into a
bitmapped array, of the kind the standard C ctype functions, such as
`isalpha' and `ispunct', use.)
Since you are doing this in Lisp, I think a char-table should be good
enough for now.
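To illustrate the ctype-style idea, here is a rough Lisp sketch of flattening a char-table into a vector of bit flags. It is not the C code mentioned above. The flag values and the names `my-bidi-flags', `my-bidi-flatten' and `my-bidi-strong-p' are all hypothetical, and only four classes are handled.

```elisp
;; Hypothetical bit flags, one per bidi class, in the spirit of the
;; masks behind C's isalpha/ispunct.
(defconst my-bidi-flags '((L . 1) (R . 2) (EN . 4) (WS . 8)))

(defun my-bidi-flatten (table limit)
  "Return a vector of LIMIT flag words built from char-table TABLE.
Characters whose class is not in `my-bidi-flags' get 0."
  (let ((flags (make-vector limit 0)))
    (dotimes (c limit)
      (aset flags c (or (cdr (assq (aref table c) my-bidi-flags)) 0)))
    flags))

(defun my-bidi-strong-p (flags char)
  "Non-nil if CHAR is within FLAGS and has a strong type (L or R)."
  (and (< char (length flags))
       ;; One mask test answers a question about a group of classes.
       (/= 0 (logand (aref flags char) (logior 1 2)))))

;; Usage: build a small char-table, then flatten everything below U+0600.
(defvar my-bidi-flag-vector
  (let ((table (make-char-table 'my-bidi-type 'L)))
    (set-char-table-range table '(#x05D0 . #x05EA) 'R)
    (set-char-table-range table '(?0 . ?9) 'EN)
    (set-char-table-range table ?\s 'WS)
    (my-bidi-flatten table #x0600)))
```

Each character has only one bidi type, so the benefit of bit flags is that a test such as "is this a strong character?" becomes a single mask operation. The flat vector trades memory for speed and only pays off over a bounded range of codes.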
> Using capitalization would have been a good idea to test your code --
> you could have taken the test cases verbatim from the report.
That's what I did. But most of my test cases are from the FriBidi
distribution's test suite. Some others I added as I found bugs and
debugged them.
> > - It would be nice if converting from logical to visual and then
> > back would be as close to the original as possible. I think you
> > should be able to reproduce the original exactly if it contains no
> > explicit formatting codes; otherwise, you can't.
>
> I agree. Just to make sure I understand you correctly: the visual
> format never includes any directional formatting codes, right?
Yes, the visual-order text has no directional formatting codes.