Re: bidi properties from uniprop tables
From: Kenichi Handa
Subject: Re: bidi properties from uniprop tables
Date: Sat, 20 Aug 2011 21:42:20 +0900
In article <address@hidden>, Eli Zaretskii <address@hidden> writes:
> > Since Bidi_Class is only used in this algorithm (and explicit property
> > lookups) AFAIK
> That's not true, it is also used in regexp search by category. So we
> should decide whether to assign these types in the uniprop table, or
> have a fallback for them in bidi.c. Any opinions? Handa-san?
As I'm on vacation now, I can't access the Emacs sources,
but I remember that an element of unidata-SOMETHING-alist
(I don't remember what SOMETHING is) has a place to specify
the default property value. So it should be easy to fix the
default value if it is a simple one. But the current code
doesn't handle a non-simple default value such as the one
described below.
> > it seems reasonable to me that get-char-code-property
> > et amis should return the "strong type" specified by DerivedBIDI
> > (which is LTR it seems, but you should check that).
> No, the type depends on the block:
> # Unlike other properties, unassigned code points in blocks
> # reserved for right-to-left scripts are given either types R or AL.
> #
> # The unassigned code points that default to AL are in the ranges:
> # [\u0600-\u07BF \uFB50-\uFDFF \uFE70-\uFEFF]
> #
> # Arabic: U+0600 - U+06FF
> # Syriac: U+0700 - U+074F
> # Arabic_Supplement: U+0750 - U+077F
> # Thaana: U+0780 - U+07BF
> # Arabic_Presentation_Forms_A:
> # U+FB50 - U+FDFF
> # Arabic_Presentation_Forms_B:
> # U+FE70 - U+FEFF
> # minus noncharacter code points.
> #
> # The unassigned code points that default to R are in the ranges:
> # [\u0590-\u05FF \u07C0-\u08FF \uFB1D-\uFB4F \U00010800-\U00010FFF
> #  \U0001E800-\U0001EFFF]
> #
> # Hebrew: U+0590 - U+05FF
> # NKo: U+07C0 - U+07FF
> # Cypriot_Syllabary: U+10800 - U+1083F
> # Phoenician: U+10900 - U+1091F
> # Lydian: U+10920 - U+1093F
> # Kharoshthi: U+10A00 - U+10A5F
> # and any others in the ranges:
> # U+0800 - U+08FF,
> # U+FB1D - U+FB4F,
> # U+10840 - U+10FFF,
> # U+1E800 - U+1EFFF
> #
> # For all other cases:
> # All code points not explicitly listed for Bidi_Class
> # have the value Left_To_Right (L).
I'll fix the code to handle it when I'm back at work next
Monday.
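
For illustration only, the block-dependent defaults quoted above can be sketched as a small range lookup in C. This is not the code in bidi.c or in the uniprop tables: every name here (default_bidi_class, al_ranges, r_ranges, and so on) is hypothetical. The ranges are copied from the quoted comment. The quote excludes noncharacters from the AL default without saying what they get instead, so the sketch reports them separately and leaves them to the explicit table entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; not taken from bidi.c or the uniprop tables. */
enum default_bidi_class {
  DEFAULT_BIDI_L,           /* Left_To_Right, the general default */
  DEFAULT_BIDI_R,           /* Right_To_Left */
  DEFAULT_BIDI_AL,          /* Arabic_Letter */
  DEFAULT_BIDI_UNSPECIFIED  /* noncharacter inside an AL range */
};

struct cp_range { uint32_t first, last; };

/* Unassigned code points in these ranges default to AL. */
static const struct cp_range al_ranges[] = {
  { 0x0600, 0x07BF },   /* Arabic, Syriac, Arabic_Supplement, Thaana */
  { 0xFB50, 0xFDFF },   /* Arabic_Presentation_Forms_A */
  { 0xFE70, 0xFEFF },   /* Arabic_Presentation_Forms_B */
};

/* Unassigned code points in these ranges default to R. */
static const struct cp_range r_ranges[] = {
  { 0x0590, 0x05FF },   /* Hebrew */
  { 0x07C0, 0x08FF },   /* NKo and the rest up to U+08FF */
  { 0xFB1D, 0xFB4F },
  { 0x10800, 0x10FFF },
  { 0x1E800, 0x1EFFF },
};

static int
in_ranges (uint32_t c, const struct cp_range *r, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (c >= r[i].first && c <= r[i].last)
      return 1;
  return 0;
}

/* Noncharacters: U+FDD0..U+FDEF and the last two code points
   of every plane (U+xxFFFE and U+xxFFFF).  */
static int
is_noncharacter (uint32_t c)
{
  return (c >= 0xFDD0 && c <= 0xFDEF) || (c & 0xFFFE) == 0xFFFE;
}

/* Fallback for a code point that has no explicit Bidi_Class entry. */
static enum default_bidi_class
default_bidi_class (uint32_t c)
{
  if (in_ranges (c, al_ranges, sizeof al_ranges / sizeof al_ranges[0]))
    return is_noncharacter (c) ? DEFAULT_BIDI_UNSPECIFIED : DEFAULT_BIDI_AL;
  if (in_ranges (c, r_ranges, sizeof r_ranges / sizeof r_ranges[0]))
    return DEFAULT_BIDI_R;
  return DEFAULT_BIDI_L;
}
```

A lookup like this would only be consulted after the explicit per-character data has been checked, which is why a single "simple" default value per property is not enough for Bidi_Class.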
---
Kenichi Handa
address@hidden