Re: [cp-patches] FYi: Implement (naive) Bidi.requiresBidi() workaround
From: Mark Wielaard
Subject: Re: [cp-patches] FYi: Implement (naive) Bidi.requiresBidi() workaround
Date: Sat, 24 Dec 2005 11:10:00 +0100
Hi Tom,
On Fri, 2005-12-23 at 14:53 -0700, Tom Tromey wrote:
> >>>>> "Mark" == Mark Wielaard <address@hidden> writes:
>
> Mark> * Returns false if all characters in the text between start and end
> Mark> * are all left-to-right text. WARNING, this implementation is
> Mark> * slow, it calls <code>Character.getDirectionality(char)</code> on
> Mark> * all characters.
>
> I'm not sure that this is particularly slow.
> Character.getDirectionality mostly does bit-twiddling and array lookups.
I see, this is indeed faster than I imagined. I thought we were doing a
little search for each char.
> Mark> if (Character.getDirectionality(c) != LEFT_TO_RIGHT)
> Mark> return true;
>
> I think there are a number of directionality-neutral characters as
> well. For instance a paragraph separator character is neutral, and if
> one is seen, IMO, this function should not return true.
I think that since the bidi algorithm works on a single paragraph, you
will have to do the analysis anyway when the text contains a paragraph
separator. But you are right that we could/should probably add most
things from the weak and neutral categories, which we know won't
"disrupt" left-to-rightness:
DIRECTIONALITY_EUROPEAN_NUMBER (EN)
DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR (ES)
DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR (ET)
DIRECTIONALITY_ARABIC_NUMBER (AN)
DIRECTIONALITY_COMMON_NUMBER_SEPARATOR (CS)
DIRECTIONALITY_SEGMENT_SEPARATOR (S)
DIRECTIONALITY_WHITESPACE (WS)
I am not sure we should test for the others. I have been conservative
with the above list (just so I don't have to read the whole bidi
algorithm description). The idea behind requiresBidi() is that it is a
quick way to determine whether to do full bidirectional analysis or not
(or, more precisely, whether the whole paragraph text is written
left-to-right). So false positives aren't really a problem. A false
positive just means that you have to follow the full algorithm to get
the full answer.
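A minimal sketch of that conservative check, assuming the
requiresBidi(char[], int, int) signature from java.text.Bidi. The class
name BidiCheckSketch is hypothetical and this is not the committed patch.
Anything outside LEFT_TO_RIGHT plus the seven categories listed above is
treated as needing the full analysis:

```java
public class BidiCheckSketch {
    /**
     * Conservative check: returns true when any character between start
     * (inclusive) and end (exclusive) has a directionality outside the
     * "known harmless" list, meaning the full bidi algorithm should run.
     */
    static boolean requiresBidi(char[] text, int start, int end) {
        for (int i = start; i < end; i++) {
            switch (Character.getDirectionality(text[i])) {
                case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
                case Character.DIRECTIONALITY_EUROPEAN_NUMBER:
                case Character.DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR:
                case Character.DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR:
                case Character.DIRECTIONALITY_ARABIC_NUMBER:
                case Character.DIRECTIONALITY_COMMON_NUMBER_SEPARATOR:
                case Character.DIRECTIONALITY_SEGMENT_SEPARATOR:
                case Character.DIRECTIONALITY_WHITESPACE:
                    break; // assumed not to disrupt left-to-rightness
                default:
                    return true; // possibly a false positive, which is acceptable
            }
        }
        return false;
    }

    public static void main(String[] args) {
        char[] latin = "Hello, world 123".toCharArray();
        char[] mixed = "a\u05D0b".toCharArray(); // contains HEBREW LETTER ALEF
        System.out.println(requiresBidi(latin, 0, latin.length));
        System.out.println(requiresBidi(mixed, 0, mixed.length));
    }
}
```

With this list a paragraph separator or an "other neutral" such as '!'
still makes the method return true, which matches the "false positives
are fine" reasoning above.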
> Also see what the (java) spec has to say about arabic presentation
> forms -- they aren't considered to require bidi treatment.
That is just weird. Most of those characters have directionality
DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC (AL) or
DIRECTIONALITY_OTHER_NEUTRALS (ON) or are as yet undefined. Why would
you assume these are written left-to-right?
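For anyone who wants to check this against their own runtime, here is a
rough sketch that tallies the directionality values over the two Arabic
Presentation Forms blocks (U+FB50..U+FDFF and U+FE70..U+FEFF). The class
and method names are hypothetical, and the counts depend on the Unicode
tables the runtime ships, so no expected output is given:

```java
import java.util.Map;
import java.util.TreeMap;

public class PresentationFormsTally {
    /** Counts code points per directionality value in [first, last]. */
    static Map<Byte, Integer> tally(int first, int last) {
        Map<Byte, Integer> counts = new TreeMap<>();
        for (int cp = first; cp <= last; cp++) {
            byte d = Character.getDirectionality(cp);
            counts.merge(d, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Keys are the byte values of the Character.DIRECTIONALITY_* constants,
        // e.g. DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC or DIRECTIONALITY_UNDEFINED.
        System.out.println("Forms-A: " + tally(0xFB50, 0xFDFF));
        System.out.println("Forms-B: " + tally(0xFE70, 0xFEFF));
    }
}
```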
Cheers,
Mark