classpath-patches

Re: [cp-patches] FYi: Implement (naive) Bidi.requiresBidi() workaround


From: Mark Wielaard
Subject: Re: [cp-patches] FYi: Implement (naive) Bidi.requiresBidi() workaround
Date: Sat, 24 Dec 2005 11:10:00 +0100

Hi Tom,

On Fri, 2005-12-23 at 14:53 -0700, Tom Tromey wrote:
> >>>>> "Mark" == Mark Wielaard <address@hidden> writes:
> 
> Mark>    * Returns false if all characters in the text between start and end
> Mark>    * are all left-to-right text.  WARNING, this implementation is
> Mark>    * slow, it calls <code>Character.getDirectionality(char)</code> on
> Mark>    * all characters.
> 
> I'm not sure that this is particularly slow.
> Character.getDirectionality mostly does bit-twiddling and array lookups.

I see, this is indeed faster than I imagined; I thought we were doing a
little search for each char.

> Mark>         if (Character.getDirectionality(c) != LEFT_TO_RIGHT)
> Mark>           return true;
> 
> I think there are a number of directionality-neutral characters as
> well.  For instance a paragraph separator character is neutral, and if
> one is seen, IMO, this function should not return true.

I think that since the bidi algorithm works on a single paragraph, you
will have to do the full analysis whenever the text contains a paragraph
separator.

But you are right that we could/should probably add most things from the
weak and neutral category, which we know won't "disrupt"
left-to-rightness:

DIRECTIONALITY_EUROPEAN_NUMBER (EN)
DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR (ES)
DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR (ET)
DIRECTIONALITY_ARABIC_NUMBER (AN)
DIRECTIONALITY_COMMON_NUMBER_SEPARATOR (CS)
DIRECTIONALITY_SEGMENT_SEPARATOR (S)
DIRECTIONALITY_WHITESPACE (WS)

I am not sure we should test for the others. I have been conservative
with the above list (just so I don't have to read the whole bidi
algorithm description). The idea behind requiresBidi() is that it is a
quick way to determine whether to do full bidirectional analysis or not
(or actually if the whole paragraph text is written left-to-right). So
false positives aren't really a problem. It just means that you have to
follow the full algorithm to get the full answer.
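
For illustration, here is a minimal sketch of the conservative check
described above, assuming the list of "safe" categories given in this
message. The class name NaiveBidiCheck and the helper isLtrSafe are
hypothetical and not taken from the actual patch; only
Character.getDirectionality(char) and the DIRECTIONALITY_* constants are
standard API.

```java
public final class NaiveBidiCheck {
    private NaiveBidiCheck() {}

    // Hypothetical helper: true for the directionality classes listed in
    // this message as not disrupting left-to-rightness, plus L itself.
    static boolean isLtrSafe(byte dir) {
        switch (dir) {
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
            case Character.DIRECTIONALITY_EUROPEAN_NUMBER:
            case Character.DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR:
            case Character.DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR:
            case Character.DIRECTIONALITY_ARABIC_NUMBER:
            case Character.DIRECTIONALITY_COMMON_NUMBER_SEPARATOR:
            case Character.DIRECTIONALITY_SEGMENT_SEPARATOR:
            case Character.DIRECTIONALITY_WHITESPACE:
                return true;
            default:
                // Everything else (R, AL, ON, paragraph separators,
                // explicit embeddings, ...) is treated conservatively.
                return false;
        }
    }

    // Returns true as soon as one character in [start, end) falls outside
    // the safe list, meaning the caller should run the full analysis.
    public static boolean requiresBidi(char[] text, int start, int end) {
        for (int i = start; i < end; i++) {
            if (!isLtrSafe(Character.getDirectionality(text[i])))
                return true;
        }
        return false;
    }

    public static void main(String[] args) {
        char[] plain = "abc 123".toCharArray();
        char[] hebrew = "abc \u05D0".toCharArray();
        System.out.println(requiresBidi(plain, 0, plain.length));
        System.out.println(requiresBidi(hebrew, 0, hebrew.length));
    }
}
```

Note that a string such as "a!b" also makes this sketch return true,
because '!' is in the other-neutrals category, which is not on the list.
That is exactly the kind of false positive that is acceptable here.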

>   Also see what the (java) spec has to say about arabic presentation
> forms -- they aren't considered to require bidi treatment.

That is just weird. Most of those characters have directionality
DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC (AL) or
DIRECTIONALITY_OTHER_NEUTRALS (ON) or are as yet undefined. Why would you
assume these are written left-to-right?
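
One way to check that claim is to tally the directionality classes over
the Arabic Presentation Forms blocks. The sketch below is only an
illustration: the class name PresentationFormTally and the method tally
are hypothetical, and the counts it prints depend on the Unicode version
of the runtime, so no expected output is given.

```java
import java.util.Map;
import java.util.TreeMap;

public final class PresentationFormTally {
    private PresentationFormTally() {}

    // Counts how many code points in [first, last] fall into each
    // directionality class, keyed by the Character.DIRECTIONALITY_* value.
    static Map<Byte, Integer> tally(int first, int last) {
        Map<Byte, Integer> counts = new TreeMap<>();
        for (int cp = first; cp <= last; cp++) {
            counts.merge(Character.getDirectionality(cp), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Arabic Presentation Forms-A: U+FB50..U+FDFF
        System.out.println("Forms-A: " + tally(0xFB50, 0xFDFF));
        // Arabic Presentation Forms-B: U+FE70..U+FEFF
        System.out.println("Forms-B: " + tally(0xFE70, 0xFEFF));
    }
}
```

In the printed maps, the key for DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC is
the one to compare against DIRECTIONALITY_OTHER_NEUTRALS and
DIRECTIONALITY_UNDEFINED.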

Cheers,

Mark
