Re: bidi properties from uniprop tables

From: Stephen J. Turnbull
Subject: Re: bidi properties from uniprop tables
Date: Fri, 19 Aug 2011 18:15:58 +0900

Eli Zaretskii writes:

 > > From: "Stephen J. Turnbull" <address@hidden>
 > > Cc: Kenichi Handa <address@hidden>,
 > >     address@hidden
 > > Date: Fri, 19 Aug 2011 13:44:48 +0900
 > > 
 > >  > I made the code in bidi.c defensive about what it gets from the
 > > 
 > > Maybe that should be an assert, since a null return is an Emacs bug.
 >
 > There's already something that catches such problems, albeit
 > indirectly, and aborts -- that's how I found this in the first place.
 > However, it doesn't make sense to have an assert where the bidi
 > property of a character is looked up as long as we don't make sure
 > this doesn't happen "normally",

The point was to make sure that this doesn't happen normally, as my
understanding was that this is an Emacs bug.

 > because having such an assert now will cause a predictable crash
 > when moving in a buffer created by describe-categories.  People use
 > the development version for their day-to-day work, you know...

Of course I know that; I run with asserts enabled in several of my
mission-critical applications (but save early and often, and have a
stable version ready to substitute).  Doesn't everybody?  So obviously
you fix the bug first, then add the assert.

 > >  > uniprop table, but the question is, should we do something to never
 > >  > have nil in Lisp or zero in C return from these APIs?
 > > 
 > > Yes, a non-nil property list is required by the standard for all code
 > > points
 >
 > It's not a property list, it's a single property whose value is a
 > symbol that shouldn't be nil.  See get-char-code-property.

Then no, you shouldn't do that for all properties, because not all
properties are defined for all characters.  Given that this is Lisp,
it should be possible to discover that from the value returned.

Some properties, however, are defined for all code points.  If the
properties in question are defined for all code points or all
characters (presumably Emacs should never allow a non-character code
point in a string or buffer?), then it's an Emacs bug in
get-char-code-property (or in the underlying table) if it returns nil.

If they aren't, then IMO it's a bug in the calling code that it's not
prepared for a null return, and your "defensive code" in bidi.c
is a correct bug fix.
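
To make the two cases concrete, here is a minimal sketch in C.  It is
not the code in bidi.c: the enum, the toy table, and all function
names are hypothetical, and 0 stands in for the nil/zero return being
discussed.

```c
#include <assert.h>

/* Hypothetical bidi classes.  0 is reserved to mean "the table had no
   entry", mirroring a nil (Lisp) or zero (C) return from a lookup.  */
enum bidi_class { BIDI_UNKNOWN = 0, BIDI_L, BIDI_R, BIDI_AL };

/* Toy stand-in for a property table; it covers only a few code points.  */
static enum bidi_class
table_lookup (unsigned int c)
{
  if (c >= 0x41 && c <= 0x5A)      /* LATIN CAPITAL LETTER A..Z */
    return BIDI_L;
  if (c >= 0x5D0 && c <= 0x5EA)    /* HEBREW LETTER ALEF..TAV */
    return BIDI_R;
  return BIDI_UNKNOWN;
}

/* Case 1: the property is NOT defined for every character, so the
   caller must be prepared for a missing value and pick a fallback.  */
static enum bidi_class
lookup_defensive (unsigned int c, enum bidi_class fallback)
{
  enum bidi_class t = table_lookup (c);
  return t == BIDI_UNKNOWN ? fallback : t;
}

/* Case 2: the property IS defined for every character, so a missing
   value is a bug in the table, and an assert is the right tool (once
   the table has actually been fixed).  */
static enum bidi_class
lookup_strict (unsigned int c)
{
  enum bidi_class t = table_lookup (c);
  assert (t != BIDI_UNKNOWN);
  return t;
}
```

With this toy table, lookup_defensive (0x3042, BIDI_L) quietly returns
BIDI_L, while lookup_strict (0x3042) would abort.  That abort is why
the table has to be fixed before the assert goes in.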

With respect to the Bidi_Class property, UAX#9 says:

    Unassigned characters are given strong types in the
    algorithm. This is an explicit exception to the general Unicode
    conformance requirements with respect to unassigned characters. As
    characters become assigned in the future, these bidirectional
    types may change. For assignments to character types, see
    DerivedBidiClass.txt [DerivedBIDI] in the [UCD].

Since, AFAIK, Bidi_Class is used only in this algorithm (and in
explicit property lookups), it seems reasonable to me that
get-char-code-property et amis should return the "strong type"
specified by DerivedBIDI (which seems to be L, i.e. left-to-right, by
default, with R or AL for unassigned code points in some
right-to-left ranges, but you should check that).
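
A rough sketch of what "return the strong type" could look like in C
follows.  All names are hypothetical.  The two ranges are the defaults
DerivedBidiClass.txt gives for the Hebrew block and for the Arabic
through Thaana blocks; the table is deliberately incomplete and would
have to be checked against the UCD version actually in use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bidi classes; 0 means "no value recorded".  */
enum bidi_class { BIDI_UNKNOWN = 0, BIDI_L, BIDI_R, BIDI_AL };

struct default_range
{
  unsigned int first, last;   /* inclusive code point range */
  enum bidi_class cls;        /* default class for unassigned code points */
};

/* Illustrative subset only; see DerivedBidiClass.txt for the full list.  */
static const struct default_range default_ranges[] = {
  { 0x0590, 0x05FF, BIDI_R },    /* Hebrew block */
  { 0x0600, 0x07BF, BIDI_AL },   /* Arabic, Syriac, Arabic Supplement, Thaana */
};

/* Default strong type for a code point that has no explicit entry.  */
static enum bidi_class
default_bidi_class (unsigned int c)
{
  size_t i;

  assert (c <= 0x10FFFF);        /* must be a Unicode code point */
  for (i = 0; i < sizeof default_ranges / sizeof default_ranges[0]; i++)
    if (c >= default_ranges[i].first && c <= default_ranges[i].last)
      return default_ranges[i].cls;
  return BIDI_L;                 /* everything else defaults to L */
}

/* Turn a possibly missing lookup result into a value that is never
   BIDI_UNKNOWN, so callers never see the nil/zero case.  */
static enum bidi_class
bidi_class_or_default (enum bidi_class looked_up, unsigned int c)
{
  return looked_up != BIDI_UNKNOWN ? looked_up : default_bidi_class (c);
}
```

If the lookup API applied something like bidi_class_or_default
itself, the callers would not each need their own fallback.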
