
Re: [bug-libunistring] UAX #29 changes


From: Ben Pfaff
Subject: Re: [bug-libunistring] UAX #29 changes
Date: Sun, 29 Oct 2017 10:18:32 -0700
User-agent: Mutt/1.5.23 (2014-03-12)

On Thu, Oct 26, 2017 at 04:47:35PM +0200, Daiki Ueno wrote:
> Daiki Ueno <address@hidden> writes:
> 
> > I have been trying to update libunistring to Unicode 9.0.0.  Initially I
> > planned it for the end of this month, but now I'm almost giving up,
> > because of the recent additions to the UAX #29 algorithms:
> >
> > - The 3 rules added to the Grapheme Cluster Boundary Rules, namely
> >   GB10, GB12, and GB13, involve 3 consecutive characters, while the
> >   current API uc_is_grapheme_break() only takes 2 characters
> >
> > - Similar rules were also added to the Word Boundary Rules.  That
> >   wouldn't be an API problem, as uniwbrk.h doesn't expose such a
> >   function, but the implementation of WB15 and WB16 could be
> >   complicated because it requires lookahead to the next character
> 
> As I had some time this week, I resumed this work.  Thanks to the help
> of my colleagues, the above new rules involving 3 or more characters are
> now implemented without breaking the ABI.
> 
> For the Grapheme Cluster Boundary rules, u*_grapheme_breaks have been
> rewritten to be more generic, taking the entire sequence into
> account.  The other API functions are still kept, but have limitations
> due to the number of arguments.
> 
> Bruno, Ben, could you take a look at the attached patch, when you have
> time?
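
For anyone following along, here is a rough sketch of why a two-argument
test cannot express GB12/GB13: in a run of Regional Indicator characters,
whether a break is allowed between two of them depends on how many
Regional Indicators precede the pair.  This is not taken from the patch.
The names is_regional_indicator() and ri_pair_breaks() are hypothetical,
and every pair other than RI x RI is simply treated as a break, which the
real rules do not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libunistring: true for the
   Regional Indicator symbols U+1F1E6..U+1F1FF.  */
static int
is_regional_indicator (uint32_t uc)
{
  return uc >= 0x1F1E6 && uc <= 0x1F1FF;
}

/* Hypothetical sketch of GB12/GB13 only.  Sets p[i] to 1 if a break is
   allowed before s[i], and to 0 otherwise.  Every pair that is not
   RI x RI is treated as a break, a deliberate simplification.  */
static void
ri_pair_breaks (const uint32_t *s, size_t n, char *p)
{
  /* Number of consecutive Regional Indicators ending just before s[i].  */
  size_t ri_run = 0;

  assert (n == 0 || (s != NULL && p != NULL));

  for (size_t i = 0; i < n; i++)
    {
      if (i == 0)
        p[i] = 1;   /* break at start of text */
      else if (is_regional_indicator (s[i]) && ri_run % 2 == 1)
        p[i] = 0;   /* odd run so far: s[i] completes a pair */
      else
        p[i] = 1;

      ri_run = is_regional_indicator (s[i]) ? ri_run + 1 : 0;
    }
}
```

With four Regional Indicators in a row, the pairs (s[1], s[2]) and
(s[2], s[3]) look the same to a function that sees only two characters,
yet only the first of them allows a break.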

I'm impressed.  I have not looked carefully at the whole patch.  That is
partly because of my time constraints, but it is also partly because I
get patch rejects when I apply the patch to the tip of master for
gnulib.  To what commit should I apply the patch?

Thanks,

Ben.


