bug-coreutils

Re: horrible utf-8 performance in wc


From: Bo Borgerson
Subject: Re: horrible utf-8 performance in wc
Date: Thu, 08 May 2008 07:52:20 -0400
User-agent: Thunderbird 2.0.0.12 (X11/20080227)

Pádraig Brady wrote:
> Bo Borgerson wrote:
>> I poked around a little in gnulib and found a function for determining
>> the combining class of a Unicode character.
>>
>> I think the attached patch does what you were intending to do, and it
>> also counts all of the stand-alone zero-width characters you found:
> 
> cool, thanks.
> Could you optimize it though and do the following
> as you've already calculated wcwidth().
> 
>   if (!width && uc_combining_class(wide_char))
>     chars--;

Nice, good idea.
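
For reference, here is a rough sketch of how that check might sit inside a counting loop. It is only an illustration, not the patch itself: demo_width() and demo_combining_class() are hypothetical stand-ins for wcwidth() and gnulib's uc_combining_class(), covering just a handful of code points so the example compiles without any locale setup, and count_chars() is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for wcwidth(): reports 0 columns for a few
   zero-width code points and 1 column for everything else.  */
static int
demo_width (uint32_t cp)
{
  if (cp >= 0x0300 && cp <= 0x0314)   /* combining diacritics */
    return 0;
  if (cp == 0x200B)                   /* ZERO WIDTH SPACE, stand-alone */
    return 0;
  return 1;
}

/* Hypothetical stand-in for uc_combining_class(): nonzero only for
   the combining marks U+0300..U+0314 (canonical class 230).  */
static int
demo_combining_class (uint32_t cp)
{
  return (cp >= 0x0300 && cp <= 0x0314) ? 230 : 0;
}

/* Count characters in an array of code points, skipping combining
   marks.  The class lookup only runs when the width is already known
   to be zero, which is the optimization suggested above.  */
static size_t
count_chars (const uint32_t *cps, size_t n)
{
  size_t chars = 0;
  for (size_t i = 0; i < n; i++)
    {
      int width = demo_width (cps[i]);
      chars++;
      if (!width && demo_combining_class (cps[i]))
        chars--;
    }
  return chars;
}
```

With these stand-ins, the sequence 'e', U+0301, U+200B, 'x' counts as three characters: the combining acute accent is dropped, while the stand-alone zero-width space is still counted.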

I think I may have worded my previous message in a misleading way.  The
intent of the attached patch was not to be a robust solution to the
problem, but rather a demonstration of the function I noticed in gnulib
in case it might be helpful to you.

You definitely seem to know a whole lot more about what's actually
involved here than I do.  I'm just trying to grease the skids. ;)

Bo