discuss-gnustep
Re: NSString lowercaseString


From: Sebastian Reitenbach
Subject: Re: NSString lowercaseString
Date: Tue, 31 Jul 2012 19:27:59 +0200
User-agent: SOGoMail 1.3.17

On Tuesday, July 31, 2012 19:06 CEST, David Chisnall <theraven@sucs.org> wrote:

> Are you using GNUstep with or without ICU?  When you say skipped, is it 
> removed from the destination, or just passed through unmodified?  Is your 
> locale set to something that recognises letters with umlauts?

It's with ICU, and I run OGo with
LC_CTYPE='de_DE.UTF-8'
so the locale should recognise umlauts.

I added some NSLog calls to the lowercase code in GSString. Without my patch, 
tolower() returns 0 for an umlaut, so the character is not really skipped: 
o->_contents.c[i] is set to 0 in the middle of the string :(

My patch just checks whether tolower() returned 0 and, if so, passes the 
character it cannot handle through unchanged.
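
To make the idea concrete outside of GSString, the fallback can be sketched in 
plain C as below. This is only an illustration under assumptions, not the 
GNUstep code: lowercase_bytes() and its parameters are hypothetical names, and 
the buffer is assumed to be 8-bit like _contents.c. It does not explain why 
tolower() produces 0 here; it only shows the "keep the original byte" fallback.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of GNUstep: lowercases len bytes from src
 * into dst. If the lowercased value stored as a byte would be 0, the
 * original byte is copied through instead, so no NUL appears in the
 * middle of the output. */
static void
lowercase_bytes(unsigned char *dst, const unsigned char *src, size_t len)
{
  size_t i = len;

  while (i-- > 0)
    {
      /* src[i] is an unsigned char, which is a valid argument for tolower(). */
      unsigned char lowered = (unsigned char)tolower(src[i]);

      dst[i] = (lowered == 0) ? src[i] : lowered;
    }
}
```

Called on the UTF-8 bytes of "TöST" in the default "C" locale, this lowercases 
the ASCII letters and leaves the two bytes of the ö alone. It works byte by 
byte, so it never lowercases a non-ASCII character such as Ö; it only avoids 
damaging it.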

The following ICU version is installed:
$ pkg_info | grep icu4c
icu4c-4.8.1.1       International Components for Unicode

GNUstep is from the latest releases, using libobjc from gcc 4.2.1, if that 
matters.

Sebastian


>
> David
>
> On 31 Jul 2012, at 18:02, Sebastian Reitenbach wrote:
>
> > Hi,
> >
> > with OGo, I convert a UTF-8 string to lowercase, using [NSString 
> > lowercaseString]
> >
> > when there are Umlauts in the string, then GNUstep just omits the character.
> > I've no idea, whether this is right or wrong actually.
> >
> > With the attached patch below to GSString it does not omit the character 
> > anymore.
> >
> >
> > gcc -fgnu-runtime -fconstant-string-class=NSConstantString 
> > -I/usr/local/include -L/usr/local/lib -l gnustep-base lowercase.m -o 
> > lowercase
> >
> > cat lowercase.m
> > #import <Foundation/Foundation.h>
> >
> >
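> > int main(int argc, char *argv[]) {
> >        /* Pool for the autoreleased strings (no ARC with gcc 4.2.1). */
> >        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
> >        NSLog(@"Lowercase: %@", [[NSString stringWithString:@"Töst"] 
> > lowercaseString]);
> >        [pool release];
> >        return 0;
> > }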
> >
> >
> >
> > Does running the above program on a Mac output the ö or omit it from the 
> > string?
> >
> > does it change when running with LC_CTYPE="C" or LC_CTYPE='de_DE.UTF-8' ?
> >
> > I don't have a Mac, so I cannot test this myself. Maybe the approach used 
> > by OGo is wrong, too.
> > At least the Apple docs say nothing about skipped characters,
> > only that e.g. a ß may change to SS when using uppercaseString.
> > Since they mention the ß in the documentation, I'd expect 
> > lowercaseString to handle umlauts too, or is that just a plainly wrong 
> > assumption?
> >
> > if someone could hit me with a cluestick please ;)
> >
> > cheers,
> > Sebastian
> >
> > the patch to not omit Umlauts.
> > $OpenBSD$
> > --- Source/GSString.m.orig  Tue Jul 31 18:31:36 2012
> > +++ Source/GSString.m       Tue Jul 31 18:32:24 2012
> > @@ -3699,6 +3700,8 @@ agree, create a new GSCInlineString otherwise.
> >   while (i-- > 0)
> >     {
> >       o->_contents.c[i] = tolower(_contents.c[i]);
> > +      if (o->_contents.c[i] == 0)
> > +   o->_contents.c[i] = _contents.c[i];
> >     }
> >   o->_flags.wide = 0;
> >   o->_flags.owned = 1;      // Ignored on dealloc, but means we own buffer
> >
>
> --
> This email complies with ISO 3103
>






