NSString lowercaseString
From: Sebastian Reitenbach
Subject: NSString lowercaseString
Date: Tue, 31 Jul 2012 19:02:07 +0200
User-agent: SOGoMail 1.3.17
Hi,
with OGo, I convert a UTF-8 string to lowercase using -[NSString
lowercaseString].
When the string contains umlauts, GNUstep simply omits those characters.
I have no idea whether this is actually right or wrong.
With the patch to GSString attached below, the characters are no longer
omitted.
gcc -fgnu-runtime -fconstant-string-class=NSConstantString -I/usr/local/include \
    -L/usr/local/lib -l gnustep-base lowercase.m -o lowercase
cat lowercase.m
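#import <Foundation/Foundation.h>

int main(int argc, char *argv[]) {
  NSAutoreleasePool *pool = [NSAutoreleasePool new];
  // Lowercase a string containing an umlaut and log the result.
  NSLog(@"Lowercase: %@", [[NSString stringWithString:@"Töst"] lowercaseString]);
  [pool release];
  return 0;
}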
Does running the above program on a Mac output the ö, or omit it from the string?
Does the result change when running with LC_CTYPE="C" or LC_CTYPE='de_DE.UTF-8'?
I don't have a Mac, so I cannot test this myself. The approach used by OGo
could also be wrong.
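
For anyone who wants to check the LC_CTYPE question at the C library level, independent of GNUstep, here is a minimal sketch. lower_byte_in_locale is a hypothetical helper name, not anything from GNUstep or OGo, and which locale names are installed (e.g. de_DE.ISO8859-1) is an assumption about the test machine. The result for bytes above 127 can differ between C libraries, so none is claimed here.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>

/* Hypothetical helper: returns tolower() of a single byte under the given
 * LC_CTYPE locale, or -1 if that locale cannot be selected.
 * Note: setlocale() changes the locale for the whole process. */
static int lower_byte_in_locale(const char *locale_name, unsigned char byte)
{
  assert(locale_name != NULL);
  if (setlocale(LC_CTYPE, locale_name) == NULL)
    return -1;            /* locale not available on this system */
  /* byte is unsigned char, so the value is in the range tolower() accepts. */
  return tolower(byte);
}
```

Calling lower_byte_in_locale("C", 0xD6) and lower_byte_in_locale("de_DE.ISO8859-1", 0xD6) and comparing the two return values would show whether the locale changes the mapping of a Latin-1 Ö on a given system.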
The Apple docs, at least, say nothing about skipped characters,
only that e.g. a ß may change to SS when using uppercaseString.
Since they mention the ß in the documentation, I'd expect lowercaseString
to handle umlauts too, or is that just a plain wrong assumption?
If someone could hit me with a cluestick, please ;)
cheers,
Sebastian
The patch to not omit umlauts:
$OpenBSD$
--- Source/GSString.m.orig Tue Jul 31 18:31:36 2012
+++ Source/GSString.m Tue Jul 31 18:32:24 2012
@@ -3699,6 +3700,8 @@ agree, create a new GSCInlineString otherwise.
while (i-- > 0)
{
o->_contents.c[i] = tolower(_contents.c[i]);
+ if (o->_contents.c[i] == 0)
+ o->_contents.c[i] = _contents.c[i];
}
o->_flags.wide = 0;
o->_flags.owned = 1; // Ignored on dealloc, but means we own buffer
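
The idea of the patch can be tried outside GNUstep with a small standalone sketch in plain C. lower_bytes_keep_unmapped is a hypothetical name, and the src and dst buffers stand in for _contents.c and o->_contents.c. One difference from the patch, which is my assumption about a possible contributing factor and not a confirmed diagnosis: the byte is cast to unsigned char first, because passing a negative char value to tolower() is undefined behaviour in C.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GNUstep: lowercases len bytes from src
 * into dst, keeping the original byte whenever tolower() yields 0. */
static void lower_bytes_keep_unmapped(const char *src, char *dst, size_t len)
{
  size_t i = len;

  assert(src != NULL && dst != NULL);
  while (i-- > 0)
    {
      /* Cast first: tolower() is only defined for unsigned char values
       * and EOF, and plain char may be signed. */
      unsigned char in = (unsigned char)src[i];
      int out = tolower(in);

      /* Same fallback as the patch: never store a 0 for a non-zero input. */
      dst[i] = (out == 0) ? (char)in : (char)out;
    }
}
```

With the UTF-8 bytes of "Töst" as input (T, 0xC3, 0xB6, s, t), the two bytes of the ö should come through unchanged as long as tolower() either leaves them alone or returns 0 for them.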