NSLog does only ASCII or UTF-8! Was: Unicode and GNUstep (more info)
From: Pascal Bourguignon
Subject: NSLog does only ASCII or UTF-8! Was: Unicode and GNUstep (more info)
Date: Sat, 11 May 2002 19:33:38 +0200 (CEST)
> >> cat delivers the same result. I then changed the encoding in the
> >> preferences of Terminal.app to MacOS Roman and retried. cat and pico
> >> did now deliver the expected result. So far so good.
> >
> > A good move. If you're going to work only with the terminal and from
> > Macintosh, that is.
>
> Ok, Terminal.app is set to "MacOS Roman". I have rewritten my program to
> just do
>
> NSLog(@"Höschler");
>
>
> > But if, like it seems, you have Macintosh encoded files and sources,
> > then you will need to use:
> >
> > export GNUSTEP_STRING_ENCODING=NSMacOSRomanStringEncoding
> > ./FBTest.app/FBTest
>
> Here is something I do not understand now.
>
> bash-2.03$ export GNUSTEP_STRING_ENCODING=NSMacOSRomanStringEncoding
> bash-2.03$ ./FBTest
> May 11 17:00:34 FBTest[14036] H¬öschler
Somewhat better, isn't it? It seems that NSLog only outputs UTF-8,
hence the 195 lead byte before the 246, which Terminal then renders as
a Macintosh ö.
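
To make the extra byte concrete, here is a minimal sketch of how UTF-8
encodes a small code point. The function name utf8_encode_small is
mine, not something from GNUstep, and it only covers code points up to
U+07FF, which is enough for the Latin-1 range discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* Encode one Unicode code point (up to U+07FF) as UTF-8.
 * Hypothetical helper for illustration, not part of GNUstep.
 * Returns the number of bytes written to out (1 or 2), or 0 when the
 * code point is outside the range this sketch handles. */
size_t utf8_encode_small(unsigned cp, unsigned char out[2])
{
    assert(out != NULL);
    if (cp < 0x80) {
        /* 7-bit ASCII passes through unchanged as a single byte. */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        /* Two bytes: 110xxxxx 10xxxxxx. */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    return 0;
}
```

For code point 246 (0xF6, ö) this writes the lead byte 195 (0xC3)
followed by the continuation byte 182 (0xB6), while 'H' (72) stays a
single byte. A terminal that is not decoding UTF-8 shows those two
bytes as two separate characters.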
> bash-2.03$ export GNUSTEP_STRING_ENCODING=
Use:
unset GNUSTEP_STRING_ENCODING
to remove an environment variable in bash. Assigning nothing, as above,
leaves it set to the empty string.
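
The difference is visible from C through getenv(). The sketch below is
only an illustration: env_state and the enum names are hypothetical,
and I am assuming (not having checked) that the empty value is what
produces the empty encoding name in the warning quoted below.

```c
#define _POSIX_C_SOURCE 200112L  /* setenv/unsetenv, for trying this out */
#include <stdlib.h>

enum env_state { ENV_UNSET, ENV_EMPTY, ENV_VALUE };

/* Classify an environment variable the way a program sees it:
 *   ENV_UNSET  after "unset NAME"      (getenv returns NULL)
 *   ENV_EMPTY  after "export NAME="    (getenv returns "")
 *   ENV_VALUE  after "export NAME=xyz" (getenv returns "xyz") */
enum env_state env_state(const char *name)
{
    const char *value = getenv(name);

    if (value == NULL)
        return ENV_UNSET;
    return value[0] == '\0' ? ENV_EMPTY : ENV_VALUE;
}
```

Calling setenv(name, "", 1) before env_state(name) gives ENV_EMPTY,
and calling unsetenv(name) first gives ENV_UNSET, which mirrors the
two bash commands.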
> bash-2.03$ ./FBTest
> WARNING: - encoding not supported.
> NSISOLatin1StringEncoding set as default.
> May 11 17:02:10 FBTest[14039] Höschler
>
> I expected to get "Höschler" while the encoding is set to
> NSMacOSRomanStringEncoding. However, I get H¬öschler instead. When I
> reset the encoding to its default I get Höschler. Weird!
Well, it seems that NSLog does not honor GNUSTEP_STRING_ENCODING.
Indeed, we find:
--------------------------------------------------------------------
static void
_NSLog_standard_printf_handler (NSString* message)
{
  NSData        *d;
  const char    *buf;
  unsigned      len;

  d = [message dataUsingEncoding: NSASCIIStringEncoding
               allowLossyConversion: NO];
  if (d == nil)
    {
      d = [message dataUsingEncoding: NSUTF8StringEncoding
                   allowLossyConversion: NO];
    }
--------------------------------------------------------------------
in NSLog.m.
Shouldn't it use _DefaultStringEncoding instead of NSUTF8StringEncoding
there?
Perhaps we should ask whether we need two different encodings: one for
input and for strings in code (GNUSTEP_STRING_ENCODING), and another
one for output. I guess that for GUI output it depends on the font, so
all is well. How is keyboard input handled?
I note that in gdl, converting between NSStrings and NSData is done
with cString and stringWithCString, that is, GNUSTEP_STRING_ENCODING
is honored there. What about file I/O?
For NSLog, I'd switch to _DefaultStringEncoding, since it's a small
change and would give the correct results.
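
As a rough model of that selection logic (plain C, not the GNUstep
code; pick_log_encoding and the enum are hypothetical names), the
handler tries ASCII first and only falls back when the message has a
character beyond 127. The proposal amounts to changing which fallback
is passed in.

```c
#include <stddef.h>

enum log_encoding { LOG_ASCII, LOG_UTF8, LOG_DEFAULT };

/* Return 1 if every code point fits in 7-bit ASCII, else 0. */
static int is_ascii(const unsigned *cps, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (cps[i] > 0x7F)
            return 0;
    return 1;
}

/* Model of the choice made in the quoted handler: use ASCII when the
 * conversion is lossless, otherwise use the given fallback.
 * The current code corresponds to fallback == LOG_UTF8; the suggested
 * change corresponds to fallback == LOG_DEFAULT. */
enum log_encoding pick_log_encoding(const unsigned *cps, size_t n,
                                    enum log_encoding fallback)
{
    return is_ascii(cps, n) ? LOG_ASCII : fallback;
}
```

With the code points of "Höschler" the result is whatever fallback is
given, while a pure ASCII message always stays LOG_ASCII, so ASCII-only
logs would be unaffected by the change.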
> I slowly get the idea of avoiding any characters beyond 127 in source
> code.
No need to; your output shows that GNUSTEP_STRING_ENCODING is
correctly taken into account for the literal strings. Have a look at
the following test program.
Attachment: hoeschler.tar.gz (GNU Zip compressed data)
With "make test" you can see that it works as expected: we get 0xe0,
0xe1, 0xe2, etc. when we put an ISO-8859-1 string and set
GNUSTEP_STRING_ENCODING to ISO-8859-1 too, or when we put a Macintosh
string and set GNUSTEP_STRING_ENCODING accordingly, etc.
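
To illustrate why the declared encoding has to match the bytes in the
source, here is a small sketch that decodes a single byte under either
encoding. It is not taken from the attached test program: decode_byte
is a hypothetical name, and the MacOS Roman table is deliberately
partial (four entries only).

```c
enum src_encoding { SRC_LATIN1, SRC_MACROMAN };

/* Decode one source byte to a Unicode code point.
 * ISO-8859-1 maps bytes 1:1 onto U+0000..U+00FF.
 * For MacOS Roman only a few bytes are covered in this sketch;
 * any other high byte returns 0 ("not in this partial table"). */
unsigned decode_byte(unsigned char b, enum src_encoding enc)
{
    if (b < 0x80)
        return b;               /* ASCII is common to both encodings */
    if (enc == SRC_LATIN1)
        return b;

    switch (b) {
    case 0x87: return 0xE1;     /* U+00E1, a with acute */
    case 0x88: return 0xE0;     /* U+00E0, a with grave */
    case 0x89: return 0xE2;     /* U+00E2, a with circumflex */
    case 0x9A: return 0xF6;     /* U+00F6, o with diaeresis */
    default:   return 0;
    }
}
```

A Macintosh ö is the byte 0x9A and an ISO-8859-1 ö is the byte 0xF6;
both decode to 0xF6 when the matching encoding is given. Decoding the
Macintosh byte as ISO-8859-1 instead yields 0x9A, which is a control
character rather than a letter.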
The only problem here is with NSLog outputting only ASCII or UTF-8.
--
__Pascal_Bourguignon__ (o_ Software patents are endangering
() ASCII ribbon against html- //\ the computer industry all around
/\ email & Microsoft attachments. V_/ the world http://lpf.ai.mit.edu/
1962:DO5K=1.3 2001:my($f)=`fortune`; http://petition.eurolinux.org/
You're a web designer? Please read http://www.useit.com/alertbox/ !
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/IT d? s++:++(+++)>++ a C+++ UB+++L++++$S+X++++>$ P- L+++ E++ W++
N++ o-- K- w------ O- M++$ V PS+E++ Y++ PGP++ t+ 5? X+ R !tv b++(+)
DI+++ D++ G++ e+++ h+(++) r? y---? UF++++
------END GEEK CODE BLOCK------