Re: `decode-coding-string' question

From: Paul Pogonyshev
Subject: Re: `decode-coding-string' question
Date: Thu, 6 Jul 2006 23:34:21 +0300
User-agent: KMail/1.7.2

Eli Zaretskii wrote:
> > From: Paul Pogonyshev <address@hidden>
> > Date: Thu, 6 Jul 2006 18:52:28 +0300
> > Cc: Kenichi Handa <address@hidden>
> > 
> > > > I do.  But I need to know where they begin in the buffer (containing
> > > > the encoded C string.)  I don't see a way to keep this information at
> > > > present... :(
> > > 
> > > How did you make that buffer?  Why don't you have an
> > > already-decoded text in that buffer?
> > 
> > Because it's a C source file.  Strings have to be encoded there.
> Paul, there's some misunderstanding here, so please bear with us.
> Handa-san cannot understand how come you have undecoded characters in
> the buffer, and neither can I.
> The fact that it's a C file does not matter: Emacs _always_ decodes
> characters when it visits the file, no matter if it's a C file or
> something else.  In the text you get in your buffer the characters
> should be decoded.  The question is, how come it didn't decode these
> characters in your case?  Are there other non-ASCII characters in the
> same file, perhaps? if so, what characters are those?  For that
> matter, can you post a small sample file that, when visited in Emacs,
> leaves the UTF-8 encoded characters undecoded in the buffer?  Please
> post that file as a binary attachment, to avoid munging it by email
> software en- and de-coding.

There is indeed a misunderstanding.  The characters in the buffer _are_
decoded.  However, the characters form a C escape sequence, like "\xc2\xa9".
To find out which character this C sequence encodes, I first translate
the strings "\xc2" and "\xa9" to the corresponding (undecoded!) characters.
The resulting string of length 2 is UTF-8 encoded text, and I decode it
to get the copyright sign or whatever.
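
For illustration, the two steps described above can be sketched outside
Emacs.  This is only a minimal sketch in Python, not the Emacs Lisp code
in question: the helper name `decode_c_hex_escapes' is hypothetical, and
it assumes that every escape has exactly two hex digits and that anything
other than \xNN escapes can be ignored.  In Emacs the final step
corresponds to calling `decode-coding-string' with the `utf-8' coding
system.

```python
import re

# Matches one C hex escape with exactly two hex digits, e.g. \xc2.
# Assumption: real C allows longer hex escapes; this sketch does not.
HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def decode_c_hex_escapes(literal_body: str) -> str:
    """Turn the \\xNN escapes of a C string body into the text they encode.

    Hypothetical helper: characters outside \\xNN escapes are ignored.
    """
    # Step 1: translate each escape into the raw (undecoded) byte it names.
    raw = bytes(int(digits, 16) for digits in HEX_ESCAPE.findall(literal_body))
    # Step 2: treat the byte sequence as UTF-8 and decode it.
    return raw.decode("utf-8")


if __name__ == "__main__":
    # The raw string holds the eight characters \ x c 2 \ x a 9,
    # exactly as they would appear in the C source buffer.
    print(decode_c_hex_escapes(r"\xc2\xa9"))
```

The byte pair 0xC2 0xA9 is the UTF-8 encoding of U+00A9, the copyright
sign, so the two escapes together stand for a single character.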

Phew.  Hope it is clearer now.  Anyway, it is not so important for me
anymore, since gettext doesn't support non-ASCII characters in
untranslated strings with fairly recent GNU libc.  (And yes, I tried
inserting non-ASCII characters in the untranslated strings.)

