Re: Emacs puts binary junk into the clipboard, marking it as text

From: Kenichi Handa
Subject: Re: Emacs puts binary junk into the clipboard, marking it as text
Date: Tue, 19 Sep 2006 16:14:01 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/22.0.50 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)

In article <address@hidden>, Jan Djärv <address@hidden> writes:

> > AFAIK, only when TEXT is requested can a selection owner
> > choose the returned type from STRING, COMPOUND_TEXT, or
> > UTF8_STRING.  When UTF8_STRING is requested, we should
> > return it or return nothing.
> > 
> > And, if Emacs owns a unibyte string, perhaps the right thing
> > is to make it multibyte according to the current
> > lang. env. (by string-make-multibyte) at first, then encode
> > it by utf-8.

> What would that do to illegal UTF-8 sequences in the original unibyte string? 

The original unibyte string won't be in UTF-8 format.  But,
string-make-multibyte will convert it to a correct multibyte
string, thus encoding that multibyte string by UTF-8 will
produce a correct UTF-8 string ... usually.

>   I.e. will this procedure always produce valid UTF-8 data?

No.  If a byte in the original unibyte string is not a valid
code point of the primary charset of the current lang. env.,
string-make-multibyte will produce a multibyte string that
contains eight-bit-control or eight-bit-graphic characters.
Encoding that string by UTF-8 will then result in an incorrect
UTF-8 sequence.  So, for safety, we must delete such eight-bit
characters or replace them with U+FFFD (REPLACEMENT
CHARACTER) before encoding by UTF-8.

Or, in such a case, don't return anything (which means Emacs
doesn't hold the requested data).
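
To make the two options concrete, here is a rough sketch in Python
rather than Emacs Lisp. It is only an analogy under stated
assumptions, not what Emacs does internally: the function name
to_utf8_selection, the strict flag, and the choice of ISO 8859-7 as
the "primary charset" are all hypothetical, and Python's decode
error handling stands in for the eight-bit characters described
above. Byte 0xFF is unassigned in ISO 8859-7, so it plays the role
of the invalid byte.

```python
from typing import Optional


def to_utf8_selection(raw: bytes,
                      charset: str = "iso8859_7",
                      strict: bool = False) -> Optional[bytes]:
    """Convert a raw byte string to UTF-8 for a UTF8_STRING request.

    charset stands in for the primary charset of the language
    environment (an assumption made for this sketch).
    """
    try:
        # Analogous to making the unibyte string multibyte.
        text = raw.decode(charset, errors="strict")
    except UnicodeDecodeError:
        if strict:
            # Second option: behave as if the data is not held.
            return None
        # First option: undecodable bytes become U+FFFD.
        text = raw.decode(charset, errors="replace")
    # Every character is now a real Unicode character, so the
    # result is always a valid UTF-8 sequence.
    return text.encode("utf-8")


if __name__ == "__main__":
    bad = b"abc\xff"  # 0xFF has no assignment in ISO 8859-7
    print(to_utf8_selection(bad))
    print(to_utf8_selection(bad, strict=True))
```

With strict left off, the requestor always receives valid UTF-8 but
loses the bytes that could not be interpreted; with strict on, the
request fails instead of returning lossy data.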

Kenichi Handa
