Re: Unicode support for the MS Windows clipboard

From: Benjamin Riefenstahl
Subject: Re: Unicode support for the MS Windows clipboard
Date: Thu, 03 Jun 2004 11:17:29 +0200
User-agent: Gnus/5.1001 (Gnus v5.10.1) Emacs/21.3.50 (gnu/linux)

Hi all,

Benjamin Riefenstahl <address@hidden> writes:
> - If selection-coding-system has the form /(.*-)?utf-16.*/, I assume
>   CF_UNICODETEXT is wanted.
> - If selection-coding-system has the form /cp[0-9]+.*/ or
>   /windows-[0-9]+.*/, I derive the codepage from that.
>     - Check if the codepage is identical to GetACP() or GetOEMCP().
>       If it is, use CF_TEXT or CF_OEMTEXT accordingly. 
>     - Else get a corresponding LCID (reverse mapping via
>       EnumLocales()) which has the codepage as OEM or "ANSI".  In this
>       case we also need to set LC_LOCALE accordingly.

I implemented this in the attached patch.  I have tested it on W2K, 95
and 98SE.  Please tell of any problem the code has or of other
improvements I can make.

I have a couple of notes:

The C code constructs its own cpXXXX-dos coding system names in one
place.  I would like to check that the constructed coding system
actually exists, but I don't know how yet.
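
To show what "constructing the name" amounts to, here is a minimal
sketch.  The helper name make_cp_dos_name is hypothetical and this is
not the code from the patch.  It only formats the string; it does not
check that a coding system of that name exists, which is exactly the
open question above.

```c
#include <stdio.h>

/* Hypothetical helper: write "cp<CODEPAGE>-dos" into BUF, which has
   room for SIZE bytes.  Return 1 on success, 0 if the name did not
   fit or formatting failed.  */
int
make_cp_dos_name (char *buf, size_t size, unsigned codepage)
{
  int n = snprintf (buf, size, "cp%u-dos", codepage);
  return n > 0 && (size_t) n < size;
}
```

For code page 1252 and a 16-byte buffer this produces "cp1252-dos".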

I didn't worry about the console encoding (CF_OEMTEXT) too much, but I
added support for it where it seemed straightforward to me.

I get the feeling that the C code does too much, and that it should
delegate major parts of the processing to Lisp.  I am not yet sure how
to structure that, though.

Somebody suggested automatically mapping all Unicode coding systems to
the right one, utf-16le-dos.  If we actually want that, I'd suggest
doing it in Lisp, in set-selection-coding-system, and not in C.  I
also wouldn't want to automatically map utf-8-dos, because it seems
clear and well-known to me that an 8-bit Unicode variant is not what
Windows thinks of as "Unicode", so when I say utf-8-dos, I expect that
to be respected here.  But that's just IMO.

I didn't do anything about the defaults yet.  That's because it's done
in Lisp until now, and I haven't studied that code yet.  To reiterate
the last discussion, the idea would be to set selection-coding-system
to utf-16le-dos on NT/W2K/XP and to cpXXXX on 9x/Me.

About the ASCII-only optimizations in w32select.c, these are now
enabled, except for the Unicode case in w32-get-clipboard-data, where
I would have to duplicate some 8-bit code as 16-bit code.  I didn't
feel like doing that just now.
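
To illustrate the duplication I mean: assuming the optimization boils
down to scanning the text for non-ASCII data before deciding whether
a coding-system conversion is needed, the 8-bit check and its 16-bit
twin would look roughly like this.  The function names are
hypothetical and this is a sketch, not the code in w32select.c.

```c
#include <stddef.h>

/* Return 1 if all N bytes at S are 7-bit ASCII, else 0.  */
int
ascii_only_8 (const unsigned char *s, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (s[i] >= 0x80)
      return 0;
  return 1;
}

/* The same check for N 16-bit units, as needed for CF_UNICODETEXT
   data.  Only the element type differs.  */
int
ascii_only_16 (const unsigned short *s, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (s[i] >= 0x80)
      return 0;
  return 1;
}
```

The two loops are identical except for the element type, which is why
the Unicode case means copying 8-bit code as 16-bit code.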

Even with a small enhancement, the optimization still doesn't do very
much for setting the clipboard.  It *does* make a noticeable difference
with *getting* data from the clipboard.  Ideally somebody could
profile w32-set-clipboard-data to see if the time is spent in Emacs or
in Windows' clipboard API.

Unicode on 9x: The only applications that I have on 98 that support
CF_UNICODETEXT are Wordpad and IE 6.  They produce that format
themselves, and they also use it if somebody has posted it.  I
assume Word and other Office apps would also have it.  Both Wordpad
and IE need CF_TEXT to be present (or promised via delayed rendering)
to even enable pasting, so Unicode pasting has to wait until we
implement delayed rendering in Emacs.

We'll probably need documentation updates somewhere.

The patch also fixes two mostly benign bugs.

I assume that for this to go in, I need to sign papers, and I
understand that somebody needs to contact me about that from your

While I am still familiar with the internals of this module, I'd like
to tackle delayed rendering.  I am thinking of constructing a hidden
window for hanging the message handlers on (we have a regular message
loop running even with "emacs -nw", right?).  I know how to do that,
so it should not be too much of an effort.


2004-06-02  Benjamin Riefenstahl  <address@hidden>

        * w32select.c (cached_coding_system, codepage, lcid)
        (clipboard_type, DEFAULT_LCID, ANSICP, OEMCP): New static
        (cp_from_locale, enum_locale_callback, setup_parameters): New
        (Fw32_set_clipboard_data): Handle CF_UNICODETEXT and CF_LOCALE.
        Correct end-of-string handling on clipboard.
        (Fw32_get_clipboard_data): Handle CF_UNICODETEXT and CF_LOCALE.
        Correct end-of-string for last_clipboard_text.
        (Fx_selection_exists_p): Handle CF_UNICODETEXT.
        (syms_of_w32select): Init and register cached_coding_system.

Attachment: w32select.c.patch
Description: Text Data
