Re: Unicode support for the MS Windows clipboard

From: Benjamin Riefenstahl
Subject: Re: Unicode support for the MS Windows clipboard
Date: Mon, 26 Jul 2004 21:17:13 +0200
User-agent: Gnus/5.1001 (Gnus v5.10.1) Emacs/21.3.50 (gnu/linux)

Hi all,

Benjamin Riefenstahl <address@hidden> writes:
> While I am still familiar with the internal of this module, I'd like
> to tackle delayed rendering.

Done that, see attached expanded patch. 

I consider this done from a functional POV, but of course feel free to
suggest changes.


Wrapping the Lisp evaluation and en/de-coding in appropriate error
handlers for asynchronous execution is a bit of a pain.  Is there
documentation on how this *should* work, or was this kind of scenario
just never considered before?

It seems that the GC macros are no-ops on Windows?  If so these should
be removed, I guess.  If not, other variables may need the same
treatment.

w32.c has a note that atexit() is broken.  atexit() works here with
Mingw.  The note in w32.c is probably about a static build, which
Mingw doesn't support.  Anyway, the alternative implementation using
signal(SIGABRT) as in w32.c doesn't work here.

About my earlier notes and discussion:

> I am thinking of constructing a hidden window for hanging the
> message handlers on [...]

That's what I have done here.  The FRAME parameter for the Lisp
functions is ignored now.  This parameter is never passed in the core
Lisp code anyway, I believe.  If the change in interface is
acceptable, the parameter could be removed altogether.
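
For readers not familiar with delayed rendering: the clipboard owner
only announces which formats it can provide, and produces the data
when a consumer actually asks for a format.  On Windows that request
arrives as a message (WM_RENDERFORMAT) at the owner window, which is
why a window is needed at all.  The following is a portable simulation
of the idea only, not code from the patch; it makes no Win32 calls,
and every name in it (mock_clipboard, mock_promise, mock_get,
render_copy) is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for clipboard formats; not the Win32 CF_* values.  */
enum mock_format { MOCK_TEXT, MOCK_UNICODETEXT, MOCK_NUM_FORMATS };

/* Owner callback: return freshly allocated data for FMT, or NULL.  */
typedef char *(*mock_render_fn) (enum mock_format fmt, void *owner_data);

struct mock_clipboard
{
  int promised[MOCK_NUM_FORMATS];   /* format announced by the owner?  */
  char *data[MOCK_NUM_FORMATS];     /* NULL until actually rendered */
  mock_render_fn render;            /* plays the role of the owner window */
  void *owner_data;
  int render_calls;                 /* how often the owner had to render */
};

/* Announce FMT without providing any data yet.  */
void
mock_promise (struct mock_clipboard *cb, enum mock_format fmt)
{
  assert (cb != NULL);
  cb->promised[fmt] = 1;
}

/* Consumer side: ask for FMT.  The owner renders on first use only.  */
const char *
mock_get (struct mock_clipboard *cb, enum mock_format fmt)
{
  assert (cb != NULL);
  if (!cb->promised[fmt])
    return NULL;
  if (cb->data[fmt] == NULL && cb->render != NULL)
    {
      cb->data[fmt] = cb->render (fmt, cb->owner_data);
      cb->render_calls++;
    }
  return cb->data[fmt];
}

/* Example owner callback: hand out a heap copy of the owner's text.
   A real owner would encode the text differently per format.  */
char *
render_copy (enum mock_format fmt, void *owner_data)
{
  const char *text = owner_data;
  char *copy;

  (void) fmt;
  copy = malloc (strlen (text) + 1);
  if (copy != NULL)
    strcpy (copy, text);
  return copy;
}
```

With render_copy installed as the callback, promising a format costs
nothing up front; the copy is made only on the first mock_get for that
format, and later requests reuse the rendered data.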

> The C code constructs it's own cpXXXX-dos coding system names in one
> place.  I would like to check that the constructed coding system
> actually exists, but I don't know how yet.

That check is actually made by the code that uses the coding system.
I had missed that before.
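
To illustrate the kind of name construction meant here, a minimal
sketch, assuming the name is simply "cp" plus the decimal codepage
number plus "-dos".  The helper name make_cp_dos_name is hypothetical
and not taken from the patch; the sketch only builds the string and
leaves the existence check to whatever code later uses the coding
system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "cp<CODEPAGE>-dos" into BUF of SIZE bytes.
   Return 0 on success, -1 if the name does not fit.  */
int
make_cp_dos_name (unsigned codepage, char *buf, size_t size)
{
  int n;

  assert (buf != NULL);
  n = snprintf (buf, size, "cp%u-dos", codepage);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

For codepage 1252 and a large enough buffer this yields the string
"cp1252-dos".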

> I get the feeling that the C code does too much, and that it should
> delegate major parts of the processing to Lisp.  I am not yet sure
> how to structure that, though.

After thinking about it, making more interfaces between Lisp and C
doesn't seem too useful.  Windows is pretty firm about what is what
here, so there is not much to customize.  Via the
selection-coding-system variable, the algorithm is already more
tinkerable from the Lisp side than strictly necessary.

> Somebody suggested to automatically map all Unicode coding-systems
> to the right one, utf-16le-dos.

The uncertainty about the right Unicode coding-system to use seems to
me to be a general problem with the naming of the Unicode
coding-systems, not with w32select.c.  The names of the coding-systems
could be clearer.  At the moment (in CVS) we have "utf-16le" which
acts like "utf-16-le-without-signature" and "utf-16-le" which is
the same as "utf-16-le-with-signature".

OTOH maybe *appropriate* aliases could be added to make some
selections more obvious.  E.g. on Windows we might want a simple alias
"unicode" for what is now "utf-16le", because that is what Windows
users and programmers expect when they hear "Unicode" encoding.
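
To make the with/without-signature distinction concrete, here is a
small sketch of a UTF-16LE encoder with an optional signature (the
byte order mark U+FEFF, which comes out as the bytes FF FE in
little-endian order).  It is an illustration only and not code from
Emacs; the function name encode_utf16le and its interface are made up
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode N code points from CPS as UTF-16LE into OUT (CAP bytes).
   If WITH_SIGNATURE is nonzero, emit the signature bytes FF FE first.
   Return the number of bytes written, or -1 if OUT is too small or a
   code point is not a valid Unicode scalar value.  */
long
encode_utf16le (const uint32_t *cps, size_t n, int with_signature,
                unsigned char *out, size_t cap)
{
  size_t len = 0;
  size_t i, j;

  assert (cps != NULL && out != NULL);
  if (with_signature)
    {
      if (cap < 2)
        return -1;
      out[len++] = 0xFF;
      out[len++] = 0xFE;
    }
  for (i = 0; i < n; i++)
    {
      uint32_t c = cps[i];
      uint16_t units[2];
      size_t nunits;

      if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return -1;
      if (c < 0x10000)
        {
          units[0] = (uint16_t) c;
          nunits = 1;
        }
      else
        {
          /* Outside the BMP: split into a surrogate pair.  */
          c -= 0x10000;
          units[0] = (uint16_t) (0xD800 | (c >> 10));
          units[1] = (uint16_t) (0xDC00 | (c & 0x3FF));
          nunits = 2;
        }
      if (cap - len < 2 * nunits)
        return -1;
      for (j = 0; j < nunits; j++)
        {
          /* Little-endian: low byte first.  */
          out[len++] = (unsigned char) (units[j] & 0xFF);
          out[len++] = (unsigned char) (units[j] >> 8);
        }
    }
  return (long) len;
}
```

Encoding "Hi" without the signature gives the four bytes 48 00 69 00;
with the signature the same text becomes FF FE 48 00 69 00.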

> I didn't do anything about the defaults yet.

I put the dynamic initializations into the C code now and dropped the
line of Lisp code that sets this in mule-cmds.el.  This was mostly
because I couldn't find Lisp variables on which to base the decisions.
It also keeps things together this way.


(Note: I can of course shorten the ChangeLog, if necessary.)

2004-07-26  Benjamin Riefenstahl  <address@hidden>

        * w32select.c: Drop last_clipboard_text and related code, keep
        track of ownership via clipboard_owner instead.  Drop old #if0
        sections.

        (clipboard_owner, modifying_clipboard, cfg_coding_system)
        (cfg_codepage, cfg_lcid, cfg_clipboard_type, current_text)
        (current_coding_system, current_requires_encoding)
        (current_num_nls, current_clipboard_type, current_lcid): New
        static variables.

        (convert_to_handle_as_ascii, convert_to_handle_as_coded)
        (render, render_all, run_protected, lisp_error_handler)
        (owner_callback, create_owner, atexit_callback, setup_config)
        (enum_locale_callback, cp_from_locale, coding_from_cp)
        (globals_of_w32select): New local functions.

        (Fw32_set_clipboard_data): Ignore parameter FRAME, use
        clipboard_owner instead.  Use delayed rendering and provide
        all text formats.  Provide CF_LOCALE if necessary.

        (Fw32_get_clipboard_data): Handle CF_UNICODETEXT and
        CF_LOCALE.  Fall back to CF_TEXT, if CF_UNICODETEXT is not
        available.  Force DOS line-ends for decoding. 

        (Fx_selection_exists_p): Handle CF_UNICODETEXT.

        (syms_of_w32select): Init and register new variables.

        * w32.h: Add prototype for globals_of_w32select.  Make the
        neighboring K&R declarations into prototypes, too.

        * emacs.c: Include w32.h to get function prototypes.
        (main): Call globals_of_w32select.

        * mule-cmds.el (set-locale-environment): Remove call to
        set-selection-coding-system on Windows.

