Re: String handling in xwidget primitives

From: Eli Zaretskii
Subject: Re: String handling in xwidget primitives
Date: Fri, 29 Jan 2016 22:25:15 +0200

> From: address@hidden
> Date: Fri, 29 Jan 2016 20:25:21 +0100
> I briefly tested this:
> (xwidget-webkit-execute-script (xwidget-at 0) "alert('𝌆')")
> where 𝌆 is some kind of Unicode char I stole from
> https://mathiasbynens.be/notes/javascript-encoding
> This page seems to indicate UTF-16 is used.

I've seen such claims.  But they cannot be true, since if they were,
we couldn't have passed pure ASCII strings to those interfaces without
triggering weird errors: each ASCII character takes 2 bytes in UTF-16,
not one.
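
To make the byte counts concrete, here is a minimal sketch in plain C,
not taken from Emacs or WebKit. The helper names utf8_encode and
utf16_encode are hypothetical. It encodes a single code point both ways
so the sizes can be compared: an ASCII character is one byte in UTF-8
but one 16-bit unit (two bytes) in UTF-16, while U+1D306, the character
from the test above, is four bytes in UTF-8 and a surrogate pair in
UTF-16.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode code point CP as UTF-8 into OUT (room for 4 bytes).
   Return the number of bytes written, or 0 if CP is a surrogate
   or lies beyond U+10FFFF.  Hypothetical helper, for illustration.  */
static size_t
utf8_encode (uint32_t cp, unsigned char *out)
{
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;
  if (cp < 0x10000)
    {
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  if (cp <= 0x10FFFF)
    {
      out[0] = (unsigned char) (0xF0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[3] = (unsigned char) (0x80 | (cp & 0x3F));
      return 4;
    }
  return 0;
}

/* Encode code point CP as UTF-16 into OUT (room for 2 units).
   Return the number of 16-bit units written, or 0 if CP is invalid.
   Hypothetical helper, for illustration.  */
static size_t
utf16_encode (uint32_t cp, uint16_t *out)
{
  if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
    return 0;
  if (cp < 0x10000)
    {
      out[0] = (uint16_t) cp;
      return 1;
    }
  cp -= 0x10000;                              /* 20 bits remain */
  out[0] = (uint16_t) (0xD800 | (cp >> 10));  /* high surrogate */
  out[1] = (uint16_t) (0xDC00 | (cp & 0x3FF)); /* low surrogate */
  return 2;
}
```

An interface that took UTF-16 would therefore see a zero byte after
every ASCII character of the script, which is not what a plain C
string passed to it contains.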

I think UTF-16 is used internally to represent strings, but the script
itself should not be in UTF-16.  I think it should be either in UTF-8
(and then requires a BOM), or it should include the charset= metadata
to indicate its encoding.

> I executed the code in a buffer containing a webkit instance, and the
> char showed up in an alert box originating from the webkit instance.
> This doesn't actually prove anything, but it does seem to show that in my
> case on my machine and environment, at least something goes right.

Sheer luck: you just didn't bump into any of the subtleties which make
the internal representation of strings in Emacs a superset of
UTF-8, but not exactly UTF-8.
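
A rough way to see what "superset" means for a consumer that expects
strict UTF-8 is to run a byte sequence through a strict validator. The
sketch below is a plain RFC 3629 check written for illustration only;
it is not code from the Emacs sources, and strict_utf8_p is a
hypothetical name. It assumes, and this should be checked against the
Emacs sources, that the internal representation can hold a raw byte
such as 0xFF as the two-byte sequence C1 BF, and characters beyond
U+10FFFF as longer sequences. A strict decoder rejects both forms,
while ordinary text such as the script in the test above passes
unchanged.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if BUF[0..LEN) is well-formed UTF-8 per RFC 3629:
   no overlong forms, no surrogates, nothing above U+10FFFF.
   Hypothetical helper, for illustration only.  */
static bool
strict_utf8_p (const unsigned char *buf, size_t len)
{
  size_t i = 0;
  while (i < len)
    {
      unsigned char b = buf[i];
      size_t need;              /* number of trailing bytes */
      unsigned long cp, min;    /* decoded value, smallest legal value */

      if (b < 0x80)
        {
          i++;
          continue;
        }
      else if (b >= 0xC2 && b <= 0xDF)
        need = 1, cp = b & 0x1F, min = 0x80;
      else if (b >= 0xE0 && b <= 0xEF)
        need = 2, cp = b & 0x0F, min = 0x800;
      else if (b >= 0xF0 && b <= 0xF4)
        need = 3, cp = b & 0x07, min = 0x10000;
      else
        return false;   /* 0x80..0xC1 and 0xF5..0xFF cannot lead */

      if (len - i <= need)
        return false;   /* sequence is truncated */

      for (size_t k = 1; k <= need; k++)
        {
          unsigned char t = buf[i + k];
          if ((t & 0xC0) != 0x80)
            return false;       /* not a continuation byte */
          cp = (cp << 6) | (t & 0x3F);
        }

      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;   /* overlong, out of range, or surrogate */

      i += need + 1;
    }
  return true;
}
```

Passing the bytes of alert('𝌆') returns true; passing C1 BF or any
five-byte sequence returns false, which is the kind of input that an
explicit encoding step is meant to keep away from the receiving side.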

> If we do need to encode, do you know some part of the Emacs source
> where I can see which functions to use?

It depends how we need to encode.  In general,
code_convert_string_norecord is the most frequently used function in
these cases.
