From: | Tiphaine Turpin |
Subject: | bug#10919: emacs-mule/utf-8 difference |
Date: | Thu, 01 Mar 2012 16:39:57 +0100 |
User-agent: | Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111109 Thunderbird/3.1.16 |
Hi, I have a problem regarding coding systems: I'm using process-send-string to send substrings of a buffer through a socket, after setting the process encoding and decoding systems to emacs-mule. I expect the number of bytes written to match the byte length of the substring as computed from position-bytes, since, according to a message on emacs-devel, position-bytes always works with the emacs-mule encoding. From emacs-devel:
"The byte sequence of a buffer after decoded is always in emacs-mule (in emacs-unicode-2 branch, it's utf-8). So, changing buffer-file-coding-system or any other coding-system-related variables doesn't affects position-bytes."
However, this is not the case with 3-byte UTF-8 characters: position-bytes counts them as 3 bytes, but process-send-string writes 4 bytes.
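The mismatch can be checked without a socket. The following is only a minimal sketch: `my-region-byte-counts` is a hypothetical helper (not part of Emacs), and it assumes that encoding the text with `encode-coding-string` yields the same bytes that process-send-string would write for the same coding system.

```elisp
;; Hypothetical helper (not part of Emacs): compare the internal byte
;; length of a region with the length of the same text once encoded.
(defun my-region-byte-counts (beg end coding-system)
  "Return (INTERNAL . ENCODED) byte counts for the region BEG..END.
INTERNAL is the difference of `position-bytes' at END and BEG.
ENCODED is the byte length of the region text after encoding it
with CODING-SYSTEM."
  (let ((text (buffer-substring-no-properties beg end)))
    (cons (- (position-bytes end) (position-bytes beg))
          ;; `encode-coding-string' returns a unibyte string, so
          ;; `string-bytes' is the number of bytes that would be sent.
          (string-bytes (encode-coding-string text coding-system)))))

;; Usage: a temporary buffer holding one character that takes
;; 3 bytes in UTF-8 (U+20AC EURO SIGN).
(with-temp-buffer
  (insert #x20AC)
  (list (my-region-byte-counts (point-min) (point-max) 'utf-8)
        (my-region-byte-counts (point-min) (point-max) 'emacs-mule)))
```

If the two numbers in a pair differ, the byte length derived from position-bytes cannot be used as the length of what is sent with that coding system; taking the length of the encoded string itself avoids relying on the buffer's internal representation.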
Setting the process coding systems for the socket to utf-8 solves the problem, but I don't think the same approach would work with other coding systems, even if I used buffer-file-coding-system instead, since position-bytes does not use it.
What is the expected behavior of these functions, and how can I make this correct?
Regards, Tiphaine Turpin