getpwent, user-full-name and utf-8

From: David Kastrup
Subject: getpwent, user-full-name and utf-8
Date: Wed, 21 Mar 2007 10:58:08 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)


user-full-name is set from the result of getpwent without decoding the
resulting byte string at all.

The manual page of getpwent does not mention any encoding of
/etc/passwd, and neither does the manual page for /etc/passwd itself.

It is a safe bet, however, that /etc/passwd is not encoded in

Since different users may use different language environments while
/etc/passwd is a single system-wide file, I propose that we decode the
results from getpwent as utf-8.
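
To make the idea concrete, here is a minimal sketch in Python rather
than in Emacs's own C code. The function name and the choice to replace
undecodable bytes with U+FFFD are my assumptions for illustration only;
the sketch also assumes the usual convention that the full name is the
first comma-separated part of the fifth passwd field.

```python
def full_name_from_passwd_line(line: bytes) -> str:
    """Return the full-name part of a passwd(5) entry, decoded as UTF-8.

    Hypothetical helper for illustration; not how Emacs does it.
    """
    fields = line.rstrip(b"\n").split(b":")
    if len(fields) < 7:
        raise ValueError("not a passwd(5) entry")
    # Field 5 (index 4) is the GECOS/comment field; by convention the
    # full name is its first comma-separated subfield.
    raw_name = fields[4].split(b",")[0]
    # Assumption: a fixed UTF-8 encoding, independent of the user's
    # locale. Bytes that are not valid UTF-8 become U+FFFD here.
    return raw_name.decode("utf-8", errors="replace")


if __name__ == "__main__":
    entry = "jd:x:1000:1000:Jos\u00e9 D\u00edaz,,,:/home/jd:/bin/sh\n"
    print(full_name_from_passwd_line(entry.encode("utf-8")))
```

The point of the sketch is only that the decoding step does not consult
the current language environment at all.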

There will likely be similar problems with other system functions
(name server lookup?).  emacs-mule certainly is not the right answer
to the encoding problem.  And the problem will persist with
emacs-unicode2 as well since there is a difference between illegal
byte sequences and decoded illegal byte sequences.
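
One way to read that distinction, sketched in Python as an analogy only
(this is not Emacs's internal representation, and the helper names are
hypothetical): once an invalid byte has been decoded lossily, the
original byte can no longer be recovered, whereas a decoder that keeps
a marker for the raw byte can still round-trip it.

```python
def decode_lossy(raw: bytes) -> str:
    # Invalid bytes are replaced by U+FFFD; the original byte is lost.
    return raw.decode("utf-8", errors="replace")


def decode_preserving(raw: bytes) -> str:
    # Invalid bytes are mapped to lone surrogates that remember the byte.
    return raw.decode("utf-8", errors="surrogateescape")


def encode_preserving(text: str) -> bytes:
    # Inverse of decode_preserving: surrogates turn back into raw bytes.
    return text.encode("utf-8", errors="surrogateescape")


if __name__ == "__main__":
    raw = b"Jos\xe9"  # Latin-1 bytes, not valid UTF-8
    assert decode_lossy(raw).encode("utf-8") != raw
    assert encode_preserving(decode_preserving(raw)) == raw
```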

I propose that we bite the bullet, assume a fixed external system
encoding of utf-8 for such strings, and recode accordingly.

David Kastrup

