Re: uuencode: multi-bytes char in remote file name contains bytes >0x80
From: Eric Blake
Subject: Re: uuencode: multi-bytes char in remote file name contains bytes >0x80
Date: Fri, 08 Jul 2011 17:25:11 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.10
On 07/08/2011 05:11 PM, Bruce Korb wrote:
>
> Hi Eric(s),
>
> This mojibake stuff is mumbo jumbo to me.
Mojibake is what happens when you interpret bytes encoded in one character
set as though they were encoded in another, and then convert them according
to that wrong assumption. A common symptom: when you view UTF-8 text as
single-byte Latin-1, each multibyte UTF-8 character appears as several
unrelated 8-bit Latin-1 characters.
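To make the symptom concrete, here is a minimal sketch that deliberately mislabels UTF-8 bytes as Latin-1 and converts them with iconv(). The helper name mislabel_convert() is hypothetical, and the charset names "ISO-8859-1" and "UTF-8" are assumed to be accepted by the local iconv implementation (they are on glibc).

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string `src` from charset
   `from` to charset `to`, writing a NUL-terminated result into `dst`.
   Returns the number of bytes written (excluding the NUL), or (size_t)-1
   on failure. */
static size_t mislabel_convert(const char *from, const char *to,
                               const char *src, char *dst, size_t dstsize)
{
    assert(from != NULL && to != NULL && src != NULL);
    assert(dst != NULL && dstsize > 0);

    iconv_t cd = iconv_open(to, from);   /* note the order: to, then from */
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *in = (char *)src;              /* iconv() does not modify the input */
    size_t inleft = strlen(src);
    char *out = dst;
    size_t outleft = dstsize - 1;        /* keep room for the trailing NUL */

    size_t rc = iconv(cd, &in, &inleft, &out, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;

    *out = '\0';
    return (size_t)(out - dst);
}
```

Feeding it the two UTF-8 bytes of "é" (0xC3 0xA9) with `from` wrongly set to "ISO-8859-1" treats each byte as its own Latin-1 character, so the UTF-8 output is four bytes (0xC3 0x83 0xC2 0xA9), which a UTF-8 terminal displays as "Ã©".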
>
> I looked into the iconv(3p) function a bit and it seems to be dependent
> upon some character strings that are different from what one might
> put in LANG or LC_ALL or LC_NAME environment variables. Those guys
> take things like EN_us, for example, not character set specifications.
> So how am I to know what the current character set is if all I know is
> CN_hk, for example?
I suggest using the gnulib module localcharset which provides the
function locale_charset(). That should give an answer which is safe to
pass to iconv() as one of the two charsets, with "utf-8" being the other
charset.
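A rough sketch of that approach follows. To keep it buildable without gnulib, it substitutes POSIX nl_langinfo(CODESET) for locale_charset(); that substitution is my assumption, not part of the suggestion above, and in a gnulib-based project you would call locale_charset() instead. The helper names current_charset() and to_utf8() are hypothetical.

```c
#include <assert.h>
#include <iconv.h>
#include <langinfo.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for gnulib's locale_charset(). It reports the
   codeset of the locale currently in effect, so the program should call
   setlocale(LC_ALL, "") at startup to pick up LANG / LC_ALL. */
static const char *current_charset(void)
{
    return nl_langinfo(CODESET);
}

/* Hypothetical helper: convert the NUL-terminated string `src` from
   `fromcode` to UTF-8 into `dst`. Returns 0 on success, -1 on failure
   (unknown charset, invalid input, or output buffer too small). */
static int to_utf8(const char *fromcode, const char *src,
                   char *dst, size_t dstsize)
{
    assert(fromcode != NULL && src != NULL);
    assert(dst != NULL && dstsize > 0);

    iconv_t cd = iconv_open("UTF-8", fromcode);
    if (cd == (iconv_t)-1)
        return -1;

    char *in = (char *)src;              /* iconv() does not modify the input */
    size_t inleft = strlen(src);
    char *out = dst;
    size_t outleft = dstsize - 1;        /* keep room for the trailing NUL */

    size_t rc = iconv(cd, &in, &inleft, &out, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &out, &outleft);  /* flush any shift state */
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    *out = '\0';
    return 0;
}
```

A caller would typically run setlocale(LC_ALL, "") from <locale.h> once, then use to_utf8(current_charset(), name, buf, sizeof buf) on each file name. The locale name itself (the LANG value) is never passed to iconv; only the charset name derived from it is.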
--
Eric Blake address@hidden +1-801-349-2682
Libvirt virtualization library http://libvirt.org