bug#15597: bug-parted Digest, Vol 131, Issue 9


From: Rod Smith
Subject: bug#15597: bug-parted Digest, Vol 131, Issue 9
Date: Sat, 12 Oct 2013 12:33:12 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130925 Thunderbird/17.0.9

On 10/12/2013 12:01 PM, Phillip Susi <address@hidden> wrote:

> The GPT partition table has 16-bit characters for the name, which I
> assume are supposed to be UTF-16, but the bloody UEFI standard is mute
> on the subject.

The standard says they're "strings," and the default for strings in UEFI is UTF-16LE/UCS-2.

> Currently parted simply decimates the characters,
> throwing out the upper 8 bits.  This corrupts characters that aren't
> simple ASCII, and at some later point, strlist.c calls mbsrtowcs(),
> which chokes on the corrupt name, causing parted to bail out with
> "Error during translation".

> I think that gpt.c needs to translate the UTF-16 to the native
> multibyte encoding, but I have no idea how to do that.  The C standard
> conversion functions all seem to use the current locale and don't have
> a way to override it if you know this string is in UTF-16 (and maybe
> the current locale is UTF-8).

I agree with you. I haven't studied the parted code on this point, so I don't have any specific suggestions for how to do it in parted. I can offer my experience with doing it in GPT fdisk (http://www.rodsbooks.com/gdisk/), though: I used libicu (http://site.icu-project.org/) to do the translation. This seems to work pretty well -- at least, it produces results that are interoperable with what Apple's tools do. You can check the gdisk source code, particularly the gptpart.cc file, to see how gdisk does it; search for "UnicodeString". It's been a while since I added libicu support, and I haven't made many changes to it since then, so I don't recall every detail of what I did. I seem to recall that it wasn't very hard, but I did need to change quite a few output functions to use the libicu calls.

FWIW, when I added libicu support to gdisk, I kept the option to compile without libicu, in which case gdisk mangles non-ASCII characters in much the way parted does. Thus, you'll see both sets of code in gdisk. As a practical matter, libicu is a rather large library, and some developers of small emergency disks don't want to include it, so keeping the option to not use libicu is worthwhile.
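
To make the mangling concrete, here is a minimal sketch in C of what dropping the upper 8 bits of each 16-bit code unit does to a name. It is not the code from parted or gdisk; the names truncate_units and looks_like_utf8 are made up for illustration, and the validity check is only structural (it does not reject overlong forms or encoded surrogates).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: keep only the low byte of each 16-bit code unit.
   Stops at the first zero unit, after n_units, or when out is full.
   Returns the number of bytes written, not counting the trailing NUL. */
static size_t truncate_units(const uint16_t *name, size_t n_units,
                             char *out, size_t out_size)
{
    assert(name != NULL && out != NULL && out_size > 0);
    size_t i = 0;
    while (i < n_units && i + 1 < out_size && name[i] != 0) {
        out[i] = (char)(name[i] & 0xFF);   /* upper 8 bits are discarded */
        i++;
    }
    out[i] = '\0';
    return i;
}

/* Hypothetical helper: structural UTF-8 check only. Each lead byte must be
   followed by the right number of 10xxxxxx continuation bytes.
   Returns 1 if the string passes, 0 otherwise. */
static int looks_like_utf8(const char *str)
{
    const unsigned char *s = (const unsigned char *)str;
    while (*s) {
        int extra;
        if (*s < 0x80)                extra = 0;
        else if ((*s & 0xE0) == 0xC0) extra = 1;
        else if ((*s & 0xF0) == 0xE0) extra = 2;
        else if ((*s & 0xF8) == 0xF0) extra = 3;
        else return 0;                /* stray continuation or invalid lead */
        s++;
        while (extra-- > 0) {
            if ((*s & 0xC0) != 0x80)  /* also catches an early NUL */
                return 0;
            s++;
        }
    }
    return 1;
}
```

With the name "Café" (code units 0x43 0x61 0x66 0xE9), the result is the bytes 43 61 66 E9, which is not valid UTF-8, so a later multibyte conversion in a UTF-8 locale can fail. A character such as U+0141 does not fail at all; it silently becomes "A".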

Note that some values are invalid even with libicu, so there's a possibility that you'll run into error conditions, whether using libicu or not. Obviously, sane error handling is better than having the code bail out.
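
As a rough illustration of a locale-independent conversion with explicit error reporting, here is a small hand-rolled sketch in C. It is not what gdisk does (gdisk uses libicu) and not a patch for parted; the function name utf16le_to_utf8 and its return convention are assumptions made for this example. It also assumes the target encoding is UTF-8; converting to some other locale encoding would need a library such as libicu or iconv.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert n_units UTF-16LE code units, given as raw
   on-disk bytes, to a NUL-terminated UTF-8 string. Conversion stops at the
   first zero code unit. Returns the number of bytes written (not counting
   the NUL), or -1 on an unpaired surrogate or if out is too small. */
static long utf16le_to_utf8(const unsigned char *raw, size_t n_units,
                            char *out, size_t out_size)
{
    assert(raw != NULL && out != NULL);
    if (out_size == 0)
        return -1;
    size_t o = 0;
    for (size_t i = 0; i < n_units; i++) {
        /* Decode little-endian explicitly, so host byte order is irrelevant. */
        uint32_t cp = (uint32_t)raw[2 * i] | ((uint32_t)raw[2 * i + 1] << 8);
        if (cp == 0)
            break;
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            if (i + 1 >= n_units)
                return -1;
            uint32_t lo = (uint32_t)raw[2 * i + 2] |
                          ((uint32_t)raw[2 * i + 3] << 8);
            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;              /* low surrogate with no high before it */
        }
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need + 1 > out_size)
            return -1;              /* keep room for the trailing NUL */
        if (need == 1) {
            out[o++] = (char)cp;
        } else if (need == 2) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = '\0';
    return (long)o;
}
```

A caller would check for -1 and then decide what to do (show a placeholder name, warn, or fall back to the ASCII-only path) rather than aborting. Returning an error code instead of exiting is the design choice that matters here; the conversion itself is standard UTF-16 to UTF-8 arithmetic.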

--
Rod Smith
address@hidden
http://www.rodsbooks.com




