From: Tomohisa Tanaka
Subject: Re: [Monotone-devel] Does "mtn status" work correctly only with UTF-8 locales?
Date: Tue, 23 Sep 2008 08:29:37 +0900

> Oh, sorry for that. Unfortunately the bug interface there is not used
> by many developers. The best place to report bugs is probably still
> the mailing list.
I see. Thank you.
> After diving (yet again) through some of the more obscure parts of
> monotone's source code (yeah, the vocab_* files) I have to admit that
> almost everything about the "external" type is a lie. In the end this
> just wraps a std::string, so no conversion or safety check whatsoever
> takes place; you could just print the std::string directly and
> would get the same result.
You are right. Both the "utf8" and "external" classes are defined in
vocab_terms.hh as follows. Each class is just a wrapper around
std::string.
| vocab_terms.hh:
| ...
| ATOMIC_NOVERIFY(external); // ...
| ATOMIC_NOVERIFY(utf8); // ...
| ...
The actual conversion is performed only by the functions declared in
charset.hh (e.g. utf8_to_system_best_effort()). Since the wrappers
verify nothing, type safety in that mechanism depends on passing the
correct argument to the constructor of utf8() or external(): a UTF-8
encoded string to utf8(), and a string encoded in the user's locale to
external().
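
To illustrate the point, here is a minimal sketch of such a non-verifying
wrapper. It is not the code that ATOMIC_NOVERIFY generates; the names
tagged_string, utf8_text, external_text and mislabel_as_utf8 are
hypothetical and exist only for this example. It shows that the compiler
keeps the two types apart, but that the label is only as trustworthy as
the caller who attaches it.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a macro-generated wrapper: it stores the bytes
// it is given and never inspects them.
template <typename Tag>
class tagged_string {
public:
    explicit tagged_string(std::string s) : s_(std::move(s)) {}
    // Call-style accessor, like user_log_message() in the excerpts.
    const std::string& operator()() const { return s_; }
private:
    std::string s_;
};

struct utf8_tag {};
struct external_tag {};
using utf8_text = tagged_string<utf8_tag>;          // label: "UTF-8 bytes"
using external_text = tagged_string<external_tag>;  // label: "locale bytes"

// The compiler does keep the two labels apart...
static_assert(!std::is_convertible<utf8_text, external_text>::value,
              "utf8_text must not convert to external_text");
static_assert(!std::is_convertible<std::string, utf8_text>::value,
              "construction from std::string must be explicit");

// ...but nothing stops a caller from attaching the wrong label.
inline utf8_text mislabel_as_utf8(const std::string& locale_bytes) {
    return utf8_text(locale_bytes);  // compiles; the bytes are unchanged
}
```

So a single Latin-1 byte such as 0xE9 can be wrapped as "UTF-8" without
any complaint, and the error only shows up later when the value is
converted for output.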
> I have found another place where the same bug exists - mtn log. But
> it's not obvious there, because certain UI phrases were never
> translated.
Do you mean the following code? If so, you are right, but it is
related to "mtn commit" rather than to "mtn log".
| utf8 branch_comment = utf8((F("branch \"%s\"\n\n") % branchname).str());
I know of one more. In the following code, "magic_line" is encoded in
the user's locale, not UTF-8, yet it is passed to the utf8()
constructor. ("user_log_message" is correctly encoded.)
| string magic_line = _("...");
| ...
| user_log = utf8( magic_line + "\n" + user_log_message());
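
A small self-contained sketch of why this matters, assuming an
ISO-8859-1 locale for simplicity. None of this is monotone code:
latin1_to_utf8, utf8_to_latin1_best_effort, build_log_buggy and
build_log_fixed are hypothetical stand-ins. The real fix would presumably
go through the conversion functions in charset.hh instead.

```cpp
#include <cstddef>
#include <string>

// Stand-in for "locale -> UTF-8", assuming the locale is ISO-8859-1.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {  // U+0080..U+00FF take two bytes in UTF-8
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// Stand-in for "UTF-8 -> locale, best effort": every byte that cannot be
// decoded to a Latin-1 character becomes '?'.
std::string utf8_to_latin1_best_effort(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            out += static_cast<char>(c);
            ++i;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80) {
            unsigned char next = static_cast<unsigned char>(in[i + 1]);
            out += static_cast<char>(((c & 0x03) << 6) | (next & 0x3F));
            i += 2;
        } else {
            out += '?';
            ++i;
        }
    }
    return out;
}

// Buggy pattern: locale-encoded bytes are concatenated into "UTF-8" text.
std::string build_log_buggy(const std::string& magic_line_locale,
                            const std::string& message_utf8) {
    return magic_line_locale + "\n" + message_utf8;
}

// Fixed pattern: convert the locale-encoded part before concatenating.
std::string build_log_fixed(const std::string& magic_line_locale,
                            const std::string& message_utf8) {
    return latin1_to_utf8(magic_line_locale) + "\n" + message_utf8;
}
```

With a UTF-8 locale both variants give the same bytes, which is why the
problem stays hidden there. With a non-UTF-8 locale, the buggy variant
damages every non-ASCII character of the translated line when the result
is converted back for display.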
Regards,
Tomohisa Tanaka
2008/9/22 Thomas Keller <address@hidden>:
> Tomohisa Tanaka schrieb:
>> Dear Thomas,
>>
>> Thank you for your reply. I have already reported the problem and a
>> patch in https://savannah.nongnu.org/bugs/?21870, but nobody has fixed it.
>
> Oh, sorry for that. Unfortunately the bug interface there is not used by
> many developers. The best place to report bugs is probably still the
> mailing list.
>
>>> Smells like a double-encoding bug,
>>
>> I think so too. The function F() and the operator % return an
>> i18n_format instance, which holds a string encoded in the user's
>> locale (since it is the string that gettext() returned). So
>> i18n_format::str() does not return a UTF-8 string. However, that
>> locale-encoded string is then passed to the utf8() constructor, as
>> follows.
>>
>> | string out;
>> |...
>> | out += (F("Current branch: %s") % branch).str() += '\n';
>> |...
>> | summary = utf8(out);
>>
>> I think the constructor external() must be used for "out". Please see
>> the sample fix in my report.
>
> After diving (yet again) through some of the more obscure parts of
> monotone's source code (yeah, the vocab_* files) I have to admit that
> almost everything about the "external" type is a lie. In the end this
> just wraps a std::string, so no conversion or safety check whatsoever
> takes place; you could just print the std::string directly and would
> get the same result.
>
> I have found another place where the same bug exists - mtn log. But it's
> not obvious there, because certain UI phrases were never translated.
> I got around to compiling my changes, but I have yet to test one
> use case: whether the changes affect (in any way) the output of
> internal, normalized paths (which are always utf-8). More on that
> tomorrow.
>
> Thomas.
>
> --
> GPG-Key 0x160D1092 | address@hidden | http://thomaskeller.biz
> Please note that according to the EU law on data retention, information
> on every electronic information exchange might be retained for a period
> of six months or longer: http://www.vorratsdatenspeicherung.de/?lang=en
>
--
Tomohisa Tanaka
address@hidden