RE: [microdc-devel] German Umlauts


From: Vladimir Chugunov
Subject: RE: [microdc-devel] German Umlauts
Date: Thu, 14 Dec 2006 23:47:30 +0300

Hi,

> First of all, I want to provide that *mixed* log :)
Well, I'm waiting for it.

> 
> microdc2> set log
> connections    download       publicchat     upload
> debug          joinpart       searchresults
> 
> with hub_charset UTF-8 and filesystem_charset UTF-8.
First of all, you must never use UTF-8 as the hub_charset value, because it
makes no sense given the protocol specification. The specification states that
all character strings must be sent in the current codepage. Have you ever seen
Windows use UTF-8 as its default codepage? For example: you set hub_charset to
UTF-8 and some Windows user sends you a chat message containing the word
Öffnen. According to the spec, the software sends this word using the current
codepage, CP1252, but microdc2 tries to convert the string from UTF-8 to your
terminal locale. Because the string is not UTF-8 encoded, the transcoding
subroutine fails and microdc2 shows the message as is, i.e. without any
conversion at all. Assume your terminal locale is UTF-8 as well: the word still
cannot be displayed, because the CP1252 byte for Ö is invalid from the UTF-8
point of view. So hub_charset must be set to CP1252 for Germany, just as it is
set to CP1251 for Russia. This doesn't mean that you have to use CP1252 for
your terminal locale.

To make our life a bit easier, I'll add UTF-8 recognition for chat messages in
the next version. This will help us fight against misconfigured unix-based (or
overly featureful) clients that send chat messages in UTF-8.
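
To illustrate the failure described above, and one way UTF-8 recognition could
work, here is a minimal Python sketch. It is not microdc2 code: the function
names is_valid_utf8 and decode_chat are made up for this example, and the
"try strict UTF-8 first, then fall back to hub_charset" order is only an
assumption about how such recognition might be done.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string is well-formed UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def decode_chat(raw: bytes, hub_charset: str = "cp1252") -> str:
    """Hypothetical helper: accept UTF-8 if it is valid, else use hub_charset."""
    if is_valid_utf8(raw):
        return raw.decode("utf-8")
    # Not valid UTF-8, so treat it as the codepage configured for the hub.
    return raw.decode(hub_charset, errors="replace")


# What a Windows client sends for "Öffnen" under CP1252: a single byte 0xD6.
windows_bytes = "Öffnen".encode("cp1252")
# What a UTF-8 client sends: the two bytes 0xC3 0x96 for the same letter.
unix_bytes = "Öffnen".encode("utf-8")

print(is_valid_utf8(windows_bytes))  # the CP1252 bytes are rejected as UTF-8
print(decode_chat(windows_bytes))
print(decode_chat(unix_bytes))
```

Note that this kind of detection is only a heuristic: pure ASCII text is valid
in both encodings, and a few CP1252 byte sequences happen to be valid UTF-8 too.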

The second thing: filesystem_charset must be set to the charset used for
storing filenames on your local filesystem. For example, Samba prior to version
3 doesn't recognize UTF-8 and stores filenames in CP865 or something like
that. So if you set filesystem_charset to UTF-8, make sure that all your tools
really do use UTF-8 for filenames.
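
One rough way to check this is to look at the raw filename bytes and see
whether they decode in the charset you intend to configure. The sketch below
is only an illustration in Python; the helper names undecodable and
undecodable_in_dir are invented for this example and are not part of microdc2.

```python
import os
from typing import Iterable, List


def undecodable(names: Iterable[bytes], charset: str = "utf-8") -> List[bytes]:
    """Return the raw names that are not valid in the given charset."""
    bad = []
    for name in names:
        try:
            name.decode(charset)
        except UnicodeDecodeError:
            bad.append(name)
    return bad


def undecodable_in_dir(directory: str, charset: str = "utf-8") -> List[bytes]:
    """Check the names in one directory (not recursive)."""
    # Passing a bytes path makes os.listdir return the raw bytes of each name.
    return undecodable(os.listdir(os.fsencode(directory)), charset)


# Sample names: one stored in a single-byte codepage, two that are fine as UTF-8.
sample = [b"\xd6ffnen.txt", b"plain.txt", "Öffnen.txt".encode("utf-8")]
print(undecodable(sample, "utf-8"))
```

An empty result does not prove the names were written as UTF-8 (ASCII-only
names pass under any of these charsets), but a non-empty one shows that some
tool is storing filenames in a different encoding.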

> The second thing is that if I change both hub_charset AND
> filesystem_charset to CP1252, nearly all "äöÜ" are displayed correctly, also
> in filenames, BUT the names in the hub are still garbage. It seems to me
> that the names in the hub don't react when I change the charset; I always
> get the same garbage names, no matter whether it's UTF-8, CP1250 or CP1252.
> When I use CP1252, which is better because more people don't use UTF-8,
> I get the following:

=== skipped ===

> 
> Well, do you see what I mean? If I look at the users there, I have a
> [CA]F\303\211 guy. If I "who" him, his nick is displayed correctly; or
> rather, everything except that nick in "console mode" is displayed
> correctly. When I changed back to UTF-8, I saw only [CA]F\303\211
> everywhere.
> I think "international support" is missing in that area, because that part
> of the names doesn't respond to hub_charset or filesystem_charset.
> But I think you can tell me more, and maybe find a reason :)

Version 0.15.4, like all previous versions, doesn't re-read the user list from
a hub when hub_charset is changed, so all badly converted nicks remain until
the session is disconnected, while any newly retrieved information is shown
correctly. Try setting hub_charset to CP1252 *before* connecting to the hub.
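
The effect can be pictured with a toy model in Python. This is not how
microdc2 is actually written: the UserList class, its method and the sample
nicks are all hypothetical, and the model uses a replacement character where
microdc2 shows the unconverted bytes. It only shows why a nick converted once,
at the moment it is received, stays wrong after the charset is changed.

```python
class UserList:
    """Toy model: each nick is converted once, on receipt, and then cached."""

    def __init__(self, hub_charset: str) -> None:
        self.hub_charset = hub_charset
        self.nicks = []

    def on_nick_received(self, raw: bytes) -> None:
        # Conversion uses whatever hub_charset is set at this moment.
        self.nicks.append(raw.decode(self.hub_charset, errors="replace"))


users = UserList("utf-8")              # wrong setting while connecting
users.on_nick_received(b"J\xfcrgen")   # "Jürgen" sent as CP1252 bytes

users.hub_charset = "cp1252"           # fixed later, during the same session
users.on_nick_received(b"Bj\xf6rn")    # "Björn" sent as CP1252 bytes

# The first nick keeps its bad conversion; only the new one is correct.
print(users.nicks)
```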

Regards, Vladimir.
