Re: [Koha-devel] UTF-8
From: Dorian Meid
Subject: Re: [Koha-devel] UTF-8
Date: Mon, 12 Mar 2007 05:26:54 +0100
Am 12.03.2007 um 03:19 schrieb Thomas Dukleth:
TARGET ISSUES.
Are you certain that your Z39.50 target is returning records with UTF-8
encoding? If you supply the connection parameters and a test search for
records which you believe are problematic, then I can test the target
myself.
BYTE CODES NEEDED FOR A RELIABLE CHECK.
Because I don't have another Z39.50 client, I use the scripts we have,
namely z3950/search.pl.
I simply store the retrieved MARC data with:
open(MARCFILE, ">:raw", "result$i.MARC");
print MARCFILE $marcdata;
close MARCFILE;
starting at line 174.
I also tried:
open(MARCFILE, ">:utf8", "result$i.MARC");
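One way to check what such a saved file claims about itself is to look at the MARC 21 leader: character position 09 is "a" for UCS/Unicode and blank for MARC-8. The following is only a minimal sketch in Python (not the Perl used in search.pl); the function names and the two sample leaders are made up for illustration, and it assumes the target really returns MARC 21 rather than another MARC flavour.

```python
def declared_encoding(record: bytes) -> str:
    """Return the character coding scheme declared in a MARC 21 leader."""
    if len(record) < 24:
        raise ValueError("a MARC 21 leader is 24 bytes long")
    flag = record[9:10]  # leader position 09 (counting from 0)
    if flag == b"a":
        return "UCS/Unicode"
    if flag == b" ":
        return "MARC-8"
    return "unknown (" + repr(flag) + ")"


def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


def check_file(path: str) -> None:
    """Print the declared encoding and the leader bytes of a saved record."""
    with open(path, "rb") as handle:  # binary mode: no decoding layer involved
        record = handle.read()
    print(declared_encoding(record))
    print(hex_dump(record[:24]))


# Invented sample leaders, not taken from the GBV server:
print(declared_encoding(b"00714cam a2200205 a 4500"))
print(declared_encoding(b"00714cam  2200205 a 4500"))
```

Calling check_file("result0.MARC") on one of the files written above (the file name here is a guess at what "result$i.MARC" expands to) would show whether the record itself declares Unicode, independent of what the server claims.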
When I use a server which claims to send UTF-8, I get:
with raw-print: 75 CC 88 C3
with utf8-print: 00 75 03 08 FF FD
Both should be "üß" (ü ß).
When I use a server which uses autonegotiation of the charset, I get a
correctly encoded Latin-1 record.
The ü is OK, but FF FD for ß is definitely wrong.
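The two dumps can be read byte by byte. This is a hedged sketch in Python (not the Perl of search.pl) that uses only the byte values quoted above; the variable names are illustrative. It suggests that 75 CC 88 is "u" followed by the combining diaeresis U+0308, and that C3 is only the first byte of the two-byte UTF-8 sequence for ß (C3 9F), so a lenient decoder substitutes U+FFFD for it.

```python
import unicodedata

raw = bytes.fromhex("75 CC 88 C3")  # bytes reported for the raw dump

# Decode leniently: an incomplete sequence becomes U+FFFD instead of raising.
decoded = raw.decode("utf-8", errors="replace")
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
print(code_points)  # ['U+0075', 'U+0308', 'U+FFFD']

# "u" + U+0308 is the decomposed form of ü; NFC composes it to U+00FC.
composed = unicodedata.normalize("NFC", decoded[:2])
print(f"U+{ord(composed):04X}")  # U+00FC

# A complete, decomposed "üß" would need one more byte than was received.
expected = unicodedata.normalize("NFD", "üß").encode("utf-8")
print(" ".join(f"{b:02X}" for b in expected))  # 75 CC 88 C3 9F

# The same code points written as 16-bit values give the digits
# reported for the utf8-print dump.
as_utf16 = decoded.encode("utf-16-be")
print(" ".join(f"{b:02X}" for b in as_utf16))  # 00 75 03 08 FF FD
```

If this reading is right, the data arrives as valid UTF-8 in decomposed form but is cut off in the middle of the ß, which would explain why the ü survives and the ß turns into FF FD.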
It seems to me that this encoding thing is a real pain and I'm rather
new to it.
In this example I used the host "z3950.gbv.de" port "20012" database
"gvk" user "999" password "abc".
I searched for the ISBN "3-552-06027-8"
The title should be "Die Süße des Lebens"
Another strange point is that the field ends at the ß character ("Die
SüÃ"), but the original title is much longer (maybe this indicates a
faulty record, but it's the same with all records containing ß).
Unfortunately I don't know any other server which provides UTF-8
encoded MARC21 data. Maybe somebody can tell me one, along with a sample
search and the results to expect.
Sadly, Firefox is a poor performer for transmitting data in UTF-8.
Yep, I checked my browser here:
http://www.fileformat.info/info/unicode/utf8test.htm
Dorian Meid