[libextractor] Solaris, iconv and libextractor
From: Michał Kowalczuk
Subject: [libextractor] Solaris, iconv and libextractor
Date: Tue, 11 Apr 2006 16:31:00 +0200
User-agent: Mail/News 1.5 (X11/20060122)
I ran into another problem under Solaris: Sun libiconv doesn't support conversion
from UNICODE to UTF-8, so convertToUtf8(u, 2, "UNICODE"), invoked from
printInfoString() in src/plugins/pdf/pdfextractor.cc, fails. The input is a
2-byte string that is not NUL-terminated, so the strdup() that convertToUtf8()
calls when iconv fails reads past the end of the buffer and returns junk. As far
as I can tell, converting from UTF-16 gives the same result under Linux (GNU
libiconv) as converting from UNICODE, although I'm not sure the two are
equivalent. More importantly for me, Sun libiconv supports conversion from
UTF-16 to UTF-8.
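To illustrate the strdup() problem described above, here is a minimal sketch of a length-bounded fallback copy. The helper name dup_bytes_terminated() is hypothetical and is not part of libextractor; it only shows how a buffer of known length without a terminator could be copied safely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not libextractor code): copy exactly `len` bytes of
   `input` and append a NUL terminator. Unlike strdup(), it never reads past
   the end of a buffer that is not NUL-terminated. */
static char *dup_bytes_terminated(const char *input, size_t len)
{
    assert(input != NULL);

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, input, len);
    copy[len] = '\0';
    return copy;
}
```

Such a copy would still hold unconverted bytes, so it only avoids the out-of-bounds read; it does not replace a working conversion.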
I don't really understand why printInfoString() uses a 2-byte buffer for the
conversion. Wouldn't it be easier to pass the whole of (s+2), which is (len-2)
bytes long, to convertToUtf8()? That would mean fewer mallocs and fewer
function calls.
Both points also apply to printInfoDate() in the same file.
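As a rough sketch of the single-call approach, the function below converts a whole buffer with one iconv() call. The name whole_buffer_to_utf8() is hypothetical and this is not the actual convertToUtf8() implementation. The usage example assumes the bytes after the first two are big-endian UTF-16 and therefore names "UTF-16BE" explicitly, because how a bare "UTF-16" is decoded without a byte order mark can differ between iconv implementations. Whether a given Solaris iconv accepts "UTF-16BE" is also an assumption to check.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not libextractor's convertToUtf8()): convert `len`
   bytes of `input` from `charset` to a newly allocated, NUL-terminated
   UTF-8 string using a single iconv() call. Returns NULL on any failure. */
static char *whole_buffer_to_utf8(const char *input, size_t len,
                                  const char *charset)
{
    assert(input != NULL && charset != NULL);

    iconv_t cd = iconv_open("UTF-8", charset);
    if (cd == (iconv_t)-1)
        return NULL;                 /* conversion not supported here */

    /* Generous bound: at most 4 output bytes per input byte, plus NUL. */
    size_t out_size = 4 * len + 1;
    char *result = malloc(out_size);
    if (result == NULL) {
        iconv_close(cd);
        return NULL;
    }

    /* Some iconv implementations declare the input pointer as
       const char **; the cast may need adjusting on those systems. */
    char *in = (char *)input;
    size_t in_left = len;
    char *out = result;
    size_t out_left = out_size - 1;  /* keep room for the terminator */

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {
        free(result);
        return NULL;
    }

    *out = '\0';
    return result;
}
```

A call in the spirit of the suggestion above might look like whole_buffer_to_utf8(s + 2, len - 2, "UTF-16BE"), with the caller freeing the result.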
--
greetings,
Michał Kowalczuk
Wirtualna Polska S.A.