pdf-devel

RE: [pdf-devel] Proposal of API for the Encoded Text module


From: Leonard Rosenthol
Subject: RE: [pdf-devel] Proposal of API for the Encoded Text module
Date: Mon, 28 Jan 2008 06:48:19 -0800

PDF Names could historically be encoded in host encodings, but if you
do so you can NOT expect them to work correctly.  For example, some
producers tried to encode Japanese font names in SJIS. That could be
written (using the #-escape mechanism), but such names wouldn't be
processed by a conforming reader.
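
To make the #-escape point concrete, here is a minimal Python sketch of how a name's raw bytes relate to its written form. The helper names `encode_pdf_name` and `decode_pdf_name` are hypothetical and not part of any library discussed in this thread. The sketch only assumes the rule that a `#` followed by two hex digits stands for one byte. It shows that SJIS bytes survive the escaping round trip but are not valid UTF-8.

```python
import re

# A '#' followed by exactly two hex digits stands for one raw byte.
_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")

# Delimiter characters plus '#' itself are always written escaped here.
_MUST_ESCAPE = b"()<>[]{}/%#"


def decode_pdf_name(token: bytes) -> bytes:
    """Undo #xx escapes in a name token (given without the leading '/')."""
    return _ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), token)


def encode_pdf_name(raw: bytes) -> bytes:
    """Write raw bytes as a name token, escaping all but plain printable ASCII."""
    out = bytearray()
    for b in raw:
        if 0x21 <= b <= 0x7E and b not in _MUST_ESCAPE:
            out.append(b)
        else:
            out += b"#%02X" % b
    return bytes(out)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    sjis_name = "フォント".encode("shift_jis")  # a Japanese word in SJIS
    token = encode_pdf_name(sjis_name)
    print(token)                                # the escaped, ASCII-only form
    print(decode_pdf_name(token) == sjis_name)  # the bytes round-trip
    print(is_valid_utf8(sjis_name))             # but they are not UTF-8
```

The escaping itself is encoding-agnostic, which is why producers could write such names. Nothing in the token tells a reader that the bytes are SJIS.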

ISO 32000 (the ISO version of PDF 1.7) does not define conformance for
anything other than PDFDocEncoding and UTF-8.

Leonard

-----Original Message-----
From: Aleksander Morgado [mailto:address@hidden]
Sent: Monday, January 28, 2008 9:43 AM
To: Leonard Rosenthol
Cc: address@hidden
Subject: Re: [pdf-devel] Proposal of API for the Encoded Text module


> Just one other thing to remember is that PDF Names are either a subset
> of PDFDocEncoding _OR_ they are valid UTF-8 strings.  (See PDFRef 1.7,
> 3.2.4).
> 
> Leonard

In fact, I think this is one of the reasons to have built-in UTF-8 
support in the library. I suppose that the PDF Name and PDF String 
types in the `object library' will be based on the pdf_text_t from the 
`base library', which directly supports UTF-8.

Anyway, are you sure that these two encodings are the only ones allowed 
for PDF Names? In older Acrobat versions PDF Names could be encoded in 
specific `host encodings', like Shift-JIS or Big Five for Asian 
languages (PDFRef 1.7, H.3).

If this is the case, how can we detect the encoding used in a given 
PDF Name? Consider, for example, a PDF whose names use a Japanese 
encoding and which is read on a US-localized system. So far, the text 
module API provides a function to detect the best encoding for a 
given Unicode string, but no function to detect the encoding used in 
a given multibyte string. Something like the latter could also be 
needed, but I am not sure whether it is possible to implement.
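
One way to approach the detection question is trial decoding: test the bytes against each candidate encoding and keep those that decode without error. The Python sketch below is only an illustration under that assumption. The function name `guess_encodings` and the candidate list are hypothetical, and this is not the text module API. It also shows the limit of the approach, since a byte string can be valid in several encodings at once.

```python
from typing import List, Sequence


def guess_encodings(
    data: bytes,
    candidates: Sequence[str] = ("utf-8", "shift_jis", "big5"),
) -> List[str]:
    """Return the candidate encodings under which `data` decodes cleanly.

    This is a heuristic. An empty list means no candidate fits, and more
    than one entry means the bytes alone cannot settle the question.
    """
    matches = []
    for encoding in candidates:
        try:
            data.decode(encoding)  # strict decoding raises on invalid input
        except UnicodeDecodeError:
            continue
        matches.append(encoding)
    return matches


if __name__ == "__main__":
    # Plain ASCII is valid in every candidate, so the result is ambiguous.
    print(guess_encodings(b"Helvetica"))
    # SJIS bytes are rejected as UTF-8, which narrows the candidates.
    print(guess_encodings("フォント".encode("shift_jis")))
```

Testing UTF-8 first is useful because strict UTF-8 validation rarely accepts text written in a legacy multibyte encoding. Telling two legacy encodings apart generally needs extra hints, such as the document's language or the producer.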

--
Aleksander



