
Re: [pdf-devel] Object layer API


From: gerel
Subject: Re: [pdf-devel] Object layer API
Date: Wed, 18 Feb 2009 07:18:01 -0800 (PST)

 > Date: Wed, 18 Feb 2009 01:38:57 -0500
 > From: Michael Gold <address@hidden>
 > 
 > 
 > >  > For the standard headers, the client could check using the
 > >  > pdf_obj_doc_get_header you proposed, but that still seems like a
 > >  > low-level detail they shouldn't have to deal with.
 > >  >
 > >  > We could have separate functions for opening PDF or FDF, or a function
 > >  > that returns the type (PDF or FDF).
 > >
 > > Here again, "client" is ambiguous to me. Though the procedure you mention,
 > > IMHO, is at the correct level of abstraction.
 > >
 > > Moreover, the phrase 'low-level' depends on where you're looking at it from,
 > > and although a procedure is defined in some API, that's not enough reason
 > > for the user to call it.
 > 
 > True, but I'd still prefer to hide details when it's not useful to
 > expose them, and to keep them in the proper layer.
 > 
 > I don't think passing a header string on open/save provides a real
 > benefit.  It may seem like a generic way to handle different types of
 > files, but in reality the header won't be the only difference:
 >  - the encoding of names varies depending on the PDF version
 >  - the xref table is optional for FDF files
 >  - FDF files can't use indirect /Length values for streams
 > 
 > So the object layer needs to know the file type anyway (not just the
 > header string); and given this, it can choose an appropriate header when
 > saving, or detect an appropriate header when opening.

Now, that is an argument :-). So any ambiguity (e.g. when none of those
differences applies) is resolved by the file type given by the client. If
that's right, then I agree we don't need the proc.

BTW, I believe the header content should be analyzed when we're opening an
existing file; we don't want to rely on file extensions.
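
To make that concrete, here is a minimal sketch in C of picking the file type
from the leading bytes rather than from the file name. All names below
(doc_type_t, detect_doc_type, xref_is_optional, allows_indirect_length) are
hypothetical and not part of the library's API, and the sketch assumes the
header sits at the very start of the buffer. The two predicates just restate
the differences Michael listed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type; not taken from the actual object layer API. */
typedef enum
{
  DOC_TYPE_UNKNOWN,
  DOC_TYPE_PDF,
  DOC_TYPE_FDF
} doc_type_t;

/* Classify a file by the header at the start of its contents
   ("%PDF-" or "%FDF-"), never by its extension.
   Assumption: the header starts at offset 0 of BUF. */
doc_type_t
detect_doc_type (const char *buf, size_t len)
{
  if (len >= 5 && memcmp (buf, "%PDF-", 5) == 0)
    return DOC_TYPE_PDF;
  if (len >= 5 && memcmp (buf, "%FDF-", 5) == 0)
    return DOC_TYPE_FDF;
  return DOC_TYPE_UNKNOWN;
}

/* The xref table is optional for FDF files. */
int
xref_is_optional (doc_type_t type)
{
  return type == DOC_TYPE_FDF;
}

/* FDF files can't use indirect /Length values for streams. */
int
allows_indirect_length (doc_type_t type)
{
  return type == DOC_TYPE_PDF;
}
```

Once the type is known, the writer can derive the header and the other
type-specific rules from it, so no header string has to cross the API.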


 > >  > I don't really like the idea of the library creating temporary files on
 > >  > its own.  Opening a file in a library can cause security issues, for
 > >  > example:
 > >  >   http://udrepper.livejournal.com/20407.html
 > >  > (Linux 2.6.27 is needed to protect against this, and I'm sure there are
 > >  > operating systems without this feature.)
 > >
 > > Interesting security risk, but if we make everything memory-based, how
 > > much memory will we need, on average, to edit a document?
 > > Maybe we could provide this as an optional feature for the poor user with
 > > only a few MB of RAM.
 > 
 > I definitely wouldn't want to require all data to be stored in RAM.
 > The callback idea I suggested in my first mail would be OK for low-
 > memory systems, and would work as follows:
 >  - when creating a stream object (pdf_obj_stream_new), the client would
 >    provide a callback function instead of a pdf_stm_t
 >  - when saving a file, the object layer would execute this callback when
 >    it needed to write a stream
 >      - the callback would return an open pdf_stm_t
 >      - the object layer would read this until EOF, writing the filtered
 >        data to the output file; then it would close the stream
 > 

But in the meantime (i.e. before the save), isn't everything in RAM anyway?
If it is, I can't see the benefit of this idea in terms of RAM consumption.
I'd still define this feature as optional: we could use temp files (the legacy
method) or the method you propose (a more secure one).
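
For reference, a rough sketch in C of how I read the callback scheme. This is
not the library's API: stm_t is a stand-in for pdf_stm_t built on stdio,
and stream_open_fn, stream_obj_t, open_path_cb and write_stream_body are
hypothetical names. Filtering is left out; the data is copied as-is. Whether
anything stays out of RAM before the save depends on the client's callback
opening its source lazily, which is what this sketch assumes.

```c
#include <stdio.h>

/* Stand-in for pdf_stm_t (assumption: plain stdio instead of the
   stream layer). */
typedef FILE *stm_t;

/* Hypothetical callback type: returns an open stream positioned at the
   start of the data, or NULL on failure. */
typedef stm_t (*stream_open_fn) (void *client_data);

/* Hypothetical stream object: it stores how to get the data, not the
   data itself. */
typedef struct
{
  stream_open_fn open;
  void *client_data;
} stream_obj_t;

/* Example client callback: CLIENT_DATA is a path, opened only on demand. */
stm_t
open_path_cb (void *client_data)
{
  return fopen ((const char *) client_data, "rb");
}

/* Called at save time: run the callback, copy until EOF, close the
   stream.  Only one fixed-size buffer is resident at a time.
   Returns the number of bytes written, or -1 on error. */
long
write_stream_body (const stream_obj_t *obj, FILE *out)
{
  char buf[4096];
  size_t n;
  long total = 0;
  stm_t in = obj->open (obj->client_data);

  if (in == NULL)
    return -1;

  while ((n = fread (buf, 1, sizeof buf, in)) > 0)
    {
      if (fwrite (buf, 1, n, out) != n)
        {
          fclose (in);
          return -1;
        }
      total += (long) n;
    }

  if (ferror (in))
    total = -1;
  fclose (in);
  return total;
}
```

With something like this, a stream object costs a function pointer and a
path until the save actually happens.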

cheers,

-gerel



