libextractor

Re: [libextractor] Current version seems to lack output


From: Nils Durner
Subject: Re: [libextractor] Current version seems to lack output
Date: Thu, 27 Jul 2006 07:44:04 +0200
User-agent: Thunderbird 1.5.0.4 (Windows/20060516)

Hello,

> Comparing the output of libextractor-0.5.10-12 (SuSE-10.1) with the
> output of an RPM of libextractor-0.5.14, I found that 0.5.10 extracts
> a lot of metadata from PDF and Word-DOC, but 0.5.14 doesn't.
Yes, that's because newer versions of libextractor no longer use xpdf
for PDF extraction, because of its bad security history (and because
it'll be dropped from Debian for the same reason).
libextractor now uses a built-in PDF extractor that lacks some features.
If you want to use xpdf anyway, build libextractor from source and add
    --enable-xpdf
to the arguments of ./configure
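
As a rough sketch, a source build with that option might look like the
following. It assumes you are in the top directory of an unpacked
libextractor source tree and that the usual autotools sequence applies.
The CONFIGURE_ARGS variable is only an illustrative name, not something
the build system requires.

```shell
# Assumption: run from the top of an unpacked libextractor source tree.
# CONFIGURE_ARGS is a hypothetical helper variable for this example.
CONFIGURE_ARGS="--enable-xpdf"

if [ -x ./configure ]; then
    # Configure with the xpdf-based PDF extractor enabled, then compile.
    ./configure $CONFIGURE_ARGS && make
else
    echo "no ./configure here; change into the libextractor source directory first" >&2
fi
```

Installing afterwards (typically "make install" with sufficient
privileges) depends on your system and on whether a packaged version is
already present.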

The OLE2 extractor was switched to use libgsf.
libgsf has more problems than just extracting less information than
before (it crashes if an application loads and unloads it), so I'd be
happy to remove it if someone provided a better solution.


Regards,

Nils Durner



