discuss-gnustep

Re: ANN: Vindaloo 0.2 & PopplerKit


From: Stefan Kleine Stegemann
Subject: Re: ANN: Vindaloo 0.2 & PopplerKit
Date: Thu, 21 Jul 2005 18:16:28 +0200

>   I am interested in using PopplerKit to extract text from PDF.
>   I looked at the headers and there is no support for that.
>   It would be nice if PopplerKit offered this kind of function,
>   like extracting text from a document or from each page, extracting the outline, etc.

The functionality you're looking for is planned but not there right
now. However, text extraction at least is not very difficult to
implement, and I can give you that by, say, the end of next week.
The outline will take a bit longer.

>   Actually I am porting LuceneKit, which is a search engine like
> Google or Spotlight,

Great, is this related to Apache Lucene? Does it also work on OS X?

>   because I need to search across a collection of PDF files.
>   PopplerKit just comes at the right time while I am looking for a
> library to extract text from PDF.
>   Since you mention searching in the blog, you could look at LuceneKit.
>   http://www.dromasoftware.com/etoile/mediawiki/index.php?title=LuceneKit
>   But it is designed to search across documents.
>   If you just want to search within a document, it may be overkill.

I think I will start with a simple "brute-force" search first to have
something usable. In the long run, though, I'd like to have a
"find-while-you-type" feature, and maybe LuceneKit could help to make
that a reality.
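
To illustrate what a "brute-force" search could look like, here is a
minimal sketch. It is written in Python only to keep it short, not in
the Objective-C that PopplerKit uses, and it assumes the text of each
page is already available as a plain string. The function name
find_all and the pages list are hypothetical and not part of
PopplerKit.

```python
# Hypothetical sketch: `pages` stands in for text that a future
# text-extraction API might return, one string per page.

def find_all(pages, needle, ignore_case=True):
    """Return a (page_index, char_offset) pair for every match of needle."""
    if not needle:
        return []
    hits = []
    key = needle.lower() if ignore_case else needle
    for page_index, text in enumerate(pages):
        haystack = text.lower() if ignore_case else text
        # Scan the whole page; overlapping matches are reported too.
        start = haystack.find(key)
        while start != -1:
            hits.append((page_index, start))
            start = haystack.find(key, start + 1)
    return hits


pages = ["PDF text on page one.", "More text, and more TEXT."]
print(find_all(pages, "text"))
```

Every query rescans every page, which is likely fine for a single
document but is the cost that an index-based approach would avoid
when searching across many files.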

greets
Stefan

-- 
Stefan Kleine Stegemann
Mail: stefankst at gmail.com
Home: http://rzserv2.fhnon.de/~lg017420
Weblog: http://stefankst.blogspot.com/



