emacs-orgmode
Re: [O] orgmode and pdf


From: Jambunathan K
Subject: Re: [O] orgmode and pdf
Date: Tue, 24 Jul 2012 17:53:22 +0530
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1 (windows-nt)

address@hidden writes:

> Hi list.
> I am trying to build a workflow to mine data from PDFs into Org mode.
> I prefer to read in Emacs, since I have fast dictionary lookup in it and
> many other things.
> There are two tools I think are useful for converting PDFs into text:
> cuneiform to extract text, and pdfimages for image extraction.
> Cuneiform is better than the other text extractors I have tried at
> handling two-column PDFs.

PdfEdit seems interesting as well.

http://sourceforge.net/projects/pdfedit
http://www.cs.unb.ca/~bremner/blog/posts/pdf2text/

PS: I have no experience with PdfEdit, so I don't know how it fares with
images and captions.

> A PDF is split into pages and each of them is processed separately.
> Using these two programs and some scripting, I believe it is possible to
> convert a PDF into an Org file. However, there are two issues I would like to
> solve.
> 1) Is there any way to extract figure captions from a PDF?
> 2) I have no solution for formulas and Greek letters. The only way to
> handle them would be to consult an image of the page.
> Any suggestions? Has anybody tried something similar?
> Thanks.
> Petro.