[Okular-devel] Export from pdf to txt, invoking from the command line

filippo di natale f_dn at hotmail.it
Thu Nov 10 13:26:26 UTC 2011


Hi,
yes, I had already tried pdftotext, but Okular's output is much better for my needs (parsing PDF documents).
I need to parse "csv"-like or "fixed length" documents that are unfortunately in PDF format. If anyone has a suggestion on how to parse them without converting them to text first, I would be glad to hear it.
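
To illustrate the "fixed length" case: once the text has been exported with its spacing preserved (as Okular's export does), each line can be cut into fields by character position. The following is only a minimal sketch in Python. The column boundaries and field names are hypothetical and would have to be adapted to the real documents.

```python
from typing import Dict, List, Tuple

# Hypothetical layout (an assumption, not taken from any real document):
# (field name, start column, end column), with the end column exclusive.
COLUMNS: List[Tuple[str, int, int]] = [
    ("code", 0, 6),
    ("description", 6, 26),
    ("amount", 26, 36),
]


def parse_fixed_width(line: str, columns=COLUMNS) -> Dict[str, str]:
    """Cut one line into fields by character position and strip the padding."""
    return {name: line[start:end].strip() for name, start, end in columns}


def parse_text(text: str, columns=COLUMNS) -> List[Dict[str, str]]:
    """Parse every non-blank line of the exported text."""
    return [
        parse_fixed_width(line, columns)
        for line in text.splitlines()
        if line.strip()
    ]
```

This only works if the exported text keeps the columns aligned, which is why the spacing of the conversion matters so much here.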


> From: aacid at kde.org
> To: okular-devel at kde.org
> Date: Thu, 10 Nov 2011 13:45:34 +0100
> Subject: Re: [Okular-devel] Export from pdf to txt, invoking from the command line
> 
> On Thursday, 10 November 2011, filippo di natale wrote:
> > Hi,
> > I very much like how Okular exports PDF to text while keeping the correct
> > spacing (doing the same with Acrobat on Windows did not give such clean
> > results). Since, as far as I understand, I cannot invoke Okular from the
> > command line to convert a PDF to text, which library does Okular use for
> > its PDF-to-text conversion? Or, if it is developed internally in the
> > project, can it be used standalone to make a command-line PDF-to-text
> > converter, and which part of the source code should I look at? Thanks,
> 
> No, Okular does not have a command-line export to text. It should not be
> extremely difficult to add, but we do not have it yet.
> 
> You can try the pdftotext command-line tool; it is not what Okular uses, but
> it is known to be good enough in some cases.
> 
> Albert
> 
> > 
> > Filippo
> _______________________________________________
> Okular-devel mailing list
> Okular-devel at kde.org
> https://mail.kde.org/mailman/listinfo/okular-devel

