[Okular-devel] Regarding okular generators

Albert Astals Cid aacid at kde.org
Sun Dec 30 14:28:07 UTC 2012


On Sunday, 30 December 2012, at 01:30:49, Jaydeep Solanki wrote:
> On Sun, Dec 30, 2012 at 12:50 AM, Albert Astals Cid <aacid at kde.org> wrote:
> > On Sunday, 30 December 2012, at 00:30:41, Jaydeep Solanki wrote:
> > > I'm aware that the Textpage algorithms make text selection work properly
> > > in column layouts regardless of the generator.
> > > I'm not asking this for Okular; I'm asking for personal use. While trying
> > > out Poppler, I made a small app that can currently open PDF files. While
> > > working on text selection in column layouts, I thought that if all the
> > > libraries generate text in the correct order, then there is no need to
> > > code an algorithm to arrange the text. In the future I'm planning to add
> > > support for other formats, so I need to know whether the algorithm is
> > > needed.
> > 
> > The thing is, why write the column-sorting algorithm in each and every one
> > of the libraries if you can have it in just one place?
> 
> Yes, you are absolutely correct, but I was thinking of omitting the algorithm
> completely, because if all the libraries generate text in the proper order,
> that order can be used for the selection instead of an algorithm.
> I read somewhere on the internet that OCRopus is used in Okular; I haven't
> confirmed it (correct me if I'm wrong).

You are wrong, don't trust the internet ;-)

Cheers,
  Albert

> OCRopus uses image processing, which consumes a large amount of computing
> resources.
> So the core reason to omit the algorithm is to save that computation.
> 
> > Cheers,
> > 
> >   Albert
> >   
> > > Jaydeep
> > > 
> > > On Sat, Dec 29, 2012 at 11:22 PM, Albert Astals Cid <aacid at kde.org> wrote:
> > > > On Saturday, 29 December 2012, at 22:56:12, Jaydeep Solanki wrote:
> > > > > As you may know, Poppler::Page::textList() generates text in the
> > > > > correct order (i.e. left to right). Poppler not only generates it in
> > > > > the correct order, it also considers the layout: for example, in a
> > > > > two-column document it follows the column layout while indexing the
> > > > > text.
> > > > > 
> > > > > [image: Inline image 2]
> > > > > 
> > > > > Just as an example, consider the image above and look at the
> > > > > selection. Poppler doesn't generate textList() in that order; it
> > > > > generates textList() as shown in the image below.
> > > > > 
> > > > > [image: Inline image 3]
> > > > > 
> > > > > So my question is: do all the libraries that Okular's generators use
> > > > > generate text in the proper order, considering the layout?
> > > > 
> > > > That doesn't matter; the Textpage algorithms "should" correctly arrange
> > > > text in columns (as correctly as the algorithm in there works).
> > > > 
> > > > Have you found any particular problem?
> > > > 
> > > > Albert
> > > > _______________________________________________
> > > > Okular-devel mailing list
> > > > Okular-devel at kde.org
> > > > https://mail.kde.org/mailman/listinfo/okular-devel
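
For readers following the thread, here is a minimal sketch of what a
library-independent column-ordering step could look like. It is not Okular's
Textpage algorithm and it does not call Poppler; the names Box,
split_into_columns, reading_order and min_gap are hypothetical and exist only
for this illustration. It assumes each word comes with a bounding box, that y
grows downward, that words on one line share the same top coordinate, and that
columns are separated by an empty vertical gutter at least min_gap wide.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """One word with its bounding box (hypothetical type; y grows downward)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def split_into_columns(boxes, min_gap):
    """Group boxes into columns separated by a horizontal gap >= min_gap."""
    columns = []
    column_right = None  # right edge of the column being built
    for box in sorted(boxes, key=lambda b: b.left):
        if column_right is None or box.left - column_right >= min_gap:
            columns.append([])  # gutter found: start a new column
            column_right = box.right
        else:
            column_right = max(column_right, box.right)
        columns[-1].append(box)
    return columns


def reading_order(boxes, min_gap=20.0):
    """Return boxes column by column; within a column, top to bottom, then left to right."""
    ordered = []
    for column in split_into_columns(boxes, min_gap):
        ordered.extend(sorted(column, key=lambda b: (b.top, b.left)))
    return ordered
```

Because the function only needs word boxes, the same step can sit behind any
backend that reports positions, whatever order that backend emits the words
in. A heading that spans the full page width would merge all columns into one
here, so a real implementation needs more than this.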

