[Okular-devel] [okular] [Bug 342504] Add possibility to copy formulas as MATHML/Latex Math/OO Math
Yuri Chornoivan
yurchor at ukr.net
Mon Jan 5 20:36:31 UTC 2015
https://bugs.kde.org/show_bug.cgi?id=342504
Yuri Chornoivan <yurchor at ukr.net> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |yurchor at ukr.net
--- Comment #2 from Yuri Chornoivan <yurchor at ukr.net> ---
(In reply to Christoph Feck from comment #1)
> Is there any other software able to extract formulas from PDF? To me it
> looks like a very hard problem, as soon as the formulas use multiple levels
> of text (fractions etc.)
MaxTract (development discontinued) can extract formulas directly from the PDF.
http://www.cs.bham.ac.uk/research/groupings/reasoning/sdag/maxtract.php
InftyReader can do it using OCR.
Some thoughts on the problem can be found here (my tests confirm the
conclusions of this paper, and nothing seems to have changed since 2011):
http://www.cs.bham.ac.uk/~aps/research/papers/pdf/BaSeSoSu-ICDAR11-ComparingApproachesToMathematicalDocumentAnalysisFromPDF.pdf
IMHO, it is hard to expect that free OCR engines like Ocropus/Tesseract can
solve the problem in the near future. At least, I failed to train Tesseract
to recognize even rather simple formulas.