[Okular-devel] OCR Tool for Okular

Wed Apr 3 17:18:34 UTC 2013

On Wednesday, 3 April 2013, at 14:20:35, Anıl Özbek wrote:
> Hi,

Hi

> Last week I started writing a simple OCR tool for Okular.
> In general it has received a good response from KDE users [1-3].
> 
> What do you think about adding such a tool to Okular? Is it possible?
> If so, I'd be happy to help as far as I can, though I should say
> that I'm not experienced in KDE/Qt development.
> 
> Currently my code (which is mostly copied and pasted from other
> projects) takes a part of the image from the active document and saves
> it to the OS's temp dir. It then runs a particular OCR app's executable
> (for now only Tesseract) to convert the image to a text file. Finally,
> the code opens the text file and copies its content to the clipboard.
> After all that, the temporary files are deleted.
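
The workflow described above can be sketched roughly as follows. This is
only an illustration in Python, not the actual Okular code: the function
names, the file names inside the temp dir, and the injectable `run`
parameter are assumptions made for the example, and the clipboard step is
left out. It relies on the tesseract command-line form
`tesseract <image> <outputbase> [-l <lang>]`, which writes
`<outputbase>.txt`.

```python
import os
import subprocess
import tempfile


def build_tesseract_command(image_path, output_base, lang=None):
    """Build the argument list for the tesseract command-line tool.

    tesseract writes the recognized text to "<output_base>.txt".
    """
    command = ["tesseract", image_path, output_base]
    if lang:
        command += ["-l", lang]  # e.g. "eng"
    return command


def ocr_image_bytes(image_bytes, lang=None, run=subprocess.run):
    """Save the image to a temp dir, run OCR on it, return the text.

    `run` is a hypothetical hook (defaulting to subprocess.run) so the
    flow can be exercised on a machine without tesseract installed.
    """
    # The temporary directory and the files in it are deleted when the
    # `with` block is left, even if the OCR step fails.
    with tempfile.TemporaryDirectory() as tmp_dir:
        image_path = os.path.join(tmp_dir, "selection.png")
        output_base = os.path.join(tmp_dir, "result")
        with open(image_path, "wb") as image_file:
            image_file.write(image_bytes)
        # check=True raises CalledProcessError on a non-zero exit code.
        run(build_tesseract_command(image_path, output_base, lang), check=True)
        with open(output_base + ".txt", encoding="utf-8") as text_file:
            return text_file.read()
```

The returned string is what would then be copied to the clipboard.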
> 
> I think before going any further it would be better to clarify some
> issues that I encountered.
> 
> 
> API vs Executable
> -------------------
> Which one would be better to use? It's easier to use the executable
> file, but using the API seems the more correct approach. As far as I
> can see, Tesseract [4] and Cuneiform [5] provide APIs, but I don't know
> about other OCR software.
> 
> Maybe instead of trying to support more than one OCR program we
> could just choose a default one, but that would restrict users.
> 
> If we use the API, Okular will link to the OCR libraries, which
> means more dependencies for the Okular package. If we use the
> executable, we can check for it before running it and, if it's not
> installed, show the user an info message along the lines of:
> "additional packages must be installed to use this feature".
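
A minimal sketch of the "check for the executable first" idea, again in
Python for illustration only. The function names and the candidate list
are assumptions; in particular, "cuneiform" as the name of the Cuneiform
binary is assumed here and may differ between distributions.

```python
import shutil

MISSING_MESSAGE = "Additional packages must be installed to use this feature."


def find_ocr_executable(candidates=("tesseract", "cuneiform")):
    """Return the full path of the first OCR executable found on PATH.

    Returns None when none of the candidates is installed.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def ocr_availability_message(candidates=("tesseract", "cuneiform")):
    """Return None if an OCR tool is available, else a message for the user."""
    if find_ocr_executable(candidates):
        return None
    return MISSING_MESSAGE
```

With this approach the OCR programs stay optional at run time instead of
becoming build-time dependencies of the package.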
> 
> If we choose the API way, these [6-9] may help.
> 
> 
> OCR Output's Accuracy
> -----------------------
> OCR performance isn't good enough (at least for comics) for now; the
> success rate is almost 50%. My current code uses the image directly
> from the comic; maybe it would be better to first convert the image to
> black and white or 2-bit and apply some other image operations to make
> the letters clearer. Do you have any suggestions about this?
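
As an illustration of the "convert to black and white first" idea, here
is a small fixed-threshold binarization in plain Python. It is a sketch
under stated assumptions, not a tuned preprocessing step: the image is
assumed to be a list of rows of (r, g, b) tuples with 0-255 channels, the
threshold of 128 is arbitrary, and whether this actually improves OCR
results on comics would have to be measured.

```python
def to_grayscale(pixel):
    """Convert an (r, g, b) pixel to a 0-255 gray level (integer luma weights)."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


def binarize(image, threshold=128):
    """Map every pixel to 0 (black) or 255 (white) using a fixed threshold."""
    return [
        [255 if to_grayscale(pixel) >= threshold else 0 for pixel in row]
        for row in image
    ]


sample = [
    [(0, 0, 0), (255, 255, 255)],
    [(200, 200, 200), (50, 50, 50)],
]
print(binarize(sample))  # [[0, 255], [255, 0]]
```

An adaptive threshold would probably cope better with coloured speech
balloons, but the fixed version keeps the example short.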
> 
> 
> Icon for OCR Tool
> -------------------
> Currently I use the scanner icon from Oxygen [10], but if there is a
> better option we can use it.
> 
> 
> Document Language
> -------------------
> To give the OCR software the correct parameters we must know the
> document language. For now Okular can't determine the language of
> opened documents [11]. Until this feature is implemented, we can add a
> new section for the OCR tool to Okular's configuration, where users can
> select the language for the OCR process as well as which OCR software
> will be used.
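
A user-selected language could be turned into an OCR parameter roughly
like this. The mapping table, the function name and the "eng" fallback
are assumptions made for the example; the three-letter names are the ones
Tesseract uses for its language data and passes via `-l`.

```python
# Hypothetical mapping from two-letter locale codes to Tesseract
# language names; only a few entries are listed for illustration.
TESSERACT_LANGUAGES = {
    "en": "eng",
    "tr": "tur",
    "de": "deu",
    "fr": "fra",
    "es": "spa",
}


def tesseract_language(locale_code, default="eng"):
    """Return the Tesseract language name for a locale such as "tr_TR"."""
    # Keep only the language part of "tr_TR" or "tr-TR".
    short_code = locale_code.replace("-", "_").split("_")[0].lower()
    return TESSERACT_LANGUAGES.get(short_code, default)


print(tesseract_language("tr_TR"))  # tur
```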

Why should this be a part of Okular and not a separate binary? I can imagine
millions of other places I'd like to have OCR in. What's the benefit of it
being Okular-only?

Cheers,
  Albert

> 
> 
> Links
> -------
> [1] http://wklej.org/id/995982/
> [2] http://www.youtube.com/watch?v=duSTyByIPLc
> [3] https://plus.google.com/113435503145887565355/posts/RqzC3hMcGcd
> [4] https://code.google.com/p/tesseract-ocr/
> [5] https://launchpad.net/cuneiform-linux
> [6] https://raw.github.com/ruediger/VobSub2SRT/master/CMakeModules/FindTesseract.cmake
> [7] https://raw.github.com/ck1125/sikuli/master/cmake_modules/FindTesseract.cmake
> [8] https://projects.kde.org/projects/playground/libs/kolena/repository/revisions/master/entry/cmake/modules/FindTesseract.cmake
> [9] https://raw.github.com/uliss/quneiform_tests/master/cmake/FindCuneiform.cmake
> [10] http://i.imgur.com/xn8iyDw.png
> [11] https://bugs.kde.org/show_bug.cgi?id=317486
> 
> 
> Regards,
> --
> Anıl Özbek