OCR using Tesseract 0.1

Donatas Glodenis dgvirtual at akl.lt
Tue Mar 9 12:29:13 CET 2010

Name: OCR using Tesseract
Version: 0.1
Type: KDE Service Menu
Depend: KDE 4.x
License: GPL
More Info:

This Dolphin/Konqueror service menu is compatible
with KDE 4 and lets you OCR documents conveniently
from your file manager.

This is a very simple program. It OCRs a document
and puts the text into a file that has the same
name as the OCRed image file but with a txt
extension.
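
The naming rule described above can be sketched in
shell roughly as follows. This is only an
illustration, not the code from the archive: the
function names txt_name_for and ocr_tif are made up
here, and the default language "eng" is an
assumption. It relies on the fact that tesseract
takes an output base name and adds ".txt" itself.

```shell
# Hypothetical helper: name of the text file for a given image,
# i.e. the same name with the last extension replaced by "txt".
txt_name_for() {
    printf '%s\n' "${1%.*}.txt"
}

# Hypothetical helper: OCR one tif file.
# tesseract is called as: tesseract IMAGE OUTPUTBASE -l LANG
# and writes OUTPUTBASE.txt. The default language is an assumption.
ocr_tif() {
    image=$1
    lang=${2:-eng}
    tesseract "$image" "${image%.*}" -l "$lang"
}
```

With these definitions, "ocr_tif foobar.tif" would
write its result to the name printed by
"txt_name_for foobar.tif".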

For the menu to be visible and have basic
functionality (OCR tif files) you have to have
tesseract-ocr installed and in your path, as well
as the desired language packages. (The menu is
tested against tesseract-ocr v. 2.03 and 2.04).

To be able to OCR png and jpeg images you have to
have imagemagick installed. To be able to OCR pdf
files you have to have ghostscript installed.
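
A quick way to see which of these programs are
missing is sketched below. It is not part of the
menu itself; the function name missing_tools is
made up, and the command names tesseract, convert
(imagemagick) and gs (ghostscript) are the usual
ones but may differ on your distribution.

```shell
# Hypothetical helper: print the name of every listed command
# that cannot be found in PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Possible use (command names are assumptions, see above):
#   missing_tools tesseract convert gs
```

An empty result means that everything the menu
needs was found.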

INSTALLATION: see file readme.txt in the archive. 


 – The menu cannot handle filenames with spaces
   (though it tolerates directory names with
   spaces). No warning is given.

 – If the working directory contains a file with
   the same name as the file to be OCRed but with
   the extension "tif" or "txt", it will be
   overwritten or deleted (e.g., if the file to be
   OCRed is named foobar.tif, foobar.txt will be
   overwritten; in the case of foobar.tiff,
   foobar.png or foobar.jpg, foobar.tif will be
   deleted and foobar.txt overwritten). No warning
   is given.

 – Uppercase extensions (like JPG or PNG) are not
   supported, and produce a warning that the
   script does not handle these types of files.
   The long jpg extension "jpeg" is not supported
   either.
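
For anyone thinking about a patch, the extension
and overwrite problems listed above could be
approached along these lines. This is a sketch
under assumptions, not the current script: the
function names file_kind and clobber_targets are
made up, and the grouping into tif / image / pdf
simply follows the description above. Quoting
every "$1" is also what would make filenames with
spaces work.

```shell
# Hypothetical helper: classify a file by its extension, ignoring
# case. Prints one of: tif, image, pdf, unsupported.
file_kind() {
    case "$1" in
        *.*) ext=${1##*.} ;;
        *)   ext= ;;
    esac
    # Lowercase the extension so that JPG, Png etc. are accepted.
    ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        tif)               echo tif ;;
        tiff|png|jpg|jpeg) echo image ;;
        pdf)               echo pdf ;;
        *)                 echo unsupported ;;
    esac
}

# Hypothetical helper: list existing files that OCRing "$1" would
# overwrite or delete (same name with a tif or txt extension).
clobber_targets() {
    base=${1%.*}
    for candidate in "$base.tif" "$base.txt"; do
        # The input file itself is not at risk.
        [ "$candidate" = "$1" ] && continue
        [ -e "$candidate" ] && printf '%s\n' "$candidate"
    done
    return 0
}
```

A patched script could refuse to run, or ask
first, whenever clobber_targets prints anything.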

I am afraid I will not be spending more time on
this menu to solve these problems myself (I have
already surpassed myself in bash when writing
this script), but I will gladly incorporate any
patches that anyone sends me or posts here.

