OCRopus 0.4+ is now available.
The following notes are based on the instructions for installing OCRopus 0.3. They were written in the following environment:
- OCRopus 0.3
- iulib 0.3
- Ubuntu 8.10 (Intrepid Ibex)
Before you Begin
Make sure your Ubuntu installation has the following components installed (not installed by default):
- g++
- scons
- svn
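As a quick check before starting, the following sketch reports which of the three tools are already on the PATH. The helper name check_tools is made up for this example. The apt-get line in the closing comment assumes the standard Ubuntu package names, with the svn client shipped in the subversion package.

```shell
# Report, for each tool name given, whether it can be found on the PATH.
# check_tools is a hypothetical helper, not part of any package.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}

check_tools g++ scons svn

# To install anything reported missing (assumed package names):
# sudo apt-get install g++ scons subversion
```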
If you have problems downloading any of the required packages or libraries, try changing the server that apt-get or Synaptic uses. In our experience, mirror.csclub.uwaterloo.ca/ubuntu worked well.
The easiest way to change the download server is to open Synaptic Package Manager and change the repository location. Remember to reload the package list so that the change takes effect.
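If you prefer the command line, the same change can be sketched as an edit to the apt sources file. This assumes your sources file points at an archive.ubuntu.com host (for example ca.archive.ubuntu.com). The helper name switch_mirror is hypothetical, and security.ubuntu.com entries are deliberately left alone.

```shell
# Rewrite archive.ubuntu.com entries in the given sources file to use the
# Waterloo mirror, keeping the original as FILE.bak.
# Assumption: entries look like http://<country>.archive.ubuntu.com/ubuntu
switch_mirror() {
    sed -i.bak 's|http://[a-z0-9.]*archive\.ubuntu\.com/ubuntu|http://mirror.csclub.uwaterloo.ca/ubuntu|g' "$1"
}

# Usage: work on a copy first, review it, then install it and refresh:
# cp /etc/apt/sources.list /tmp/sources.list
# switch_mirror /tmp/sources.list
# sudo cp /tmp/sources.list /etc/apt/sources.list
# sudo apt-get update
```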
1. Installing iulib
- Download iulib 0.3 package from iulib's google code page. http://code.google.com/p/iulib/
- Install any missing libraries. Run:
sudo apt-get install libpng12-dev libjpeg62-dev libtiff4-dev libavcodec-dev libavformat-dev libsdl-gfx1.2-dev libsdl-image1.2-dev
- If you get errors downloading any of these libraries, change your package download server. See the note at the top of this document.
- Run:
sudo scons install
(This will help avoid an error in the next step. Make sure to have scons installed.)
- Run:
sudo make install
You may get the following error:
./vidio/vidio.cc:484 error: cannot convert 'ByteIOContext**' to 'ByteIOContext*' for argument '1' to 'int url_fclose(ByteIOContext*)'
To fix this, change line 484 of vidio/vidio.cc from
url_fclose(&oc->pb);
to
url_fclose(oc->pb);
- For more information, see this post here: http://code.google.com/p/iulib/issues/detail?id=2
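The one-line fix above can also be applied with sed instead of a text editor. This is a minimal sketch: the helper name fix_url_fclose is invented for illustration, and it matches on the call text rather than on line 484, on the assumption that the line number may differ between checkouts.

```shell
# Replace url_fclose(&oc->pb) with url_fclose(oc->pb) in the given file,
# keeping the original as FILE.bak.
# fix_url_fclose is a hypothetical helper name.
fix_url_fclose() {
    sed -i.bak 's/url_fclose(&oc->pb)/url_fclose(oc->pb)/' "$1"
}

# Usage, from the top of the iulib source tree:
# fix_url_fclose vidio/vidio.cc
```

After running it, repeat the sudo make install step.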
2. Installing Tesseract
- Make sure to have SVN installed prior to this step.
- Install Tesseract according to the instructions in the OCRopus install guide (http://sites.google.com/site/ocropus/install-0-3):
svn co http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr
cd tesseract-ocr
./configure
make
sudo make install
3. Installing OCRopus
Follow the instructions listed here: http://sites.google.com/site/ocropus/install-0-3
./configure --without-fst --without-leptonica
make
sudo make install
The OCRopus instructions only tell you to run make install, but you should use sudo make install to install properly.
- Note: running configure with the --without-SDL flag will cause an error during the make process:
/home/leviticus/ocropus/iulib/utils/dgraphics.cc:154: undefined reference to `SDL_FillRect'
/home/leviticus/ocropus/iulib/utils/dgraphics.cc:155: undefined reference to `SDL_UpdateRect'
/home/leviticus/ocropus/iulib/utils/dgraphics.cc:157: undefined reference to `SDL_UpdateRect'
collect2: ld returned 1 exit status
make[1]: *** [ocroscript] Error 1
make[1]: Leaving directory `/home/leviticus/ocropus/ocropus-0.3/ocroscript'
make: *** [all-recursive] Error 1
Therefore, make sure not to use the --without-SDL flag.
4. Running OCRopus
Run at the command line:
ocroscript recognize data/pages/alive_1.png
This prints an HTML document containing the text conversion of the image to stdout. Pipe the output to Firefox, or redirect it to a file and open that file in a browser.
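The redirect-and-open approach might look like the sketch below. The helper name recognize_to_html and the output file name alive_1.html are assumptions chosen for this example. The ocroscript invocation itself is the one shown above.

```shell
# Run OCRopus on an image and save the HTML it prints on stdout to a file.
# recognize_to_html is a hypothetical helper: $1 is the input image,
# $2 is the HTML file to write.
recognize_to_html() {
    ocroscript recognize "$1" > "$2"
}

# Usage:
# recognize_to_html data/pages/alive_1.png alive_1.html
# firefox alive_1.html
```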