Installing OCRopus 0.3

OCRopus 0.4+ is now available.

These notes are based on the instructions for installing OCRopus 0.3 and were written in the following environment:

  • OCRopus 0.3
  • iulib 0.3
  • Ubuntu 8.10 (Intrepid Ibex)

Before you Begin

Make sure your Ubuntu installation has the following components, which are not installed by default:

  • g++
  • scons
  • svn
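
A quick way to see which of these tools are still missing is to ask the shell whether each command is on the PATH. The sketch below uses a small helper, missing_tools, which is an illustrative name of our own and not part of any package. Note that on Ubuntu the svn client is provided by the subversion package.

```shell
# missing_tools is a hypothetical helper: it prints every named command
# that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing_tools g++ scons svn

# Install whatever was reported (svn is provided by the subversion package):
#   sudo apt-get install g++ scons subversion
```

If the helper prints nothing, all three tools are already available.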

If you have problems downloading any of the required packages or libraries, try changing the server that apt-get or Synaptic uses. In our experience, mirror.csclub.uwaterloo.ca/ubuntu worked well.

The easiest way to change the download server is to open Synaptic Package Manager and change the repository location. Remember to reload the package list afterwards so that the change takes effect.
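
If you prefer the command line, the same change can be sketched with sed. This assumes your sources list currently points at archive.ubuntu.com or a country mirror such as ca.archive.ubuntu.com. The helper name switch_mirror is illustrative, and the path in the usage comment is the usual Ubuntu location rather than something taken from the OCRopus instructions.

```shell
# switch_mirror is a hypothetical helper: it rewrites archive.ubuntu.com
# entries in the given sources file to the mirror mentioned above and
# keeps the original as FILE.bak.
switch_mirror() {
    sed -i.bak \
        's|http://[a-z.]*archive\.ubuntu\.com/ubuntu|http://mirror.csclub.uwaterloo.ca/ubuntu|' \
        "$1"
}

# Typical use, as root, followed by a package list refresh:
#   switch_mirror /etc/apt/sources.list
#   apt-get update
```

Entries that point at other hosts are left untouched, and the .bak copy makes it easy to revert.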

1. Installing iulib

  • Download the iulib 0.3 package from iulib's Google Code page: http://code.google.com/p/iulib/
  • Install any missing libraries: sudo apt-get install libpng12-dev libjpeg62-dev libtiff4-dev libavcodec-dev libavformat-dev libsdl-gfx1.2-dev libsdl-image1.2-dev
    • If you get errors downloading any of these libraries, change your package download server. See the note under "Before you Begin".
  • Run: sudo scons install (this helps avoid an error in the next step; scons must be installed)
  • Run: sudo make install

You may get the following error:

./vidio/vidio.cc:484 error: cannot convert 'ByteIOContext**' to 'ByteIOContext*' for argument '1' to 'int url_fclose(ByteIOContext*)'

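To fix this, change line 484 of vidio/vidio.cc from url_fclose(&oc->pb); to url_fclose(oc->pb); As the error message shows, oc->pb is already a ByteIOContext*, so taking its address produces the wrong type.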
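
The same edit can be applied from the command line. This is a minimal sketch that assumes you are in the top-level iulib source directory and that the offending call appears exactly as quoted in the error above. The helper name fix_url_fclose is our own.

```shell
# fix_url_fclose is a hypothetical helper: it drops the stray address-of
# operator in the url_fclose call and keeps the original as FILE.orig.
fix_url_fclose() {
    sed -i.orig 's/url_fclose(&oc->pb);/url_fclose(oc->pb);/' "$1"
}

# Apply it only when run from the top of the iulib source tree.
if [ -f vidio/vidio.cc ]; then
    fix_url_fclose vidio/vidio.cc
fi
```

After patching, rerun the install step that failed.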

2. Installing Tesseract

svn co http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr
cd tesseract-ocr
./configure
make
sudo make install      

3. Installing OCRopus

Following the instructions at http://sites.google.com/site/ocropus/install-0-3, run:

./configure --without-fst --without-leptonica
make
sudo make install

The OCRopus instructions only tell you to run make install, but you should use sudo make install, because writing to system directories normally requires root privileges.

  • Note: running configure with the --without-SDL flag causes a link error during make:
/home/leviticus/ocropus/iulib/utils/dgraphics.cc:154: undefined reference to `SDL_FillRect'
/home/leviticus/ocropus/iulib/utils/dgraphics.cc:155: undefined reference to `SDL_UpdateRect'
/home/leviticus/ocropus/iulib/utils/dgraphics.cc:157: undefined reference to `SDL_UpdateRect'
collect2: ld returned 1 exit status
make[1]: *** [ocroscript] Error 1
make[1]: Leaving directory `/home/leviticus/ocropus/ocropus-0.3/ocroscript'
make: *** [all-recursive] Error 1

Therefore, do not use the --without-SDL flag.

4. Running OCRopus

Run at the command line:
ocroscript recognize data/pages/alive_1.png

This prints an HTML document containing the recognized text to stdout. Pipe it to Firefox, or redirect the output to a file and open that file in a browser.
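
For the redirect option, a small wrapper can save the output next to the input image. This is only a sketch: recognize_to_html is an illustrative helper of our own, and the firefox command in the usage comment assumes Firefox is installed and on the PATH.

```shell
# recognize_to_html is a hypothetical helper: it runs ocroscript on one
# image, writes the HTML next to it, and prints the output file name.
recognize_to_html() {
    image=$1
    html=${image%.*}.html
    ocroscript recognize "$image" > "$html" && printf '%s\n' "$html"
}

# Typical use from the OCRopus source directory:
#   firefox "$(recognize_to_html data/pages/alive_1.png)"
```

With the sample page above, the HTML is written to data/pages/alive_1.html.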