Decapod 0.4 - GenPDF Testing (jhung)

Gen PDF Testing

1. Generate Image PDF - Source Images: Original colour, no post-processing

  • genpdf version: 34:e46d1748a910
  • ocropus version: 0.4.4 tag
  • input files: 13 JPEG files.
  • Command Executed: decapod-genpdf.py -d ./temp/ -p ./DecapodExport.pdf -v 1 _JGH5058.JPG _JGH5059.JPG _JGH5060.JPG _JGH5061.JPG _JGH5062.JPG _JGH5063.JPG _JGH5064.JPG _JGH5065.JPG _JGH5066.JPG _JGH5067.JPG _JGH5068.JPG _JGH5069.JPG _JGH5070.JPG
  • Output: Greyscale PDF with each input image on a separate page. The top half of each page is blank. Input image dimensions: 3872x2592.
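
The commands in this log list every page by hand. For repeat runs, a minimal Python sketch like the one below could assemble the same argument list. It uses only the flags that appear in this log (-d, -p, -v, and -t from the OCR run); the function name and the pdf_type parameter name are hypothetical, and the sketch only builds the command, it does not run it.

```python
def build_genpdf_command(images, temp_dir="./temp/", pdf_path="./DecapodExport.pdf",
                         verbosity=1, pdf_type=None):
    """Return the argv list for decapod-genpdf.py without executing it.

    pdf_type is a hypothetical name for the value passed to -t; this log
    only shows "-t 2", used for the OCR PDF run.
    """
    cmd = ["decapod-genpdf.py", "-d", temp_dir, "-p", pdf_path]
    if pdf_type is not None:
        cmd += ["-t", str(pdf_type)]
    cmd += ["-v", str(verbosity)]
    # Sort so pages are passed in file-name order.
    cmd += sorted(images)
    return cmd


# The 13 source images of test 1: _JGH5058.JPG through _JGH5070.JPG.
pages = ["_JGH%d.JPG" % n for n in range(5058, 5071)]
print(" ".join(build_genpdf_command(pages)))
```

The returned list can be handed to subprocess.call if the script is on the PATH; passing pdf_type=2 reproduces the flag order used in the OCR run.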

2. Generate Image PDF - Source Images: Original colour, pre-rotated and cropped

  • genpdf version: 34:e46d1748a910
  • ocropus version: 0.4.4 tag
  • input files: 13 JPEG files.
  • Command Executed: decapod-genpdf.py -d ./temp/ -p ./DecapodExport.pdf -v 1 01.jpg 02.jpg 03.jpg 04.jpg 05.jpg 06.jpg 07.jpg 08.jpg 09.jpg 10.jpg 11.jpg 12.jpg 13.jpg
  • Output: Greyscale PDF. Each input image fills roughly the lower 2/3 of the page. Input image dimensions: 2272x2223.
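
One possible explanation for the blank area in tests 1 and 2, not confirmed by anything in this log, is that genpdf scales each image to the page width and anchors it at the bottom of a portrait page. The sketch below checks that hypothesis against the two reported image sizes. The A4 page size and the fit-to-width behaviour are both assumptions.

```python
# Assumption: A4 portrait page, dimensions in millimetres.
PAGE_W_MM, PAGE_H_MM = 210.0, 297.0


def filled_fraction(img_w_px, img_h_px, page_w=PAGE_W_MM, page_h=PAGE_H_MM):
    """Fraction of the page height covered when the image is scaled to page width."""
    scaled_h = page_w * img_h_px / img_w_px
    return min(scaled_h / page_h, 1.0)


for name, (w, h) in [("test 1", (3872, 2592)), ("test 2", (2272, 2223))]:
    pct = 100 * filled_fraction(w, h)
    print("%s: image covers %.0f%% of the page height" % (name, pct))
```

Under these assumptions the images cover about 47% and 69% of the page height, which is consistent with "top half blank" and "lower 2/3" as observed, but other page sizes or scaling rules could produce similar results.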

3. Generate Image PDF - Source Images: Pre-Binarized, and pre-rotated

  • genpdf version: 34:e46d1748a910
  • ocropus version: 0.4.4 tag
  • input files: 13 binarized PNG files. (see attached ZIP)
  • Command Executed (Image PDF): decapod-genpdf.py -d ./temp/ -p ./DecapodExport.pdf -v 1 01.png 02.png 03.png 04.png 05.png 06.png 07.png 08.png 09.png 10.png 11.png 12.png 13.png
  • Output: Binarized PDF file with text occupying the lower 2/3 of the page. See attached ZIP.

4. Generate OCR PDF - Source Images: Pre-Binarized, and pre-rotated

  • genpdf version: 34:e46d1748a910
  • ocropus version: 0.4.4 tag
  • input files: 13 binarized PNG files. (see attached ZIP)
  • Command Executed (OCR Text): decapod-genpdf.py -d ./temp/ -p ./DecapodExport.pdf -t 2 -v 1 01.png 02.png 03.png 04.png 05.png 06.png 07.png 08.png 09.png 10.png 11.png 12.png 13.png
  • Output: Error. Did not produce a PDF. See attached TXT. The console output ended with:

    outputing seg2bbox info for page 12
    outputing seg2bbox info for page 13
    ./temp//0004.pseg.txt
    [warn] file could not be opened:  ./temp//0004/010001.cseg.txt
    [Error] PDF generation did not work as expected! (['ocro2pdf.py', '-d', './temp/', '-p', './DecapodExport-ocr.pdf', '-t', '2', '-v', '1'])
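
The only hint about the cause is the "[warn] file could not be opened:" line that precedes the final error. When comparing the attached TXT across runs, a small sketch like the following could pull those paths out of a saved log. The function name is hypothetical, and the pattern is based solely on the warning format shown above; other warning formats would not be matched.

```python
import posixpath
import re

# Matches the warning format seen in this log; other formats are not covered.
WARN_RE = re.compile(r"^\[warn\] file could not be opened:\s*(\S+)\s*$")


def missing_files(log_text):
    """Return normalised paths from '[warn] file could not be opened:' lines."""
    paths = []
    for line in log_text.splitlines():
        match = WARN_RE.match(line.strip())
        if match:
            # normpath collapses the doubled slash in "./temp//0004/...".
            paths.append(posixpath.normpath(match.group(1)))
    return paths


sample = """\
outputing seg2bbox info for page 13
./temp//0004.pseg.txt
[warn] file could not be opened:  ./temp//0004/010001.cseg.txt
"""
print(missing_files(sample))
```

For the excerpt above this reports a single missing file, the cseg text file for line 010001 of page 0004.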