Post Decapod 0.4 examination and notes
0.4 functionality
Independent left-right (or L-R) remote capture from a set of supported cameras
L-R page ordering and rotation correction
L-R image stitching and binarization (same process?)
thumbnail generation
single image delete and reorder
output to type 1 image PDF
Testing genpdf
Using Aug 2, 2011 tip functionality
Observations:
automatic deskew correction if conditions are right
not sure what conditions need to exist for deskew to occur
does not happen with images -> possibly related to detected page geometry?
Type 2 and 3 PDF generation is functional. Detected text is not completely accurate (is this expected?).
binary version stored in Book Store
given a set of images that are handwritten with some colour (example), Type 2 PDF comes out greyscale with no selectable text (example).
given a set of strictly colour images, producing a Type 1 PDF gives:
decapod-genpdfEXP.py produces an empty PDF.
decapod-genpdf.py produces a greyscale PDF.
content of output PDFs is fit to A4 by default.
For type 3 PDF, genpdf automatically reverts to type 2 if character segmentation fails.
For type 1 PDF, binary and paragraph map generation, and line segmentation are always performed.
As a consequence, images with no detectable text still go through lengthy processing.
Some PDF metadata generated - i.e. PDF author is "DECAPOD GenPDF", Producer is "ReportLab http://www.reportlab.com"
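The type 3 to type 2 fallback noted above can be sketched as follows. This is only an illustration of the observed behaviour, not genpdf's actual code: `choose_pdf_type`, `SegmentationError` and the `segment_characters` callable are hypothetical names invented for the sketch.

```python
class SegmentationError(Exception):
    """Hypothetical error raised when character segmentation fails."""


def choose_pdf_type(requested_type, segment_characters):
    """Return the PDF type that would actually be produced.

    requested_type: 1, 2 or 3, as numbered in these notes.
    segment_characters: hypothetical callable standing in for the
        character segmentation step; raises SegmentationError on failure.
    """
    if requested_type not in (1, 2, 3):
        raise ValueError("unknown PDF type: %r" % (requested_type,))
    if requested_type != 3:
        # Types 1 and 2 do not depend on character segmentation here.
        return requested_type
    try:
        segment_characters()
    except SegmentationError:
        # Observed behaviour: type 3 reverts to type 2.
        return 2
    return 3
```

Note that this sketch does not model the other observation, that type 1 still runs binary and paragraph map generation and line segmentation.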
Future Improvements & Questions
Colour output option: binary, greyscale, and colour.
Straight Image PDF generation option (no segmentation requested).
Option to disable page & text segmentation.
PDF metadata options (user can specify their own metadata).
Ability to disable auto-deskew (so deskew can happen earlier in the pipeline without being attempted twice).
Does upgrading from Ocropus 0.4.4 improve text results? What work will we need to do to support a newer version of Ocropus?
Is there a fail-quick method of detecting text on a page?
Example Test: Type 2 PDF Test - "Computer Generated 600DPI document"
Running: Tip version of Decapod with Ocropus 0.4.4
Command run: decapod-genpdfEXP.py
using first 10 images of the Nuforc computer generated document.
Virtual machine spec: Intel 2.8GHz quad-core, 2GB RAM
PDF Generation took 10s.
Whole process took 372s.
Example Test: Type 3 PDF Test - "Computer Generated 600DPI document"
Running: Tip version of Decapod with Ocropus 0.4.4
Command run: decapod-genpdfEXP.py
using first 10 images of the Nuforc computer generated document.
Virtual machine spec: Intel 2.8GHz quad-core, 2GB RAM
PDF Generation took 25s.
Whole process took 1097s.
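To put the two test runs side by side, the arithmetic below works out seconds per page and how much of each run is the PDF generation step. The numbers are the ones recorded above; the dictionary layout and the `summarise` helper are just for this sketch.

```python
# Timings copied from the two example tests above (10 images each).
runs = {
    "type 2": {"pages": 10, "pdf_seconds": 10, "total_seconds": 372},
    "type 3": {"pages": 10, "pdf_seconds": 25, "total_seconds": 1097},
}


def summarise(run):
    """Return (seconds per page, PDF step as a percentage of the whole run)."""
    per_page = run["total_seconds"] / run["pages"]
    pdf_share = 100.0 * run["pdf_seconds"] / run["total_seconds"]
    return round(per_page, 1), round(pdf_share, 1)


for name, run in runs.items():
    per_page, pdf_share = summarise(run)
    print(f"{name}: {per_page} s/page overall, PDF step is {pdf_share}% of the run")
```

In both runs the PDF generation step is under 3% of the total time, so nearly all of the time goes to whatever runs before it. Type 3 takes roughly three times as long per page as type 2.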
Still to do
More testing of:
text detection / recognition
Font generation (type 3 PDF)
type 2 generation
determining the ideal quality of input image to get best results from genpdf.
changing character model may improve results?
General workflow
Acquire images: remote capture, or file import.
Page management and QA
Output files
1. Acquire images
Remote Capture
3 modes of remote capture: structured light, stereo, and L-R.
Each mode has its own calibration scheme.
Each 3D capture mode has its own dewarp scheme.
Import from file system
Reasonable assumptions?
Files numbered in sequence.
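If files are assumed to be numbered in sequence, import order needs a numeric-aware sort, since a plain string sort puts "page10" before "page2". A minimal sketch, with hypothetical helper names and example filenames:

```python
import re


def natural_key(filename):
    """Build a sort key that compares digit runs as integers.

    re.split with a capturing group alternates text and digit chunks,
    so odd positions are always digit runs.
    """
    parts = re.split(r"(\d+)", filename)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def order_for_import(filenames):
    """Return filenames in the order a person would expect from the numbering."""
    return sorted(filenames, key=natural_key)


print(order_for_import(["page10.jpg", "page2.jpg", "page1.jpg"]))
```

This does not handle gaps or duplicate numbers in the sequence; whether those should be flagged to the user is an open question.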
Questions
What if images are different resolutions? Unpredictable segmentation results.
What if pages are inconsistent sizes? Unpredictable segmentation results.
What if images contain a mix of page spreads and single pages? Okay if resolution and page sizes are consistent.
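Since mixed resolutions or page sizes lead to unpredictable segmentation, one option is to flag the odd ones out at import time. The sketch below assumes the width, height and DPI of each image have already been read by some other means; `ImageInfo` and `find_outliers` are hypothetical names.

```python
from collections import Counter, namedtuple

# Hypothetical record of what was read from each imported file.
ImageInfo = namedtuple("ImageInfo", "name width height dpi")


def find_outliers(images):
    """Return images whose (width, height, dpi) differs from the most common one."""
    if not images:
        return []
    shapes = Counter((img.width, img.height, img.dpi) for img in images)
    expected, _count = shapes.most_common(1)[0]
    return [img for img in images
            if (img.width, img.height, img.dpi) != expected]
```

An exact match on pixel size is probably too strict for camera captures that get cropped by hand; a tolerance would be needed in practice.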
2. Page Management
During page management, present thumbnails and images as they would appear on output to file.
Image with text case:
automatically deskew image
automatically binarize image? Display images/thumbnails in binary? Or keep them in colour and leave any colour processing to an export "preview"?
Image with no recognizable text case:
manual deskew correction for image
Global functions
automatically 3D dewarp pages if necessary
Crop image(s)
Delete image(s)
Reorder image(s)
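Delete and reorder can be thought of as plain operations on an ordered list of pages. A minimal sketch, assuming pages are held in a Python list and that both functions return a new list rather than changing the original (the function names are made up here):

```python
def delete_pages(pages, indexes):
    """Return a copy of pages without the entries at the given positions."""
    drop = set(indexes)
    return [page for i, page in enumerate(pages) if i not in drop]


def move_page(pages, src, dst):
    """Return a copy of pages with the entry at src moved to position dst."""
    reordered = list(pages)
    reordered.insert(dst, reordered.pop(src))
    return reordered
```

Returning copies keeps the previous order around, which would make an undo step straightforward if one is wanted.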
3. Output Files
choose colour depth of output: binary or colour (original)
choose PDF format: image, overlaid, or scalable.
Any way to fail quickly at page segmentation? - this way Decapod can switch to image PDF output if it doesn't detect text.
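The two output choices and the proposed switch to image PDF could be modelled roughly as below. This is a sketch of the idea only: the enum and function names are hypothetical, and the `text_detected` flag stands in for a fail-quick text check that does not exist yet. The formats image, overlaid and scalable presumably correspond to PDF types 1, 2 and 3, but that mapping is an assumption.

```python
from enum import Enum


class ColourDepth(Enum):
    BINARY = "binary"
    COLOUR = "colour"  # original colour


class PdfFormat(Enum):
    IMAGE = "image"
    OVERLAID = "overlaid"
    SCALABLE = "scalable"


def resolve_format(requested, text_detected):
    """Return the format to generate, given the result of a quick text check.

    If no text is detected, fall back to an image PDF so that the
    expensive segmentation steps can be skipped.
    """
    return requested if text_detected else PdfFormat.IMAGE
```

For example, `resolve_format(PdfFormat.SCALABLE, text_detected=False)` gives `PdfFormat.IMAGE`, while a page with detected text keeps whatever format was requested.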