0.4 functionality
- Independent left-right (or L-R) remote capture from a set of supported cameras
- L-R page ordering and rotation correction
- L-R image stitching and binarization (same process?)
- thumbnail generation
- single image delete and reorder
- output to type 1 image PDF
Testing genpdf
Using Aug 2, 2011 tip functionality
Observations:
- automatic deskew correction if conditions are right
- not sure what conditions need to exist for deskew to occur
- does not happen with images -> possibly related to detected page geometry?
- type 2 and 3 PDF generation is functional. Detected text is not completely accurate (is this expected?).
- binary version stored in Book Store
- given a set of images that are handwritten with some colour (example), Type 2 PDF comes out greyscale with no selectable text (example).
- given a set of strictly colour images, generating a Type 1 PDF:
- decapod-genpdfEXP.py produces an empty PDF.
- decapod-genpdf.py produces a greyscale PDF.
- content of output PDFs is fit to A4 by default.
- For type 3 PDF, genpdf automatically reverts to type 2 if character segmentation fails.
- For type 1 PDF, binary and paragraph map generation, and line segmentation are always performed.
- As a consequence, images with no detectable text still go through lengthy processing.
- Some PDF metadata generated - i.e. PDF author is "DECAPOD GenPDF", Producer is "ReportLab http://www.reportlab.com"
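The type 3 to type 2 fallback noted above could be sketched roughly as follows. This is only an illustration of the observed behaviour, not genpdf's actual code: `SegmentationError`, `generate_pdf`, and the `builders` mapping are hypothetical names made up for the example.

```python
class SegmentationError(Exception):
    """Hypothetical error raised when character segmentation fails."""


def generate_pdf(pages, pdf_type, builders):
    """Build the requested PDF type, reverting from type 3 to type 2 on failure.

    `builders` maps a PDF type (1, 2 or 3) to a callable that takes the
    pages and returns the generated PDF. It stands in for the real genpdf
    code paths, which are not shown here.
    Returns a (type_actually_used, result) tuple.
    """
    try:
        return pdf_type, builders[pdf_type](pages)
    except SegmentationError:
        if pdf_type != 3:
            raise  # only the type 3 -> type 2 fallback was observed
        return 2, builders[2](pages)
```

Returning the type actually used would let a caller tell the user that the output was downgraded.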
Future Improvements & Questions
- Colour output option: binary, greyscale, and colour.
- Straight Image PDF generation option (no segmentation requested).
- Option to disable page & text segmentation.
- PDF metadata options (user can specify their own metadata).
- Ability to disable auto-deskew (so auto-deskew can happen earlier in the pipeline and not have to attempt deskew twice).
- Does upgrading from Ocropus 0.4.4 improve text results? What work would be needed to support a newer version of Ocropus?
- Is there a fail-quick method of detecting text on a page?
Example Test: Type 2 PDF Test - "Computer Generated 600DPI document"
- Running: Tip version of Decapod with Ocropus 0.4.4
- Command run: decapod-genpdfEXP.py
- using first 10 images of the Nuforc computer generated document.
- Generated PDF
- Virtual machine spec: Intel 2.8GHz quad-core, 2GB RAM
- PDF Generation took 10s.
- Whole process took 372s.
Example Test: Type 3 PDF Test - "Computer Generated 600DPI document"
- Running: Tip version of Decapod with Ocropus 0.4.4
- Command run: decapod-genpdfEXP.py
- using first 10 images of the Nuforc computer generated document.
- Generated PDF
- Virtual machine spec: Intel 2.8GHz quad-core, 2GB RAM
- PDF Generation took 25s.
- Whole process took 1097s.
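To put the two test runs side by side, the timings above can be reduced to a per-page cost and the share of time spent before PDF generation. This is a small arithmetic sketch; it assumes the "whole process" figure includes the PDF generation time and that all 10 images were processed in each run.

```python
def timing_summary(total_s, pdf_s, pages):
    """Return per-page time and the share of time spent before PDF generation."""
    return {
        "seconds_per_page": total_s / pages,
        "pre_pdf_share": (total_s - pdf_s) / total_s,
    }


# Figures taken from the two example tests above.
type2 = timing_summary(total_s=372, pdf_s=10, pages=10)
type3 = timing_summary(total_s=1097, pdf_s=25, pages=10)

for name, t in (("type 2", type2), ("type 3", type3)):
    print(name, round(t["seconds_per_page"], 1), "s/page,",
          round(t["pre_pdf_share"] * 100, 1), "% before PDF generation")
```

Under those assumptions, PDF generation itself is a small part of each run (about 3% or less), so most of the time goes to the steps that precede it.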
Still to do
More testing of:
- text detection / recognition
- Font generation (type 3 PDF)
- type 2 generation
- determining the ideal quality of input image to get best results from genpdf.
- would changing the character model improve results?
General workflow
- Acquire images: remote capture, or file import.
- Page management and QA
- Output files
1. Acquire images
Remote Capture
- 3 modes of remote capture: structured light, stereo, and L-R.
- Each mode has its own calibration scheme.
- Each 3D capture mode has its own dewarp scheme.
Import from file system
- Reasonable assumptions?
- Files numbered in sequence.
- Questions
- What if images are different resolutions? Unpredictable segmentation results.
- What if pages are inconsistent sizes? Unpredictable segmentation results.
- What if images contain a mix of page spreads and single pages? Okay if resolution and page sizes are consistent.
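The import assumptions and questions above could be checked up front with something like the following. It is a minimal sketch: the function names are made up for illustration, and it assumes the sequence number is the last run of digits in each file name and that width, height and DPI for each file have already been read by some other means.

```python
import os
import re
from collections import Counter


def sequence_number(filename):
    """Return the last run of digits in the file name as an int, or None."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    runs = re.findall(r"\d+", stem)
    return int(runs[-1]) if runs else None


def order_import(filenames):
    """Sort files by sequence number; also report unnumbered files and gaps."""
    numbered = [(sequence_number(f), f) for f in filenames]
    unnumbered = [f for n, f in numbered if n is None]
    ordered = sorted((n, f) for n, f in numbered if n is not None)
    numbers = [n for n, _ in ordered]
    gaps = [(a, b) for a, b in zip(numbers, numbers[1:]) if b - a > 1]
    return [f for _, f in ordered], unnumbered, gaps


def inconsistent_pages(page_info):
    """Given filename -> (width_px, height_px, dpi), list files that differ
    from the most common combination."""
    if not page_info:
        return []
    common, _ = Counter(page_info.values()).most_common(1)[0]
    return sorted(f for f, info in page_info.items() if info != common)
```

Files flagged by `inconsistent_pages` are the ones likely to give unpredictable segmentation results, so they could be surfaced to the user before any processing starts.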
2. Page Management
- During page management, present thumbnails and images as they would appear on output to file.
Image with text case:
- automatically deskew image
- automatically binarize image? Display images/thumbnails in binary? Or keep them in colour and leave any colour processing to an export "preview"?
Image with no recognizable text case:
- manual deskew correction for image
Global functions
- automatically 3D dewarp pages if necessary
- Crop image(s)
- Delete image(s)
- Reorder image(s)
3. Output Files
- choose colour depth of output: binary or colour (original)
- choose PDF format: image, overlaid, or scalable.
- Is there any way to fail quickly at page segmentation? That way Decapod can switch to image PDF output if it doesn't detect text.
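One possible cheap pre-check, offered only as an untested idea: look at the horizontal projection profile of a downsampled greyscale page and count bands of inked rows separated by blank rows. Text pages tend to show several such bands, while blank pages and photographs show few. The function name and all thresholds below are assumptions chosen for illustration and are not part of Decapod or Ocropus.

```python
def looks_like_text(gray, dark=128, row_ink=0.02, min_lines=3,
                    min_ink=0.01, max_ink=0.5):
    """Guess whether a greyscale page (list of rows of 0-255 values) holds text.

    Heuristic only: counts bands of consecutive 'inked' rows in the
    horizontal projection profile. All thresholds are illustrative.
    """
    if not gray or not gray[0]:
        return False
    width = len(gray[0])
    # Fraction of dark pixels in each row (the projection profile).
    profile = [sum(1 for v in row if v < dark) / width for row in gray]
    ink = sum(profile) / len(profile)
    if not (min_ink <= ink <= max_ink):
        return False  # nearly blank or mostly dark: unlikely to be text
    bands = 0
    in_band = False
    for p in profile:
        inked = p > row_ink
        if inked and not in_band:
            bands += 1  # a new band of inked rows starts here
        in_band = inked
    return bands >= min_lines
```

If a check like this returned False, the page could be sent straight to image PDF output and skip segmentation. It would need tuning against real captures, and skewed or warped pages may defeat it.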