Decapod Glossary of Technical Terms

The following definitions are within the context of the Decapod Project and are not necessarily general definitions.

Pre Capture

Camera Calibration
The process of positioning a book spread so that it is completely within the field of view of both cameras. Calibration also includes marking the center margin. In the future it may also include capturing colour and greyscale reference cards.

Post Capture Process

1. Dewarping

1. Dewarping

The process of removing the surface distortions in a photographed page. In effect, dewarping creates an image equivalent to one produced by a flatbed scanner, without the destructive properties of a flatbed. See also Affine Dewarping, Stereoscopic Dewarping, and Perspective Dewarping.

  • Affine Dewarping
    An approximation to perspective dewarping.
  • Stereoscopic Dewarping
    The process of using two images of a book spread with different perspectives to produce a 3D map of the surface so it can be flattened (distortions removed).
  • Perspective Dewarping
    A process for turning flat pages that have been captured with a camera from an off-axis position into pages that appear as if they had been captured on a flatbed scanner.

2. Splitting

The process of splitting an image of a page spread into two new images, one of the left page and one of the right page.
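Splitting can be sketched in a few lines. This is a minimal illustration, not the Decapod implementation: it assumes the spread is held as a list of pixel rows and that the center margin column is already known (for example, from the margin marked during calibration). The names `split_spread` and `margin_x` are hypothetical.

```python
def split_spread(spread, margin_x):
    """Split a page-spread image into (left, right) page images.

    `spread` is a list of rows, each row a list of pixel values.
    `margin_x` is the column index of the center margin (hypothetical
    input, e.g. the margin marked during calibration).
    """
    if not spread:
        return [], []
    width = len(spread[0])
    if not 0 < margin_x < width:
        raise ValueError("margin_x must fall inside the image")
    # Everything left of the margin is the left page; the rest is the right page.
    left = [row[:margin_x] for row in spread]
    right = [row[margin_x:] for row in spread]
    return left, right
```

Calling `split_spread(spread, margin_x)` on a spread returns two new images and leaves the original untouched.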

2. Normalization

1. Normalization
The automatic or manual process in which an image is adjusted to have the same visual qualities (brightness, contrast, threshold, exposure, etc.) as the other pages in the book.

  • Image Enhancement
    The process of adjusting the brightness, contrast, and black & white thresholds of an image so that text is more legible for human reading or glyph generation.
  • Threshold Adjustment
    A linear adjustment with black on one end and white on the other which changes the quality of a bitmap image. Increasing the black threshold makes more details appear black, but may introduce anomalies and noise to an image. Increasing the white threshold removes artifacts from a bitmap image, but may obscure details. Finding the ideal threshold that achieves the best clarity will improve page segmentation.
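One very simple form of normalization is to shift a page's brightness so that its mean matches a target taken from the rest of the book. The sketch below is only an illustration under that assumption. It works on 8-bit greyscale values (0 to 255) held as a list of rows, and the name `match_brightness` is hypothetical. Contrast and threshold would need their own adjustments.

```python
def match_brightness(page, target_mean):
    """Shift greyscale values so the page's mean brightness equals target_mean.

    `page` is a list of rows of 8-bit greyscale values (0-255).
    `target_mean` would typically be the mean brightness of other pages
    in the book (an assumption for this sketch).
    """
    flat = [v for row in page for v in row]
    if not flat:
        return page
    offset = target_mean - sum(flat) / len(flat)
    # Apply the same offset to every pixel, clamping to the valid range.
    return [[min(255, max(0, round(v + offset))) for v in row] for row in page]
```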

2. Binarization
The process of converting a colour or greyscale image into a series of black and white pixels (or bits of 0s and 1s). Binarization of an image is performed to help page segmentation and glyph generation. Also known as Bitmap Conversion.

  • Bitmap
    A black and white image (not greyscale). It can be interpreted as a sequence of bits (0s and 1s representing black and white).
  • Bitmap Conversion
    Same as Binarization.
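Binarization with a single global threshold can be sketched as follows. This is a minimal example, not the method Decapod uses: it assumes 8-bit greyscale input held as a list of rows, and it assumes the convention that 0 means black and 1 means white. The name `binarize` and the default threshold of 128 are illustrative.

```python
def binarize(grey, threshold=128):
    """Convert a greyscale image to a bitmap using one global threshold.

    Values below `threshold` become 0 (black); all others become 1 (white).
    The 0 = black, 1 = white convention is an assumption for this sketch.
    """
    return [[0 if v < threshold else 1 for v in row] for row in grey]
```

Raising `threshold` turns more pixels black, which matches the trade-off described under Threshold Adjustment: faint details are kept, but noise may be kept with them.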

Page Analysis

1. Document Structure

Document Structure
The components that make up a single page: paragraphs, lines of text, images, captions, etc. The boundaries and flow of these components are represented by an RGB / pixelwise map of the bitmap page. See https://docs.google.com/View?id=dfxcv4vc_92c8xxp7 .

Page Segmentation
A general term to describe the process of performing layout detection and line segmentation, which is accomplished through Pixelwise Analysis.

  • Pixelwise Analysis
    The process of determining the physical division of a document page into special areas, columns, paragraphs, and text lines. This process is pixel accurate, and each bitmap pixel is given values for Column, Paragraph, and Line Number in the form of an RGB value (R for column, G for paragraph, and B for line number). The RGB values are stored as an RGB .PNG file.
    • Line Segmentation
      The process of detecting the lines on the page. This generates the B value for each pixel.
    • Layout Detection
      The process of programmatically determining the document structure of a page. This generates the G value for each pixel.
    • Document Flow
      The visual and logical flow of information on a page. This generates the R value for each pixel.
      • Text Flow
        The logical order of text on a page.
      • Hierarchy Analysis
        The process of analysing a page to determine its document flow.
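The RGB labelling described under Pixelwise Analysis can be illustrated with a small sketch. It assumes 8-bit channels, so each label must fit in the range 0 to 255 (the glossary does not state the channel depth). The helper names `encode_label` and `pixels_by_label` are hypothetical, and writing the map out as a .PNG file is left to an image library.

```python
from collections import defaultdict

def encode_label(column, paragraph, line):
    """Pack a pixel's layout labels into an (R, G, B) tuple.

    R holds the column, G the paragraph, and B the line number.
    Assumes 8-bit channels (an assumption, not stated in the glossary).
    """
    for name, value in (("column", column), ("paragraph", paragraph), ("line", line)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} label {value} does not fit in one channel")
    return (column, paragraph, line)

def pixels_by_label(label_map):
    """Group (x, y) pixel coordinates by their (column, paragraph, line) label."""
    groups = defaultdict(list)
    for y, row in enumerate(label_map):
        for x, rgb in enumerate(row):
            groups[rgb].append((x, y))
    return dict(groups)
```

Grouping pixels by label in this way recovers the individual text lines from the map, since every pixel of one line shares the same RGB value.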

Segmentation Correction
A general term to describe the manual process of modifying the results of page segmentation.

  • Flow Correction
    The process of manually fixing the visual and/or logical flow of information on a page.
  • Layout Correction
    The process of manually adjusting the boundaries or categorization of logical information regions in the document structure.

Export

1. Character Analysis

1. Character Segmentation
Analysing a bitmap image to determine individual characters / letters / symbols.
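One common way to find candidate characters in a single line of text is a vertical projection: columns that contain no ink are treated as gaps between characters. The sketch below shows that technique only as an illustration. It is not necessarily what Decapod does, it will not separate touching characters, and the name `segment_characters` and the `ink` parameter are hypothetical.

```python
def segment_characters(line_bitmap, ink=0):
    """Return (start, end) column ranges that contain ink.

    `line_bitmap` is a list of rows of 0/1 pixels for one text line.
    `ink` is the pixel value that marks ink (0 = black is assumed here).
    Ranges are half-open: columns start .. end - 1 belong to the character.
    """
    width = len(line_bitmap[0]) if line_bitmap else 0
    spans, start = [], None
    for x in range(width):
        has_ink = any(row[x] == ink for row in line_bitmap)
        if has_ink and start is None:
            start = x                      # a character begins
        elif not has_ink and start is not None:
            spans.append((start, x))       # a blank column ends it
            start = None
    if start is not None:
        spans.append((start, width))       # character touches the right edge
    return spans
```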

2. Tokenization
The conversion of bitmap text to font generated text.

3. Font Generation
The process of creating scalable fonts from a bitmap image. Also called glyph generation.

  • Glyph generation
    Another term for Font Generation.

4. Token Clustering
Also known as Glyph Clustering.  Characters on the page with similar shapes are identified and associated with one another and combined.  By combining multiple instances of the same character, the shape of the original character can be recovered more accurately. The clustered shapes are used as the basis for constructing a PDF font corresponding to the character shapes in the document.

  • Clustering
    A general term that implies token clustering.
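The idea behind token clustering can be sketched with a simple greedy scheme: glyph bitmaps that differ in only a few pixels are grouped, and each group is merged by a per-pixel majority vote to recover a cleaner shape. This is an illustration under strong assumptions (all glyphs already cropped to the same size, pixel values of 0 or 1), not the algorithm Decapod uses. All function names and the `max_distance` parameter are hypothetical.

```python
def hamming(a, b):
    """Count the pixels at which two same-sized bitmaps differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def cluster_glyphs(glyphs, max_distance):
    """Greedily group glyphs whose distance to a cluster's first member is small."""
    clusters = []
    for glyph in glyphs:
        for cluster in clusters:
            if hamming(glyph, cluster[0]) <= max_distance:
                cluster.append(glyph)
                break
        else:
            clusters.append([glyph])   # no close cluster found: start a new one
    return clusters

def merge_cluster(cluster):
    """Combine a cluster into one shape by majority vote on each pixel."""
    rows, cols = len(cluster[0]), len(cluster[0][0])
    n = len(cluster)
    return [
        [1 if 2 * sum(g[r][c] for g in cluster) > n else 0 for c in range(cols)]
        for r in range(rows)
    ]
```

With several noisy instances of the same character in one cluster, the majority vote suppresses pixels that are set in only a minority of instances, which is the sense in which combining instances recovers the original shape more accurately.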

Other Terms