OCR

created by parmentier
(thing) by parmentier, Sat Nov 13 1999 at 9:59:21

Optical Character Recognition; the software that performs it is an optical character recognizer.

From an image of a paper document, this software produces an electronic version (often only the text).

The result is better presented when OCR is combined with Document Analysis techniques.

Many commercial OCR packages have an error rate of less than 1/100. That still means nearly one error per line of text, since a line typically holds somewhat under a hundred characters.
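The "nearly one error per line" figure is simple arithmetic. Here is a minimal sketch in Python, assuming a line of about 80 characters (a figure the writeup does not give) and assuming that errors hit each character independently. The function names are made up for illustration.

```python
def expected_errors_per_line(error_rate: float, chars_per_line: int) -> float:
    """Average number of wrong characters on one line."""
    return error_rate * chars_per_line


def chance_of_flawed_line(error_rate: float, chars_per_line: int) -> float:
    """Probability that a line has at least one error,
    assuming errors are independent from character to character."""
    return 1.0 - (1.0 - error_rate) ** chars_per_line


# Assumed values: a 1/100 error rate and 80 characters per line.
RATE = 1 / 100
LINE_LENGTH = 80

per_line = expected_errors_per_line(RATE, LINE_LENGTH)
flawed = chance_of_flawed_line(RATE, LINE_LENGTH)
print(f"expected errors per line: {per_line:.2f}")
print(f"chance a line has an error: {flawed:.2f}")
```

With these assumed numbers the average comes to 0.8 errors per line, and a bit more than half of all lines contain at least one mistake.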

OCR errors can be:

  1. confusion: one character (or rather, one glyph) is recognized as another
  2. insertion: a glyph is added where there should be none
  3. deletion: a glyph is not recognized at all.

Typical errors include replacing "m" with "rn" and confusing the lowercase letter "l" with the digit "1".
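One common way to count these three kinds of error is to align the OCR output with a known-correct transcription using Levenshtein edit distance. The sketch below is only an illustration of that idea, not how any particular OCR product measures itself. The function name and the sample words are invented for the example.

```python
def count_ocr_errors(truth: str, ocr: str) -> dict:
    """Count confusions, insertions and deletions along one
    minimum-cost alignment of the OCR output with the true text."""
    n, m = len(truth), len(ocr)
    # dist[i][j] = fewest edits needed to turn truth[:i] into ocr[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if truth[i - 1] == ocr[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match, or confusion
                dist[i - 1][j] + 1,         # deletion: glyph missing from OCR
                dist[i][j - 1] + 1,         # insertion: extra glyph in OCR
            )

    # Walk back from the end of both strings, tallying each edit.
    counts = {"confusion": 0, "insertion": 0, "deletion": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if truth[i - 1] == ocr[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                counts["confusion"] += cost
                i -= 1
                j -= 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletion"] += 1
            i -= 1
        else:
            counts["insertion"] += 1
            j -= 1
    return counts


# "m" read as "rn": one glyph became two.
print(count_ocr_errors("modem", "modern"))
# Lowercase "l" read as the digit "1".
print(count_ocr_errors("fill", "fi1l"))
```

Note that the "m" to "rn" mistake is not a single confusion at the glyph level: the alignment reports it as one confusion plus one insertion. Dividing the total count by the length of the true text gives a character error rate comparable to the 1/100 figure.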

The most common OCR packages are FineReader, TextBridge and OmniPage. They all also segment the scanned images they are given into blocks of different media types (at least text, table and image).

(place) by asyred, Thu Jun 13 2002 at 15:33:42

An exam board in Britain which sets GCSE and A-level papers. The initials stand for Oxford, Cambridge and RSA Examinations.
