Sophisticated Data Matching Improves OCR Systems

Significant cost reduction and increased quality

Validating OCR output by approximate matching can dramatically reduce the cost of manual error handling and, at the same time, increase the final data quality.

What does the MatchMaker system do?

Results from the OCR engine, such as:

Ban?y Srnith
?ort?ou?h Roed?ondor SWl5 3TA X
Oni?edKin?dom

will be validated and normalized to:

Henry Smith
Portsmouth Road
London SW15 3TR
United Kingdom

 

Many OCR output errors (unrecognized characters, spelling errors, incomplete strings and words) can be corrected automatically.  This allows for an increased throughput of automatically recognized documents, whilst reducing the risk and amount of cost-intensive manual correction.
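
The idea of validating a damaged OCR string against reliable reference data can be illustrated with a minimal Python sketch. It uses the standard-library difflib module as a stand-in for the matching engine; it is not MatchMaker's algorithm or API, and the reference list and function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical reference data; a real deployment would use an address database.
REFERENCE_STREETS = ["Portsmouth Road", "Plymouth Road", "Portland Road", "Richmond Road"]


def similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(ocr_text: str, candidates: list[str]) -> tuple[str, float]:
    """Return the candidate most similar to the OCR text, with its score."""
    scored = [(c, similarity(ocr_text, c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


# The damaged street name from the example above.
street, score = best_match("?ort?ou?h Roed", REFERENCE_STREETS)
print(street, round(score, 2))
```

Comparing every candidate in turn is only workable for small lists; matching against millions of records would need an index built for approximate search.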


MatchMaker OCR for:

  • Document management systems (DMS)
  • Logistics documentation
  • Post Office Management
  • Insurance Claims Processing

The high-performance approximate matching technology yields high data consistency.  Scanned data records can be matched against accurate data from your records, dictionaries, address databases, and other reliable data sources.  Furthermore, the approximate matching can be used as a validity check during the manual correction process.  This is important for offshoring operations, where manual corrections are often performed by non-native speakers.  Depending on the errors, the correction is performed automatically or through manual validation.


To make the manual validation more effective, MatchMaker provides error-tolerant suggestions, which the operator just has to confirm through a simple workflow.  This process eliminates human errors originating from manual OCR validation and error handling.
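
The split between automatic correction and operator-confirmed suggestions can be sketched as a simple routing rule. The thresholds, function names, and result format below are illustrative assumptions, not MatchMaker settings, and the standard-library difflib module again stands in for the matching engine.

```python
from difflib import SequenceMatcher

# Illustrative thresholds (assumptions, not MatchMaker defaults).
AUTO_CORRECT_AT = 0.85  # at or above this score, accept the best match automatically
SUGGEST_AT = 0.50       # at or above this score, offer the match to the operator


def rank(ocr_text, candidates):
    """Return (candidate, score) pairs sorted from best to worst."""
    scored = [
        (c, SequenceMatcher(None, ocr_text.lower(), c.lower()).ratio())
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def route(ocr_text, candidates, max_suggestions=3):
    """Choose between automatic correction, operator review, and manual entry."""
    ranked = rank(ocr_text, candidates)
    if ranked and ranked[0][1] >= AUTO_CORRECT_AT:
        return {"action": "auto", "value": ranked[0][0]}
    suggestions = [c for c, s in ranked[:max_suggestions] if s >= SUGGEST_AT]
    if suggestions:
        return {"action": "review", "suggestions": suggestions}
    return {"action": "manual", "suggestions": []}


countries = ["United Kingdom", "United States", "Ukraine"]
print(route("Unitad Kingdom", countries))  # light damage
print(route("Oni?edKin?dom", countries))   # heavy damage
```

In this sketch, a lightly damaged string is corrected without operator involvement, while a heavily damaged one produces a short list of suggestions for the operator to confirm.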

Ultra-Fast Approximate Matching

Intensive R&D at Exorbyte led to quantum leaps in approximate matching speed.  MatchMaker can deliver high-quality approximate results within milliseconds, even if the reference database contains millions of records.  Performance is and remains one of Exorbyte's top priorities.

High-Quality Approximate Matching

The core of Exorbyte’s approximate matching is a combination of various high-performance algorithms that together yield exceptional matching quality.  In addition, MatchMaker can perform approximate, fault-tolerant multi-field matching that uses different matching methods for the individual fields.
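
Multi-field matching with a different method per field can be sketched as follows. The field names, weights, and the two scoring functions are assumptions chosen for illustration; they do not describe the algorithms MatchMaker actually uses.

```python
from difflib import SequenceMatcher


def fuzzy(a, b):
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def postcode_score(a, b):
    """Compare postcodes position by position after removing spaces."""
    a, b = a.replace(" ", "").upper(), b.replace(" ", "").upper()
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical per-field configuration: (matching method, weight).
FIELD_METHODS = {
    "name": (fuzzy, 0.4),
    "street": (fuzzy, 0.3),
    "postcode": (postcode_score, 0.3),
}


def record_score(ocr_record, reference_record):
    """Weighted sum of the per-field scores (weights add up to 1)."""
    return sum(
        weight * method(ocr_record[field], reference_record[field])
        for field, (method, weight) in FIELD_METHODS.items()
    )


def best_record(ocr_record, reference_records):
    """Return the reference record with the highest combined score."""
    return max(reference_records, key=lambda ref: record_score(ocr_record, ref))


ocr = {"name": "Ban?y Srnith", "street": "?ort?ou?h Roed", "postcode": "SWl5 3TA"}
references = [
    {"name": "Henry Smith", "street": "Portsmouth Road", "postcode": "SW15 3TR"},
    {"name": "Harry Smythe", "street": "Richmond Road", "postcode": "TW9 1AB"},
]
print(best_record(ocr, references)["name"])
```

Combining the fields lets a well-recognized field compensate for a badly damaged one, which is the point of scoring the record as a whole instead of each field in isolation.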

The quality of the approximate matching is very important for the ROI of enterprise OCR solutions.  Every automatic correction of OCR output reduces the costs associated with manual validation while yielding a higher quality of information.

Software + Data

MatchMaker is also available in combination with standard reference data such as addresses, names, countries, and many other kinds of dictionaries.

System Requirements

OS: Windows, Linux, Unix
APIs: Sockets, C++, COM, PHP, Java, Tcl, Python
RAM: Approx. 80 MB per 1 million records
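
For rough capacity planning, the RAM figure can be extrapolated to other database sizes. The sketch below assumes memory grows linearly with the number of records, which the figure suggests but does not guarantee; actual usage will likely depend on record length and field count.

```python
MB_PER_MILLION_RECORDS = 80  # approximate figure from the requirements above


def estimate_ram_mb(record_count: int) -> float:
    """Linear estimate of RAM in MB for a given number of reference records."""
    return record_count / 1_000_000 * MB_PER_MILLION_RECORDS


print(estimate_ram_mb(5_000_000))  # 400.0
```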

Further Information:

OCR Application: Using fault-tolerant fuzzy search technology to correct OCR machine-read data. Contact us for a case study.

 

 
