What is it:
|
Harmonizer™ is a software process that identifies and eliminates redundant content in document collections. The software analyzes thousands of documents at a time to objectively measure the extent of duplication, locate duplicate and “near duplicate” content, eliminate extraneous content, and harmonize text variations.
|
|
Harmonizer™ Capability:
|
- Reports content reuse metrics for a document collection.
- Identifies exact match, similar match, and dissimilar match ‘granules’.
- Provides a user interface to resolve "near matches".
- Works with SGML, XML, HTML, Word RTF files, FrameMaker MIF files, Interleaf ASCII, and many others.
- Works with large document sets.
- Updates files, inserts effectivity and conditional information, and produces files suitable for direct loading into a Content Management System (CMS) or Interactive Electronic Technical Manual (IETM).
- Optional service to resolve and correct variations in documents.
|
|
Process:
|
The user provides the document collection via FTP or CD, defines the granularity, and selects the matching criteria. Harmonizer™ processes the batch and returns statistics on the degree of redundancy, identifying the redundant data and its location within the document set. Automated tools are available to help correct, edit, and eliminate redundant data.
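The general idea of sorting granules into exact, close, and unique matches can be illustrated with a short Python sketch. This is not Harmonizer™'s actual algorithm, which is not described here. The function names, the whitespace and case normalization, the use of `difflib.SequenceMatcher`, and the 0.85 similarity threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Collapse whitespace and case so trivial differences do not block an exact match."""
    return " ".join(text.lower().split())


def classify_granules(granules, close_threshold=0.85):
    """Label each granule 'exact', 'close', or 'unique' against the granules before it.

    Returns a list of (label, index_of_matched_granule_or_None) tuples.
    The threshold is a hypothetical stand-in for user-selected matching criteria.
    """
    seen = []      # normalized text of earlier granules, in input order
    results = []
    for text in granules:
        norm = normalize(text)
        label, match, best = "unique", None, 0.0
        for i, earlier in enumerate(seen):
            if norm == earlier:
                # An exact match always wins over any close match found so far.
                label, match = "exact", i
                break
            ratio = SequenceMatcher(None, norm, earlier).ratio()
            if ratio >= close_threshold and ratio > best:
                best, label, match = ratio, "close", i
        seen.append(norm)
        results.append((label, match))
    return results


sample = [
    "Remove the access panel before servicing the pump.",
    "Remove the access panel before servicing the pump.",
    "Remove the access panel before servicing the pumps.",
    "Torque all bolts to the specified value.",
]
for granule, (label, match) in zip(sample, classify_granules(sample)):
    print(label, match, granule)
```

Comparing every granule with every earlier one is quadratic in the number of granules, so a sketch like this only suits small samples. A tool handling thousands of documents would need indexing or hashing to narrow the candidates first.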
|
Results:
|
- Metrics on reuse potential in the document set (critical for calculating the ROI of implementing content reuse).
- Detailed analysis reports displaying location and content of exact and close matches.
|
Graphic display of redundancy: in this document set, over 60% is potentially redundant (39.25% with exact matches, and a further 21.59% with close matches).
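The percentages quoted for this document set add up to 39.25% + 21.59% = 60.84%. As a rough sketch of how such figures could be derived, the Python below assumes each granule has already been labelled "exact", "close", or "unique". The function name and the labels are hypothetical and are not taken from Harmonizer™ itself.

```python
from collections import Counter


def redundancy_metrics(labels):
    """Return the percentage of granules per category plus the combined redundant share."""
    categories = ("exact", "close", "unique")
    total = len(labels)
    if total == 0:
        # Avoid division by zero for an empty collection.
        return {"exact": 0.0, "close": 0.0, "unique": 0.0, "redundant": 0.0}
    counts = Counter(labels)
    pct = {name: 100.0 * counts.get(name, 0) / total for name in categories}
    # "Potentially redundant" is taken here to mean exact plus close matches.
    pct["redundant"] = pct["exact"] + pct["close"]
    return pct


labels = ["exact", "exact", "close", "unique"]
print(redundancy_metrics(labels))
```

Counting by granule is only one possible basis. Weighting by word count or file size would give different percentages, and the source does not say which basis its figures use.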
|
|
|
Applications:
|
- Determine reuse potential for ROI calculation.
- Harmonize content (reduce ‘similar’ matches) to provide consistent information.
- Clean up typographical errors.
- Implement reuse, which reduces the size of the data set, lowers conversion and translation costs, and makes updating information more efficient.
|
DCL Experience:
|
- DOD (Air Force, Army, Marines, Navy)
- Aerospace
- Pharmaceutical
- Semiconductor
- Software
|
|
Differentiators:
|
- The process produces in minutes what would take weeks in a manual process.
- The solution is unique, with a patent pending.
- DCL has 20+ years in the conversion business, with over 10,000 projects completed.
- Authored the chapter on ‘Legacy Data Conversion’ in Columbia Guide to Digital Publishing.
|
|
Languages:
|
Experience with most foreign languages, including those using Latin-based and double-byte character sets.
|
|
Partnerships:
|
|
|
|
For More Info:
|
Data Conversion Laboratory, Inc.
http://www.dclab.com
(800) 321-2816 x267
e-mail reuse@dclab.com