Data Conversion Laboratory
This article (http://www.DCLab.com/xmlbenefits_p2.asp) is provided through the Data Conversion Laboratory website. To subscribe to DCLnews, go to http://www.dclab.com/request_subscription.asp.

WHITE PAPER
Tell me again ...
Why should I care about XML?

(PAGE 2)

Costs of XML

PDF: An alternative to XML?
PDF is a proprietary page representation format developed by Adobe Systems. It puts documents in a "container" that preserves not only the text but also the image of the page. PDF can be generated directly by many traditional word processing packages. It can also be generated by scanning paper documents.

PDF does not have any of the content tagging capabilities of XML (except for limited linking). And, although widely accepted, PDF was not a recognized independent standard for most of its history; it was published as ISO 32000-1 only in 2008. PDF files are binary; besides text they may contain images of various types, PostScript, and other binary information. All this is useful, but it means PDF is not as portable as XML.
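To make the tagging point concrete, here is a minimal sketch in Python using only the standard library's xml.etree.ElementTree module. The element names (article, title, author, para) and the sample text are invented for illustration and are not taken from any DCL schema. It shows that a program can retrieve a piece of content by its tag, which a page-oriented format does not offer directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are assumptions for illustration.
SAMPLE_XML = """<article>
  <title>Why should I care about XML?</title>
  <author>DCLnews Editorial</author>
  <para>XML tags describe what each piece of content is.</para>
  <para>Software can then find content by its tag.</para>
</article>"""


def extract_fields(xml_text):
    """Return the title, author, and paragraph texts from the sample structure."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "author": root.findtext("author"),
        "paragraphs": [p.text for p in root.findall("para")],
    }


fields = extract_fields(SAMPLE_XML)
print(fields["title"])
print(len(fields["paragraphs"]))
```

Because the structure is carried by the tags rather than by page layout, the same lookup works however the document is later formatted for display.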

Furthermore, when PDF is generated from scanned paper, text accuracy is often poor. Although readers see what appears to be a perfectly usable page, what is actually displayed is a bitmap image of the page. The text itself, extracted by an OCR process during the PDF conversion, is hidden behind the image. It is searchable, but if the accuracy is poor, as is inevitable with uncorrected OCR, searches will be unreliable: they miss potentially important "hits" and return irrelevant ones. Correction is possible, but difficult and expensive, and may cost more than an XML conversion.
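The effect of uncorrected OCR on searching can be illustrated with a small Python sketch. The page texts below are invented, and the character confusions ("rn" read as "m", "I" read as "l", "l" read as "1") are assumed examples of OCR misreads rather than output from any real OCR engine. The search function is a plain substring match, a simplification of what a real search tool does.

```python
# Hypothetical pages as originally typed.
CLEAN_PAGES = [
    "The modern turbine requires annual inspection.",
    "Inspection intervals are listed in the manual.",
    "The warranty covers all turbine parts.",
]

# The same pages with assumed, uncorrected OCR misreads.
OCR_PAGES = [
    "The modem turbine requires annual inspection.",    # "modern" -> "modem"
    "lnspection intervals are listed in the manua1.",   # "I" -> "l", "l" -> "1"
    "The warranty covers all turbine parts.",
]


def search(pages, term):
    """Return the indexes of pages containing term (case-insensitive substring match)."""
    needle = term.lower()
    return [i for i, page in enumerate(pages) if needle in page.lower()]


print(search(CLEAN_PAGES, "inspection"), search(OCR_PAGES, "inspection"))
print(search(CLEAN_PAGES, "modem"), search(OCR_PAGES, "modem"))
```

With these sample pages, the search for "inspection" finds two pages in the clean text but only one in the OCR text (a missed hit), and the search for "modem" matches a page in the OCR text that never mentioned modems (an irrelevant hit).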

PDF files are generally large and unwieldy, especially when the page image is preserved in bitmap form (usually the case when PDF was generated from paper). This means they are difficult to transport over networks or to make available over the web.
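A rough calculation shows why page images are so much larger than text. The Python sketch below assumes a US Letter page (8.5 by 11 inches) scanned at 300 dots per inch, and about 3,000 characters of text per page stored at one byte each; all of these figures are assumptions chosen for illustration. The image sizes are uncompressed, so real scanned PDFs, which compress their images, will be smaller than this upper bound.

```python
def bitmap_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of a scanned page image."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed figures for illustration only.
TEXT_BYTES_PER_PAGE = 3000  # about 3,000 characters at one byte each

bilevel = bitmap_bytes(8.5, 11, 300, 1)    # black-and-white scan
grayscale = bitmap_bytes(8.5, 11, 300, 8)  # 8-bit grayscale scan

print(bilevel, grayscale)
print(bilevel // TEXT_BYTES_PER_PAGE)
```

Under these assumptions, a black-and-white page image occupies 1,051,875 bytes before compression, several hundred times the size of the text it shows.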

Data Conversion Laboratory can and does do PDF conversions where appropriate. We recommend, however, that they be limited to situations where paper is being eliminated for space reasons, and the documents are not frequently accessed, but must be available when required. We recommend XML for "live" data that needs to be frequently accessed, modified, or searched.

For further information on PDF, read:
- Converting PDF to XML: Can it be done easily? (FAQ)
- PDF Conversion: How, For Whom, And When? (Part 1)
- PDF Conversion: How, For Whom, And When? (Part 2)

Conclusion: Use XML! It's just better ...
DCL has wide experience converting data between many formats. Our expertise extends well beyond XML, so we don't have an XML axe to grind. But we believe that XML should be the format of choice for all industries that need to manage their "intellectual capital," and we recommend it in these circumstances. We do so not because it is legally mandated, though in many cases it is, but because it provides the most attractive package of benefits at justifiable cost. The truth is, we often find ourselves saying: "Use XML! It's just better."

DCLnews Editorial



Data Conversion Laboratory, Inc.   61-18 190th St., 2nd Floor, Fresh Meadows, NY 11365   718-357-8700   convert@dclab.com

Copyright © 1997-2010  Data Conversion Laboratory, Inc. All rights reserved.