Data Conversion Laboratory, Revolutionizing Publishing for the Digital Age 

WHITE PAPER
Tell me again ...
Why should I care about XML?

Converting to XML not only gives you the ability to publish documents to the Web, print, CD-ROM, and to handheld devices at the click of a button, it also brings very real cost savings ...

In this white paper we look at the benefits of XML and discover how much it costs to get those benefits. We also look at strategies for increasing benefits while at the same time keeping costs down. Plus we touch on PDF -- often viewed as an XML alternative -- and discuss when it is appropriate to use it instead of XML. Note that much of the information in this document applies to SGML, which is the "parent" of XML.

What is XML?
XML (eXtensible Markup Language) is a means of representing text information so that:

  1. Only standard text (both ASCII and Unicode) is used within a document
  2. No formatting information is contained in the document, although a Document Type Definition (DTD) can be set up to allow formatting information to be included in the XML tagging.
  3. All document elements are clearly identified (for example, <title>Why XML?</title>).
  4. The document typically conforms to a predefined template or DTD. Strictly speaking you don't have to use a DTD, but it is highly recommended that you do.
  5. Mechanisms are provided for linking text within a document to information within the same or other documents. The information being linked can be any XML structure including tables, figures, paragraphs, headings, and so on.
     (NOTE: Developers created an XML sub-technology, or "vocabulary," called XLink. XLink is a more powerful way of linking from one item to another than is possible in the standard XML mechanism. Combined with the companion XPointer specification, it allows you to link to an arbitrary place in a document. In standard linking, like that found in HTML, you can only link to something if it has an anchor. In a complex document this can mean inserting thousands of anchors -- a laborious task. With XLink you can point anywhere you like without anchors.)
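
To make the idea of clearly identified elements concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The sample document and its tag names (article, para, warning) are assumptions made up for illustration; only the <title> example comes from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the tag names are illustrative only.
SAMPLE = """<article>
  <title>Why XML?</title>
  <para>XML identifies content by what it is.</para>
  <warning>Disconnect power before servicing.</warning>
  <para>Formatting is applied later, at publication time.</para>
</article>"""

root = ET.fromstring(SAMPLE)

# Elements are located by name, not by typeface or position on the page.
title = root.findtext("title")
warnings = [w.text for w in root.iter("warning")]
paragraphs = [p.text for p in root.findall("para")]

print(title)
print(warnings)
print(len(paragraphs))
```

Because every element says what it is, a program can pull out all the warnings or all the titles without guessing from appearance.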

Key benefits of XML
The benefits of using XML as a document representation format are great and apply across all areas of industry. Let's look at what you gain when you adopt XML:

  • Content identification - Perhaps the most important aspect of XML is that text elements are identified, not on the basis of what they look like, but on the basis of what they are -- that is, of their significance in the context of a document. The <title> example above illustrates this, but the concept goes well beyond identifying things like titles, captions, or body text. Depending on need, warning paragraphs can be identified, procedures can be identified in terms of who they are applicable to, and assembly parts can be identified. Tags are user-defined for each document set, so different documents can be tagged in different ways.
  • Databasing - An XML tagged document can be viewed as fielded text. The fielding makes it possible to break documents down to their component parts to any degree of granularity for storage in a document management system. The documents can then be re-assembled in different ways, and for different audiences, without the need to track multiple document versions. This is particularly important in cases where different audiences may need to see different versions of a document (in the military, for example, you might have a "top security" clearance version and a "standard" version).
    In this way, boilerplate text, such as a standard warning, can be stored once for use in many manuals. When the warning text is changed, it is changed once, not each time it appears. Also, the warning will appear the same way each time it appears, thus avoiding the embarrassment of incorrect text.
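
The store-once, assemble-many-ways idea can be sketched roughly as follows. This is not how any particular document management system works; the <include ref="..."/> placeholder, the audience attribute, and the COMPONENTS store are all hypothetical conventions invented for this example.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical component store: each piece of boilerplate is kept once, keyed by id.
COMPONENTS = {
    "warn-power": ET.fromstring(
        "<warning>Disconnect power before servicing.</warning>"
    ),
}

# Hypothetical manual. The <include> element and the audience attribute are
# illustrative conventions, not part of XML itself.
MANUAL = """<manual>
  <title>Pump Maintenance</title>
  <include ref="warn-power"/>
  <para audience="all">Remove the cover plate.</para>
  <para audience="restricted">Calibration codes are listed in Annex C.</para>
</manual>"""


def assemble(source, audience):
    """Build one audience-specific rendition from the single stored source."""
    root = ET.fromstring(source)
    kept = []
    for child in root:
        if child.tag == "include":
            # Swap the placeholder for a copy of the shared component.
            shared = copy.deepcopy(COMPONENTS[child.get("ref")])
            shared.tail = child.tail
            kept.append(shared)
        elif child.get("audience") in (None, "all", audience):
            # Keep elements with no audience restriction or a matching one.
            kept.append(child)
    root[:] = kept
    return root


standard = assemble(MANUAL, "standard")
restricted = assemble(MANUAL, "restricted")
print([child.tag for child in standard])
print([child.tag for child in restricted])
```

Changing the warning text in the component store changes it in every manual assembled afterward, and both renditions come from one source file rather than two tracked versions.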

  • Enforced structure - XML documents are composed in accordance with a DTD or schema, which defines the legal tag set for that document type. It also defines valid and invalid relationships between elements (for example, a <header2> tag might be defined as valid only when it comes after a <header1> tag). This "enforced structure" ensures that documents have uniformity -- even when coming from diverse sources.
  • Merging materials from diverse suppliers - The uniform structure and lack of internal formatting make it easy to merge documents into seamless document sets -- even if they are coming in from different facilities. An XML-compliant document management system can track the individual pieces by contributor, if necessary.
  • International Standard - XML is an international standard maintained by an independent standards body, the World Wide Web Consortium (W3C), which means it enjoys widespread support across industry boundaries and gets extensive support from vendors. Being an international standard also means that a wide variety of XML editing, document management, validation, and publishing tools is available at a range of price and quality levels.
  • Industry standardization - Many industries have adopted standardized XML DTDs to allow documents to be easily exchanged across different areas of industry. In fact, developing inter-industry data exchange standards based on XML is currently a major focus for developers and firms alike (Microsoft's BizTalk is an example). Aside from industries coming up with standard DTDs, many organizations have developed new tag sets to fit their subject field. The newspaper industry, for example, recently came up with its own XML-based markup language, called SportsML, which makes it easier for sports writers and editors to format, store, and publish sports information for newspapers, websites, and other media. Plus there's MathML for mathematics and CML (Chemical Markup Language) for chemistry.
  • Platform independent - Because "raw" XML consists only of ASCII and Unicode characters (the tags themselves are typically represented in ASCII), XML data can be moved freely between all hardware and operating system platforms that support these character sets. Virtually every hardware and operating system platform supports the ASCII character set, and Unicode is now widely supported. The Internet Explorer and Netscape browsers, for example, support it, as do most plain text editors.
  • Software independent - As noted, there is a wide variety of XML-compliant tools available from many vendors. Because XML is an independent standard, tool sets can be upgraded or changed without fear of data incompatibility. Furthermore, many of the mainstream and "low-end" tools are becoming XML-compliant in response to market demand for support of the format. Such software includes WordPerfect, FrameMaker+XML, and Ventura Publisher, among others. Support for XML is already available to some degree in most of the Office 2000 products. It is supported extensively in Internet Explorer 5 and above, as well as in recent versions of Netscape. What's more, any text editor that supports Unicode can be used to view and edit XML. And the XSL (eXtensible Stylesheet Language) standard will allow you to publish XML material to paper or a website using publicly available software.
  • Endurance - Appearance-based text representations are constantly changing -- making conversion costly when migrating from one software package to another or even when upgrading an existing software package. There is also potential for data loss when performing such conversions. XML, however, is a "permanent" representation. Even as the standard evolves, there is no problem upgrading data. If the DTD is carefully selected or designed, a conversion to XML will be the last conversion you'll ever need. In a budget-sensitive environment, this is a very important benefit.
  • Repurpose data for different publication media - With XML, formatting is done on a "just in time" basis. As noted, tags identify content, not appearance. Appearance decisions are therefore left until documents are actually published, which means they can easily be modified based on the publication platform. This is a big advantage because what looks good on paper won't necessarily look good on screen, and vice versa. XML makes it easy to develop different stylesheets based on the needs of individual publications. The stylesheets map the tags to a set of formatting directives. Thus the same document can easily be published to paper and to the web -- and be customized for each rendition -- simply by customizing stylesheets. When publishing to paper, <title> can be rendered as Times-Roman, 12 point bold. On the web, titles might look better in a more web-friendly typeface, like Verdana, in a larger size. They would simply be defined that way in the web stylesheet, without the need to change the document at all. Because XML data is well-fielded ... (continued on next page)
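
The way a stylesheet maps tags to formatting directives can be illustrated with a small Python sketch. It stands in for a real XSL stylesheet with a plain dictionary, so it is a simplification rather than actual XSL. The print title style follows the example in the text; the other sizes, the STYLESHEETS table, and the render function are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# One source document, with no formatting information in it.
DOC = "<article><title>Why XML?</title><para>Tags identify content.</para></article>"

# Hypothetical stylesheets: each maps a tag name to formatting directives.
# Only the print title style comes from the text; the rest are assumed values.
STYLESHEETS = {
    "print": {
        "title": "font-family: Times-Roman; font-size: 12pt; font-weight: bold",
        "para": "font-family: Times-Roman; font-size: 10pt",
    },
    "web": {
        "title": "font-family: Verdana; font-size: 18pt; font-weight: bold",
        "para": "font-family: Verdana; font-size: 11pt",
    },
}


def render(xml_text, medium):
    """Apply the stylesheet for one medium to the unchanged source document."""
    styles = STYLESHEETS[medium]
    root = ET.fromstring(xml_text)
    lines = []
    for element in root:
        style = styles.get(element.tag, "")
        text = escape(element.text or "")
        lines.append(f'<div style="{style}">{text}</div>')
    return "\n".join(lines)


print(render(DOC, "print"))
print(render(DOC, "web"))
```

The document itself is never edited: switching from paper to web means choosing a different entry in the stylesheet table.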

Data Conversion Laboratory, Inc.   61-18 190th St., 2nd Floor, Fresh Meadows, NY 11365   718-357-8700   convert@dclab.com

Copyright © 1997-2008  Data Conversion Laboratory, Inc. All rights reserved.