Pharmaceutical Industry

Why XML?

ABSTRACT: The recent Drug Information Association conference focused on the future of document management in the pharmaceutical industry.  DCL President Mark Gross provided an overview of data use issues, while comparing XML, an emerging industry standard, to other technologies.


Traditionally, distributing information has most commonly meant distributing page image representations. Today, however, there is an increasing need to re-purpose or re-organize information that was originally intended for paper, so that it can be used in many other ways.  Two features have become increasingly important to data utility: the ability to display well on a computer screen, and the ability to do full-text searching, both inside documents and across document sets.  Added to this is the need to enforce consistent standards across data sets, so that information can be reused and re-purposed with vendors, customers, and the world at large.

Several alternative technologies to XML exist for electronic distribution.  In addition to the native formats that authoring software produces, there are TIFF, PDF, SGML, and HTML.  Native formats are the files as they were originally developed for word processing or publishing; these include typesetting formats and the formats of the desktop publishing packages that became popular with the advent of the PC.  For print purposes these work well; no new investment or training is needed, and the systems are generally already in place.  But the files are proprietary, and they are quite limited when it comes to the enforcement of standards.  Furthermore, from an electronic distribution perspective (Internet, CD-ROM, eCommerce), they offer virtually no re-purposing capabilities.  In today’s marketplace, that’s a problem.

TIFF is an image format produced by scanning pages.  It provides an exact representation of the page, and is inexpensive to produce.  But files are large, and the format’s not suitable for searching or re-organizing information.

PDF is an almost exact representation of the original document, and it’s also quite inexpensive to produce.  But it’s a proprietary format, it offers limited searching capability, and you can’t edit or modify files.  Added to this, documents intended for print are difficult to read on a computer screen.

SGML is an international ISO standard for information representation. It’s robust and adaptable to many applications, well established in many industries, and allows for full content searching.  It also enforces standards and consistency.  But it’s hard to implement, and it requires a significant investment and professional support staff. SGML tools aren’t suitable for casual users, and creating unstructured documents and forms is difficult.

HTML was designed for the web; 90% of information on the web today is housed in the HTML format.  It’s much easier to use than SGML, it’s widely supported, and it’s automatically generated by some software packages.  But it’s an appearance-based language; tags relate to appearance, not content.  Compounding this problem, it has limited formatting capabilities, it’s difficult to edit, and it’s truly a moving standard.
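The difference between appearance-based and content-based tags can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the product name, the element names (product, name, strength, form), and the helper find_strength are all hypothetical and are not taken from any DCL or industry DTD.  The same lookup is run against both versions of the text; only the content-based markup can answer it.

```python
import xml.etree.ElementTree as ET

# Appearance-based markup: the tags say how the text looks, not what it is.
appearance = "<p><b>Examplinol</b> <i>10 mg</i> tablets</p>"

# Content-based markup: hypothetical element names, chosen for illustration.
content = (
    "<product>"
    "<name>Examplinol</name>"
    "<strength unit='mg'>10</strength>"
    "<form>tablets</form>"
    "</product>"
)


def find_strength(markup):
    """Return (value, unit) if the markup labels a strength, else None."""
    root = ET.fromstring(markup)
    node = root.find("strength")
    if node is None:
        return None
    return (node.text, node.get("unit"))


appearance_result = find_strength(appearance)  # bold/italic carry no meaning
content_result = find_strength(content)        # the tag itself names the data
```

With the appearance-based version, a program can only guess that the italic text is a dosage strength; with the content-based version, the tag states it.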

XML has all the advantages of SGML, but it’s much easier to get started with.  You can start gradually and add functionality as you gain experience.  It has features that are important for the web and e-commerce, and it’s already become widely supported.  While people need training and tools aren’t yet fully developed, developers have rushed to support the standard.  XML is typically used for large web applications, e-commerce, technical documentation, catalogs, and living documents.

XML separates text and tags; that is, it separates content from structure.  This is the key to its versatility, and the reason it’s taken off.  Its uses are many.  You can reformat and restyle for different media, identify components, interchange data, reuse parts, and maintain and output multiple versions of the same document.  “You’ve taken what has traditionally been text used on paper and turned it into a bunch of data elements you can use in various ways,” said Mr. Gross.
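As a minimal sketch of reformatting one source for different media, the Python below renders the same small XML document as plain text and as HTML.  The sample document, its element names (document, title, para), and the functions to_plain_text and to_html are assumptions made up for this illustration; a production system would more likely use stylesheets and a validated DTD or schema.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; element names are illustrative only.
SOURCE = """<document>
  <title>Storage Instructions</title>
  <para>Store below 25 degrees C.</para>
  <para>Keep away from light &amp; moisture.</para>
</document>"""


def to_plain_text(root):
    """Render for a text-only channel: upper-case title, one line per para."""
    title = root.findtext("title", default="")
    lines = [title.upper(), ""]
    lines += [para.text or "" for para in root.findall("para")]
    return "\n".join(lines)


def to_html(root):
    """Render for the web: title as a heading, each para as a paragraph."""
    title = escape(root.findtext("title", default=""))
    body = "".join(
        f"<p>{escape(para.text or '')}</p>" for para in root.findall("para")
    )
    return f"<h1>{title}</h1>{body}"


root = ET.fromstring(SOURCE)
text_version = to_plain_text(root)
html_version = to_html(root)
```

The source is written once; each output is just another function over the same elements, so adding a new medium does not mean touching the content.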



ABOUT DCL: Data Conversion Laboratory is the leader in implementing complex data conversion solutions for Web- and electronic-based publishers and organizations, B2B applications and evolving new technologies. The company supports XML, SGML and all major electronic formats and, since 1981, has extracted, reorganized and repurposed data for a wide range of pharmaceutical clients, including Glaxo Wellcome, SmithKline Beecham, Merck, Genentech, Allergan and Eli Lilly.

The company is located at 61-18 190th St., 2nd Floor, in Fresh Meadows, N.Y. and is privately owned. For more information about Data Conversion Laboratory and its services, please visit the company website at www.dclab.com, or call 1-718-357-8700.

 
Copyright © 1997-2010  Data Conversion Laboratory, Inc. All rights reserved.