The DCL Guide to Public Domain DTDs
A breakdown of commonly-used free DTDs

Simply put, a document type definition (DTD) is a document structure. It defines which elements and attributes are allowed and how they may be combined. A DTD does not control presentation by itself, but the consistent structure it enforces is what allows your content to be displayed consistently in whatever output medium you choose. A DTD that fits your specific data needs (in terms of both appearance and function) will allow you to take full advantage of all the benefits of XML.
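To make the idea of "allowed elements and attributes" concrete, here is a minimal Python sketch. It is not a real DTD validator: the element names, the two rule tables, and the function `find_violations` are all hypothetical, and the check ignores element order and repetition, which a real DTD would also constrain.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model, loosely mirroring DTD declarations such as
#   <!ELEMENT article (title, para+)>
#   <!ATTLIST para id CDATA #IMPLIED>
# Each element maps to the child elements and attributes it may carry.
ALLOWED_CHILDREN = {
    "article": {"title", "para"},
    "title": set(),
    "para": set(),
}
ALLOWED_ATTRIBUTES = {
    "article": set(),
    "title": set(),
    "para": {"id"},
}


def find_violations(xml_text):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag not in ALLOWED_CHILDREN:
            problems.append(f"unknown element <{element.tag}>")
            continue
        for attribute in element.attrib:
            if attribute not in ALLOWED_ATTRIBUTES[element.tag]:
                problems.append(
                    f"attribute '{attribute}' not allowed on <{element.tag}>"
                )
        for child in element:
            # Unknown children are reported when the loop reaches them.
            if (child.tag in ALLOWED_CHILDREN
                    and child.tag not in ALLOWED_CHILDREN[element.tag]):
                problems.append(
                    f"<{child.tag}> not allowed inside <{element.tag}>"
                )
    return problems


good = '<article><title>T</title><para id="p1">Text</para></article>'
bad = '<article><para color="red">x<title>y</title></para><sidebar/></article>'

print(find_violations(good))
print(find_violations(bad))
```

The first document passes with no violations; the second is flagged for a disallowed attribute, a misplaced element, and an element the rules do not know about. In practice a validating XML parser would do this work directly from the DTD.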

Public Domain DTDs

Only a few years ago, if you needed structured documentation, your only options were to create the structure yourself or to pay a consultant to develop one for you. While making your own DTD may produce a structure customized to your needs, it also requires a lot of knowledge, time, and effort (among other costs). And since every organization marched to its own DTD drummer, there was no hope for the easy exchange of tools or data.

Fortunately, DTDs have come a long way in a short time. Today, the multitude of public domain DTDs available for free provides a convenient alternative to proprietary or custom DTDs. While some are appropriate only for very specific types of documentation, the most common public domain DTDs tend to be more generally applicable.

Which DTD Is Best?

There are strong opinions on this question, as well as some classic rivalries like NLM vs. TEI and DITA vs. DocBook. Some of these DTDs do overlap in function, leaving room for real competition, but the question of which DTD is best has no simple answer. The real question is: which DTD will work for your specific needs?

Each DTD has pros and cons that are specific to a given application. For instance, DITA is great for documentation; but try to use it to present a full-text article or book, and you'll find you need to make many modifications.

Another thing to consider is that all of these publicly available DTDs are customizable; that is, you can modify them to suit your needs to varying degrees. However, customization costs money and time to implement, and nearly always comes at the expense of standardization.

If you're wrestling with the question of which DTD might best suit your needs, then you're in luck: we did some of the work for you. The following guide is designed to give a quick overview of the basics of the public domain DTDs that you are most likely to encounter.

Covered in this guide: NLM, TEI and TEI Lite, DITA, DocBook, S1000D, XHTML, and EPUB.



NLM

Original purpose

In 2003, the National Library of Medicine (NLM) created the Journal Archiving and Interchange DTD (also known as the NLM DTD), as a common format for medical journal articles, as well as the NLM book DTD, designed specifically for textbooks.

What is it used for now?

Used for many types of journals and books (many having nothing to do with medical literature), the NLM DTD provides a useful structure for almost any kind of full-text article. It has been called the de facto standard DTD for full-text publishing.

What is it good for?

The NLM DTD was developed in response to publishers' needs, so this "reality-based" DTD provides a flexible structure that can accommodate many content irregularities without requiring customization. It is also well known, widely used, and allows for the creation of rich custom metadata.

When may another DTD be a better choice?

Content-wise, the NLM DTD is somewhat more rigidly defined than TEI, so loosely structured content, such as much humanities literature, may fit TEI more comfortably.

Possible public domain alternatives

TEI, TEI Lite


TEI AND TEI LITE

Original purpose

The Text Encoding Initiative (TEI) DTD was developed in 1994 as a standard format for digital texts in the humanities, social sciences, and linguistics. TEI Lite was introduced as a simplified version of TEI, and is now the much more commonly used of the two. The two names are sometimes used interchangeably.

What is it used for now?

If NLM is the de facto standard DTD for full-text scientific or medical publishing, then TEI and TEI Lite are the de facto standards for full-text electronic articles in the humanities academic community.

What is it good for?

The TEI and TEI Lite DTDs are widely used and supported. They are somewhat less rigidly defined than the NLM DTD, and more accommodating of humanities literature.

When may another DTD be a better choice?

Journal articles to be posted to PubMed (or many of the other online medical and scientific databases) must adhere to the NLM DTD.

Possible public domain alternatives

NLM


DITA

Original purpose

DITA was originally developed in 2001 as a modular, reuse-friendly DTD for software documentation.

What is it used for now?

DITA is now being used for all kinds of technical documentation (not just software) and help guides.

What is it good for?

DITA has been described as being a "step beyond" other DTDs due to its use of document maps. These maps allow for document structure to be created by simply arranging content modules in the desired order. This makes DITA a good choice for documentation in which similar chunks of content appear multiple times in different locations, since content modules can be easily managed from a centralized database and reused wherever necessary.

Indeed, DITA is best suited to content that is modular and context-independent in nature. It is ideal for projects in which you want to reuse some of the same content within a document, between documents, or even among different projects. Documentation that must be translated, for example, can benefit greatly from DITA's reusable modules; with DITA, a given chunk of content need only be translated once, no matter how many times it appears throughout a set of documentation.
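The map-and-module idea can be sketched in a few lines of Python. This is an illustration of the reuse principle only, not DITA syntax: the topic ids, the sample text, and the functions `build_document` and `topics_to_translate` are all hypothetical.

```python
# Hypothetical topic store: each reusable chunk is written (and translated) once.
TOPICS = {
    "safety-warning": "Disconnect power before servicing.",
    "install-steps": "Mount the unit, then attach the cables.",
    "troubleshooting": "If the unit does not start, check the fuse.",
}

# Hypothetical "maps": a document is just an ordered list of topic ids,
# playing the role that topic references play in a DITA map.
MAPS = {
    "install-guide": ["safety-warning", "install-steps"],
    "service-manual": ["safety-warning", "troubleshooting"],
}


def build_document(map_name, topics=TOPICS):
    """Assemble a document by resolving each topic id in map order."""
    return "\n\n".join(topics[topic_id] for topic_id in MAPS[map_name])


def topics_to_translate(map_names):
    """Return the unique topic ids used by the given maps, in first-seen order."""
    seen = []
    for name in map_names:
        for topic_id in MAPS[name]:
            if topic_id not in seen:
                seen.append(topic_id)
    return seen


print(build_document("install-guide"))
print(topics_to_translate(["install-guide", "service-manual"]))
```

The two maps reference four topics in total, but only three distinct chunks exist, so the shared safety warning would need to be translated only once.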

When may another DTD be a better choice?

Since DITA works by sorting content into small chunks so that it can be reused, it is not well equipped to handle full-text articles or other context-dependent content.

Possible public domain alternatives

DocBook, S1000D


DOCBOOK

Original purpose

Created in 1991 by HaL Computer Systems and O'Reilly and Associates, DocBook was designed for computer hardware and software documentation purposes.

What is it used for now?

DocBook is now used for all types of documentation.

What is it good for?

DocBook's presentation-neutral form allows content to be published in numerous other formats (including HTML, XHTML, EPUB, and PDF). DocBook is simple to download and set up, and it has been around for long enough that it is stable, well known, and well supported. A sophisticated set of rendering tools is available for use with this DTD.

While a DTD like DITA provides a general structure that can easily be specialized to your needs, DocBook comes with more options built in. If the modifications you would make to DITA are already within DocBook, then DocBook may be a better DTD for you, since it may let you do what you need without losing the benefits of a standardized structure.

When may another DTD be a better choice?

DocBook doesn't provide the modular content organization or document mapping that DITA does, so it is not as well suited for content reuse.

Possible public domain alternatives

DITA, S1000D, NLM (for textbooks)


S1000D

Original purpose

S1000D was developed in the 1980s for the production of technical publications for military aircraft.

What is it used for now?

S1000D has been modified for use with many types of equipment documentation. It is now a popular DTD for maintenance and operations documentation for commercial equipment, as well as for military technical publications of all sorts.

What is it good for?

As an international specification, S1000D is widely used in the military and in the aviation industry. Its complex hierarchical structure and predefined data modules leave little room for variation, so your content is well organized, standardized, and ready for reuse.

When may another DTD be a better choice?

S1000D adheres to a rigid hierarchical structure, with predefined data module codes designed to deal with equipment documentation. It is customizable to an extent, but if your needs fall outside the specific realm of equipment documentation, you may be better served pursuing another DTD.

Possible public domain alternatives

DITA, DocBook


XHTML (TAG SET)

Original purpose

Extensible Hypertext Markup Language (XHTML) is not technically a DTD, but rather a tag set. XHTML refers to a family of markup languages developed to build on HTML, the language used to write web pages. XHTML was designed to make HTML more extensible, so while it is more restrictive than HTML, its requirement that documents be well formed gives it greater versatility: XHTML documents can be processed with standard XML tools.
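What "well formed" means in practice can be shown with a short Python sketch using the standard library's XML parser. The helper `is_well_formed` and the two sample strings are made up for illustration; the check covers well-formedness only, not validity against any XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML-style markup: the <br> and <p> tags are never closed.
html_style = "<html><body><p>First line<br>Second line</body></html>"

# XHTML-style markup: every element is closed and properly nested.
xhtml_style = "<html><body><p>First line<br/>Second line</p></body></html>"

print(is_well_formed(html_style))
print(is_well_formed(xhtml_style))
```

A browser would tolerate the first string, but an XML parser rejects it; the second string parses cleanly, which is what lets XHTML pass through general-purpose XML tooling.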

What is it used for now?

XHTML is a set of base tags for web rendering. It is not used for content tagging. XHTML is preferred to HTML in many cases because of its versatility and stricter syntax; unlike HTML, it does not permit tag minimization such as omitted end tags.

What is it good for?

Web pages

When may another tag set be a better choice?

XHTML is about appearance, not content. If you are hoping to do more with your data than present it on a web page, you will be better served by a richer content tag set.


EPUB

Original purpose

Built on XHTML, EPUB was developed for e-book publishing and became an official standard in 2007.

What is it used for now?

EPUB remains the publishing standard of choice for e-book publishing.

What is it good for?

As it was designed to handle "reflowable" content, the appearance of EPUB documents can be easily customized to suit the needs of different display devices.

When may another tag set be a better choice?

If you are trying to present sophisticated content with complex formatting, EPUB may be too limited for your needs.

While EPUB has been widely adopted worldwide, not all media readers use or support EPUB the same way, and some media readers require a different standard altogether (Kindle, for example, does not support EPUB and uses its own format).

DCLnews Editorial
March 2010

 