Data Conversion Laboratory, Revolutionizing Publishing for the Digital Age 

Converting to DITA

Everything you wanted to know (but were afraid to ask)(Part 1)

Editor's Note: While many in the technical communications field are very familiar with DITA, we suspect that many others are not yet up to speed and could use an introduction or a refresher. This is the first part of a two-part series on the basics of DITA. (With thanks to SiberLogic, Inc., who prepared materials for this reference documentation.)

Introduction to DITA: From GML (Generalized Markup Language) to DITA

The evolution of structured authoring

Purpose

This DITA Guide is intended to serve as an easy-to-use reference to help improve your understanding of Darwin Information Typing Architecture (DITA). It is not a definitive reference on the subject.

Audience

This guide is targeted at technical communicators and information architects getting started with DITA. It assumes a passing familiarity with XML.

Version

This guide is current as of version 1.0 of the OASIS DITA standard [1].

What is DITA?

DITA Defined

DITA (Darwin Information Typing Architecture) is a topic-based XML framework used to support the authoring, management, and publication of technical documentation [2].

What's in a name?

Darwin: Applies theories of evolution through specialization and inheritance to information types.

Information Typing: Central to DITA is its focus on authoring based upon classified units of information.

Architecture: Goes beyond structured XML to capture best practices for both extensible design and process.

Single Source Publishing with XML

What is XML (eXtensible Markup Language)?

XML is a W3C-recommended general-purpose markup language capable of describing many different kinds of data. Its primary purpose is to facilitate the sharing of data across different, potentially disparate systems.

Advantages of using XML for structured content

XML allows authors to separate form from content, so that the content can be reused across a wider range of information products. Because specific markup identifies what each piece of information is, that information can be retrieved and processed by software. Using a non-proprietary format can also reduce costs.

Challenges of using XML

Prior to DITA, authors were unable to take full advantage of these benefits. Markup was either too specific to be shared or too generic to be truly useful.

The DITA Solution

DITA introduces several innovations for making XML more useful for authors.

Rich element tagging

Content can be tagged by information type (procedure) instead of by format (list).
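
To make the contrast concrete, here is a minimal sketch in Python using only the standard library. The printer content, the ID, and the function name extract_commands are invented for illustration; the task, steps, step, and cmd elements are the ones DITA's task type provides. The first fragment says only "numbered list"; the second says "steps of a procedure", which lets a program ask for the commands by meaning.

```python
import xml.etree.ElementTree as ET

# Format-oriented markup: the tags only say "this is a numbered list".
FORMAT_MARKUP = """
<ol>
  <li>Open the cover.</li>
  <li>Replace the cartridge.</li>
</ol>
"""

# Information-typed markup: DITA task elements say "these are steps".
# The content and the id value are hypothetical examples.
TYPED_MARKUP = """
<task id="replace_cartridge">
  <title>Replacing the cartridge</title>
  <taskbody>
    <steps>
      <step><cmd>Open the cover.</cmd></step>
      <step><cmd>Replace the cartridge.</cmd></step>
    </steps>
  </taskbody>
</task>
"""


def extract_commands(task_xml):
    """Return the text of every <cmd> element in a task, in document order."""
    root = ET.fromstring(task_xml)
    return [cmd.text for cmd in root.iter("cmd")]


# The same words are present in both fragments, but only the typed
# version tells a program that they are procedural commands.
commands = extract_commands(TYPED_MARKUP)
list_items = [li.text for li in ET.fromstring(FORMAT_MARKUP).iter("li")]
```

Both lists hold the same two sentences; the difference is that a tool reading the typed markup knows it is looking at a procedure and can treat it accordingly.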

Extensible XML for organizations

Specialization makes it possible to share content across different domains. Because specialized markup is derived from the common DITA base types, organizations that adopt DITA are able to exchange and access one another's content.

Better opportunities for reuse

DITA's topic-based orientation and reference mechanisms make reuse easier.

Captures best practices for Technical Communication

DITA was designed for technical publications and employs concepts such as Information Mapping® and Minimalism [3] in its architecture.

Brief history of DITA

The DITA architecture and DTD were designed by a cross-company workgroup representing user assistance teams from across IBM during 1999-2000. By 2001, DITA had been introduced publicly and was attracting attention across North America.

To reduce barriers to adoption and promote its growth, IBM moved DITA into the public domain under the auspices of the Organization for the Advancement of Structured Information Standards (OASIS) in 2004. By 2005, the OASIS DITA Technical Committee, consisting of members from across the industry, succeeded in having DITA ratified as an official OASIS standard.

The DITA Standard

DITA version 1.0

The Darwin Information Typing Architecture (DITA) specification defines both a set of document types for authoring and organizing topic-oriented information and a set of mechanisms for combining and extending document types using a process called specialization.

The specification consists of:

  • The DTDs and schemas that define DITA markup for the base DITA document types, as well as catalog files
  • The language reference that provides explanations for each element in the base DITA document types

For more information, visit http://docs.oasis-open.org/dita/v1.0/ditaref-type.toc.html

OASIS DITA Technical Committee

OASIS (Organization for the Advancement of Structured Information Standards) is a not-for-profit international consortium that drives the development, convergence, and adoption of e-business standards. Founded in 1993, OASIS has more than 5,000 participants representing over 600 organizations and individual members in 100 countries.

The DITA Technical Committee was founded [4] in March 2004 to refine the Darwin Information Typing Architecture and to promote the use of the architecture for creating standard information types and domain-specific markup vocabularies.

DITA Specification: Information Types
Topic-Concept-Task-Reference

Components

The two primary components of DITA documents are Topics and Maps.

Topics

A topic is a unit of information with a title and content, short enough to be specific to a single subject or answer a single question, but long enough to make sense on its own and be authored as a unit [5]. Topics can nest other topics.

Maps

DITA maps are documents that collect and organize references to DITA topics to indicate the relationships among the topics. They can also serve as outlines or tables of contents for DITA deliverables and as build manifests for DITA projects. Like topics, DITA maps can nest other DITA maps.

DITA topics use the file extension .dita or .xml, while DITA maps use .ditamap.
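
As a rough illustration of a map acting as a table of contents, the Python sketch below embeds a small map and walks its topicref elements. The file names, the map title, and the function name outline are hypothetical; the map and topicref elements and the href attribute come from the DITA map document type.

```python
import xml.etree.ElementTree as ET

# A hypothetical map: nesting of <topicref> elements expresses hierarchy.
MAP_XML = """
<map title="Printer Guide">
  <topicref href="overview.dita">
    <topicref href="install.dita"/>
    <topicref href="replace_cartridge.dita"/>
  </topicref>
  <topicref href="specs.dita"/>
</map>
"""


def outline(element, depth=0):
    """Return (depth, href) pairs for every topicref, in reading order."""
    entries = []
    for ref in element.findall("topicref"):
        entries.append((depth, ref.get("href")))
        # Recurse so that nested topicrefs appear under their parent.
        entries.extend(outline(ref, depth + 1))
    return entries


toc = outline(ET.fromstring(MAP_XML))
for depth, href in toc:
    print("  " * depth + href)
```

The topics themselves are untouched by this arrangement, so the same topic files could be listed in a different map with a different order or hierarchy.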

The Topic

The Topic-type topic is the root type of all topics in DITA. From this base, topics are specialized into Concept, Task, and Reference types. The root element of a topic is <topic>; when a single file holds several topics, <dita> is used as the root element instead.

Structure

All DITA topics must have an ID and a title, and most also have a body. Topic structures can consist of the following parts:

Topic element - Required unique ID attribute, contains all other elements.

Title - The subject of the topic. Topic structures may also include alternate titles. Alternate titles may be used to provide different text for navigation or search. When not provided, the base title is used for all contexts.

Short description - A short description of the topic. Used both in topic content and in generated summaries that include the topic.

Prolog - Container for various kinds of topic metadata, such as change history, audience, product, and so on. Not to be confused with the XML document prolog.

Body - The actual topic content: paragraphs, lists, sections, and other elements the information type allows. Typical body content can include:

  • Sections and examples - Sections and examples can be contained only by the body of a topic. They cannot nest. They can contain block-level elements like paragraphs, phrase-level elements like API names, or text.

  • Block-level elements - Paragraphs, lists, and tables are kinds of "block" elements. As a class of content, they can contain other blocks, phrases, or text, though the rules vary for each structure.

  • Phrases and keywords - Authors can mix markup with text when they need to identify parts of a paragraph or even parts of a sentence as having special significance. Phrases can contain other phrases and keywords as well as text.

  • Images and multimedia - Authors use the image element to insert images as block elements or inline in the text. Authors can create multimedia for online information using the object element.

Related links - Links to other topics. When an author creates a link as part of a topic, the topic becomes dependent on the other topic being available. To reduce dependencies between topics, authors can use DITA maps to define and manage links between topics instead of embedding links directly in each related topic.

Nested topics - Topics can be defined inside other topics. Nesting can result in complex topics that are less reusable and should be used carefully.
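
The parts listed above can be seen together in a small example. The following Python sketch embeds a hypothetical topic (its content, ID, and linked file name are invented) and reports which parts are present, checking only for the id attribute and the title. The function name check_topic is an assumption for illustration, not part of any DITA tool.

```python
import xml.etree.ElementTree as ET

# A hypothetical topic showing the main structural parts in order.
TOPIC_XML = """
<topic id="cartridge_overview">
  <title>About the cartridge</title>
  <shortdesc>The cartridge holds the toner supply.</shortdesc>
  <prolog>
    <metadata><audience type="user"/></metadata>
  </prolog>
  <body>
    <p>Handle the cartridge by its <ph>side grips</ph> only.</p>
    <section>
      <title>Storage</title>
      <p>Store cartridges flat.</p>
    </section>
  </body>
  <related-links>
    <link href="replace_cartridge.dita"/>
  </related-links>
</topic>
"""


def check_topic(topic_xml):
    """Return (names of top-level parts, list of problems found)."""
    root = ET.fromstring(topic_xml)
    problems = []
    if not root.get("id"):
        problems.append("missing id attribute")
    if root.find("title") is None:
        problems.append("missing title")
    parts = [child.tag for child in root]
    return parts, problems


parts, problems = check_topic(TOPIC_XML)
```

A real project would validate against the DITA DTD or schema rather than hand-written checks; the sketch is only meant to show how the parts fit inside the topic element.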

Common Attributes

Attributes are broken down into three major areas:

  • Identity Attributes - provide a means for identifying content for retrieval or linking. They include element IDs and content references.

  • Metadata Attributes - provide additional information about the content that can be used to flag, filter, and modify content at run time. These attributes include properties for version, audience, platform, etc.

  • Architectural Attributes - are used to provide a mechanism for specialization. They do not typically appear in the authored document and are instead included in the DTD or schema declarations. These attributes include special properties for class, domain, and namespace.

Other miscellaneous attributes are used for language and output specifications.
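
To give a feel for how the architectural class attribute supports specialization, here is a minimal Python sketch. In DITA the class value lists an element's ancestry from most general to most specific, for example "- topic/ol task/steps " for the steps element of a task. The function names is_a and fallback_type are hypothetical, and the class values are passed in as plain strings because they normally come from DTD defaults rather than from the authored file.

```python
def class_tokens(class_value):
    """Split a DITA class value into module/element tokens.

    The first token is the "-" or "+" marker and is dropped.
    """
    return class_value.split()[1:]


def is_a(class_value, ancestor):
    """True if the element is, or is specialized from, the given type."""
    return ancestor in class_tokens(class_value)


def fallback_type(class_value, known_types):
    """Return the most specific type in the ancestry that a tool understands."""
    for token in reversed(class_tokens(class_value)):
        if token in known_types:
            return token
    return None


STEPS_CLASS = "- topic/ol task/steps "

# A tool that knows only the base topic vocabulary can still process
# task steps by treating them as an ordered list.
generic = fallback_type(STEPS_CLASS, {"topic/ol", "topic/p"})
```

This fallback is what allows content written with a specialized vocabulary to be processed by tools that were built only for the base types.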

Identity Attributes

Element ID

The ID attribute in DITA is used in two different contexts, depending on the type of element it is associated with. When associated with a topic-level element (topic, concept, task, or reference), the ID must be unique within the document. It is referred to as a topic ID and is declared as XML type "ID" in the DTD or schema. IDs on elements inside a topic are not declared as type "ID", because they need to be unique only within their containing topic. This type of ID is referred to as an element ID. The topic ID and element ID are used together to identify specific instances of elements for linking or reuse.
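
A short Python sketch may help show how the two IDs work as a pair. DITA references take the form file#topicid/elementid; the file name, IDs, content, and the function names split_reference and resolve below are invented for illustration. Note that both topics contain an element with the ID "warning", which is allowed because element IDs only need to be unique inside their own topic.

```python
import xml.etree.ElementTree as ET

# Two hypothetical topics in one file, each with an element id "warning".
DOC_XML = """
<dita>
  <topic id="install">
    <title>Install</title>
    <body><p id="warning">Unplug the printer first.</p></body>
  </topic>
  <topic id="clean">
    <title>Clean</title>
    <body><p id="warning">Let the printer cool down.</p></body>
  </topic>
</dita>
"""


def split_reference(ref):
    """Split 'file#topicid/elementid' into (file, topic_id, element_id)."""
    path, _, fragment = ref.partition("#")
    topic_id, _, element_id = fragment.partition("/")
    return path, topic_id or None, element_id or None


def resolve(root, topic_id, element_id):
    """Find the element with element_id inside the topic with topic_id."""
    for topic in root.iter("topic"):
        if topic.get("id") != topic_id:
            continue
        for element in topic.iter():
            if element is not topic and element.get("id") == element_id:
                return element
    return None


root = ET.fromstring(DOC_XML)
_, topic_id, element_id = split_reference("printer.dita#clean/warning")
target = resolve(root, topic_id, element_id)
```

The element ID alone would be ambiguous here; it is the combination with the topic ID that selects the right paragraph.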

Content Reference

The DITA conref (content reference) attribute provides a mechanism for reuse of content fragments. The conref attribute stores a reference to another element and is processed to replace the referencing element with the referenced element.
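
The replacement step can be sketched in a few lines of Python. This is a simplified illustration limited to references within a single topic, not how any particular DITA processor works; the content, IDs, and the function name resolve_conrefs are hypothetical, and the copied element's ID is dropped here simply to avoid duplicates.

```python
import copy
import xml.etree.ElementTree as ET

# A hypothetical topic in which the second paragraph reuses the first.
TOPIC_XML = """
<topic id="safety">
  <title>Safety notes</title>
  <body>
    <p id="unplug">Unplug the printer before servicing it.</p>
    <p conref="#safety/unplug"/>
  </body>
</topic>
"""


def resolve_conrefs(root):
    """Replace each element carrying a conref with a copy of its target."""
    topic_id = root.get("id")
    # Index reusable elements by "topicid/elementid".
    targets = {
        f"{topic_id}/{el.get('id')}": el
        for el in root.iter()
        if el is not root and el.get("id")
    }
    # Take a snapshot of the tree so it can be modified safely in the loop.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            ref = child.get("conref")
            if ref is None:
                continue
            target = targets.get(ref.lstrip("#"))
            if target is None:
                continue  # leave unresolved references in place
            replacement = copy.deepcopy(target)
            replacement.attrib.pop("id", None)  # avoid a duplicate ID
            replacement.tail = child.tail       # preserve surrounding whitespace
            parent[index] = replacement
    return root


resolved = resolve_conrefs(ET.fromstring(TOPIC_XML))
texts = [p.text for p in resolved.iter("p")]
```

After processing, both paragraphs carry the same sentence, while the source still holds it in only one place.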

Metadata Attributes

Metadata attributes capture information about the topic or fragment, including (i) importance, (ii) status, and (iii) revision. Other attributes capture information about the intended users of the topic or fragment, including (i) audience, (ii) platform, (iii) product, and (iv) props. Any combination of these attributes can be used to control how content is rendered using style sheets. Multiple values for an attribute are separated by spaces (e.g., platform="Win2003 LINUX").

Conditional Processing

DITA tries to implement conditional processing in a semantically meaningful way. Rather than allowing arbitrary values to accumulate in a document, authors are encouraged to use specific metadata attributes on content. Any number of processes can then use these metadata values, not only to filter conditional content but also for flagging, search, and indexing.
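
As a minimal sketch of filtering, the Python below removes elements whose platform attribute does not include the platform being published. The topic content and the function name filter_by_platform are hypothetical; the platform attribute and its space-separated values follow the description above. Elements without the attribute are treated as applying to every platform.

```python
import xml.etree.ElementTree as ET

# A hypothetical topic with platform-specific paragraphs.
TOPIC_XML = """
<topic id="setup">
  <title>Setup</title>
  <body>
    <p>Connect the cable.</p>
    <p platform="Win2003 LINUX">Install the driver package.</p>
    <p platform="LINUX">Add your user to the print group.</p>
  </body>
</topic>
"""


def filter_by_platform(root, wanted):
    """Remove elements whose platform values do not include `wanted`."""
    # Take a snapshot of the tree so elements can be removed safely.
    for parent in list(root.iter()):
        for child in list(parent):
            value = child.get("platform")
            # Space-separated values: the element applies to each one listed.
            if value is not None and wanted not in value.split():
                parent.remove(child)
    return root


windows_build = filter_by_platform(ET.fromstring(TOPIC_XML), "Win2003")
windows_text = [p.text for p in windows_build.iter("p")]
```

The same attribute values could just as well drive flagging or indexing instead of removal, which is the point of keeping the metadata semantic rather than tied to one output.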


End Notes

[1] DITA Language Specification: http://docs.oasis-open.org/dita/v1.0/langspec

[2] DITA: An XML-based Technical Documentation Authoring and Publishing Architecture, Priestley/Hargis/Carpenter, Society for Technical Communication, Technical Communication, Vol. 48, No. 3, Aug 2001.

[3] Minimalism Beyond the Nurnberg Funnel, Carroll, Massachusetts Institute of Technology, 1998.

[4] OASIS DITA charter.

[5] OASIS Darwin Information Typing Architecture (DITA) Architectural Specification v1.0, OASIS Standard, 09 May 2005.

Editor's Note: Materials for this guide were developed by SiberLogic, Inc. and are based on their DITA pocket guide. We'll have the next part of "Everything you wanted to know about DITA" in the next issue of DCL News. If you're a subscriber, it will be automatically delivered to your inbox next month. If you're not a subscriber, you can request a subscription at http://www.dclab.com/request_subscription.asp. If you can't wait, or want the complete version of this guide in a handy desktop reference format, visit http://www.siberlogic.com/dita_dcl/ to request your copy.

DCLNews Editorial
July 2007

Data Conversion Laboratory, Inc.   61-18 190th St., 2nd Floor, Fresh Meadows, NY 11365   718-357-8700   convert@dclab.com

Copyright © 1997-2008  Data Conversion Laboratory, Inc. All rights reserved.