Converting to DITA

Everything you wanted to know (but were afraid to ask) (Part 1)

Editor's Note: While many in the technical communications field are very familiar with DITA, we suspect that many others are not yet up to speed and could use a refresher. This is the first part of a two-part series on the basics of DITA. (With thanks to SiberLogic, Inc., who prepared materials for this reference documentation.)

Introduction to DITA: From GML (Generalized Markup Language) to DITA

The evolution of structured authoring

Purpose

This DITA Guide is intended to serve as an easy-to-use reference to help improve your understanding of Darwin Information Typing Architecture (DITA). It is not a definitive reference on the subject.

Audience

This guide is targeted at technical communicators and information architects getting started with DITA. It assumes a passing familiarity with XML.

Version

This guide is current as of version 1.0 of the OASIS DITA standard [1].

What is DITA?

DITA Defined

DITA (Darwin Information Typing Architecture) is a topic-based XML framework used to support the authoring, management, and publication of technical documentation [2].

What's in a name?

Darwin: Applies theories of evolution through specialization and inheritance to information types.

Information Typing: Central to DITA is its focus on authoring based upon classified units of information.

Architecture: Goes beyond structured XML to capture best practices for both extensive design and process.

Single Source Publishing with XML

What is XML (eXtensible Markup Language)

XML is a W3C-recommended general-purpose markup language capable of describing many different kinds of data. Its primary purpose is to facilitate the sharing of data across different, potentially disparate systems.

Advantages of using XML for structured content

XML allows authors to separate form from content, so the same content can be reused across a wider range of information products. Because the content carries specific markup, software can retrieve and process it. Using a non-proprietary format can also reduce costs.

Challenges of using XML

Prior to DITA, authors were unable to take full advantage of these benefits. Markup was either too specific to be shared or too generic to be truly useful.

The DITA Solution

DITA introduces several innovations for making XML more useful for authors.

Rich element tagging

Content can be tagged by information type (procedure) instead of by format (list).

Extensible XML for organizations

Specialization makes it possible to share content across different domains. Because every specialized element maps back to a common base type, organizations that adopt DITA can exchange and access each other's content.

Better opportunities for reuse

DITA's topic-based orientation and reference mechanisms make reuse easier.

Captures best practices for Technical Communication

DITA was designed for technical publications and employs concepts such as Information Mapping® and Minimalism [3] in its architecture.

Brief history of DITA

The DITA architecture and DTD were designed during 1999 and 2000 by a cross-company workgroup representing user assistance teams from across IBM. By 2001, DITA was being rolled out across North America.

To reduce barriers to adoption and promote its growth, IBM contributed DITA to the Organization for the Advancement of Structured Information Standards (OASIS) in 2004. In 2005, the OASIS DITA Technical Committee, consisting of members from across the industry, succeeded in having DITA ratified as an official OASIS standard.

The DITA Standard

DITA version 1.0

The Darwin Information Typing Architecture (DITA) specification defines both a set of document types for authoring and organizing topic-oriented information and a set of mechanisms for combining and extending document types using a process called specialization.

The specification consists of:

  • The DTDs and schemas that define DITA markup for the base DITA document types, as well as catalog files
  • The language reference that provides explanations for each element in the base DITA document types

For more information, visit http://docs.oasis-open.org/dita/v1.0/ditaref-type.toc.html

OASIS DITA Technical Committee

OASIS (Organization for the Advancement of Structured Information Standards) is a not-for-profit international consortium that drives the development, convergence, and adoption of e-business standards. Founded in 1993, OASIS has more than 5,000 participants representing over 600 organizations and individual members in 100 countries.

The DITA Technical Committee was founded [4] in March 2004 to refine the Darwin Information Typing Architecture and to promote the use of the architecture for creating standard information types and domain-specific markup vocabularies.

DITA Specification: Information Types
Topic-Concept-Task-Reference

Components

The two primary components of DITA documents are Topics and Maps.

Topics

A topic is a unit of information with a title and content, short enough to be specific to a single subject or answer a single question, but long enough to make sense on its own and be authored as a unit [5]. Topics can nest other topics.

Maps

DITA maps are documents that collect and organize references to DITA topics to indicate the relationships among the topics. They can also serve as outlines or tables of contents for DITA deliverables and as build manifests for DITA projects. Like topics, DITA maps can nest other DITA maps.

DITA topics use the file extension .dita or .xml, while DITA maps use .ditamap.
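
To make the outline role of a map concrete, here is a minimal sketch in Python that parses a small map and prints the topic hierarchy it describes. The map title, the file names, and the outline() helper are invented for illustration; a real DITA processor does considerably more than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical map: the title and every href are made up for this example.
MAP_XML = """
<map title="Widget User Guide">
  <topicref href="overview.dita">
    <topicref href="installing.dita"/>
    <topicref href="configuring.dita"/>
  </topicref>
  <topicref href="troubleshooting.dita"/>
</map>
"""


def outline(element, depth=0):
    """Return the referenced files as an indented outline, one line per topicref."""
    lines = []
    for ref in element.findall("topicref"):
        lines.append("  " * depth + ref.get("href"))
        # Nested topicref elements become deeper outline levels.
        lines.extend(outline(ref, depth + 1))
    return lines


toc = outline(ET.fromstring(MAP_XML.strip()))
print("\n".join(toc))
```

The nesting of the topicref elements, not anything inside the topics themselves, determines the hierarchy, which is why the same topics can appear in several maps with different structures.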

The Topic

The Topic-type topic is the root type of all topics in DITA. From the DITA base, topics are specialized into Concept-, Task-, and Reference-types. The root element of a Topic is <topic>; a file that holds several topics can instead use <dita> as its root element.

Structure

All DITA topics must have an ID, a title, and a body. Topic structures can consist of the following parts:

Topic element - Required unique ID attribute, contains all other elements.

Title - The subject of the topic. Topic structures may also include alternate titles. Alternate titles may be used to provide different text for navigation or search. When not provided, the base title is used for all contexts.

Short description - A short description of the topic. Used both in topic content and in generated summaries that include the topic.

Prolog - Container for various kinds of topic metadata, such as change history, audience, product, and so on. Not to be confused with the XML document prolog.

Body - The actual topic content: paragraphs, lists, sections, and other elements the information type allows. Typical body content can include:

  • Sections and examples - Sections and examples can be contained only by the body of a topic. They cannot nest. They can contain block-level elements like paragraphs, phrase-level elements like API names, or text.

  • Block-level elements - Paragraphs, lists, and tables are kinds of "block" elements. As a class of content, they can contain other blocks, phrases, or text, though the rules vary for each structure.

  • Phrases and keywords - Authors can mix markup with text when they need to identify parts of a paragraph or even parts of a sentence as having special significance. Phrases can contain other phrases and keywords as well as text.

  • Images and multimedia - Authors use the image element to insert images as block elements or inline in the text. Authors can create multimedia for online information using the object element.

Related links - Links to other topics. When an author creates a link as part of a topic, the topic becomes dependent on the other topic being available. To reduce dependencies between topics, authors can use DITA maps to define and manage links between topics instead of embedding links directly in each related topic.

Nested topics - Topics can be defined inside other topics. Nesting can result in complex topics that are less reusable and should be used carefully.
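
The parts listed above can be seen together in a small example. The following Python sketch embeds a hypothetical topic (its id, titles, and text are invented) and reports which parts it contains, in document order. The summarize_topic() helper is an illustration only and is not part of any DITA toolkit.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic: id, titles, text, and the linked file name are invented.
TOPIC_XML = """
<topic id="install-overview">
  <title>Installing the widget</title>
  <shortdesc>Summarizes what you need before installing.</shortdesc>
  <prolog>
    <metadata>
      <audience type="administrator"/>
    </metadata>
  </prolog>
  <body>
    <p>Check the <keyword>system requirements</keyword> first.</p>
    <section>
      <title>Before you begin</title>
      <p>Back up your configuration.</p>
    </section>
  </body>
  <related-links>
    <link href="requirements.dita"/>
  </related-links>
</topic>
"""


def summarize_topic(xml_text):
    """Return the root element name, topic ID, title, and top-level parts."""
    root = ET.fromstring(xml_text.strip())
    return {
        "root": root.tag,
        "id": root.get("id"),
        "title": root.findtext("title"),
        "parts": [child.tag for child in root],
    }


summary = summarize_topic(TOPIC_XML)
print(summary)
```

Note that the keyword element sits inside a paragraph (a phrase inside a block) and the section sits directly inside the body, matching the containment rules described above.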

Common Attributes

Attributes are broken down into three major areas:

  • Identity Attributes - provide a means for identifying content for retrieval or linking. They include element IDs and content references.

  • Metadata Attributes - provide additional information about the content that can be used to flag, filter, and modify content at run time. These attributes include properties for version, audience, platform, etc.

  • Architectural Attributes - are used to provide a mechanism for specialization. They do not typically appear in the authored document and are instead included in the DTD or schema declarations. These attributes include special properties for class, domain, and namespace.

Other miscellaneous attributes are used for language and output specifications.
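
As an illustration of how the architectural class attribute supports specialization, the sketch below finds elements by their base type rather than by their element name. The class values are written out by hand here so the example is self-contained; in practice they are normally supplied as defaults by the DTD or schema. The task content and the elements_of_type() helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical task. The class attributes are spelled out only for this
# example; authors do not usually type them.
TASK_XML = """
<task id="replace-filter" class="- topic/topic task/task ">
  <title class="- topic/title ">Replace the filter</title>
  <taskbody class="- topic/body task/taskbody ">
    <steps class="- topic/ol task/steps ">
      <step class="- topic/li task/step ">
        <cmd class="- topic/ph task/cmd ">Open the cover.</cmd>
      </step>
    </steps>
  </taskbody>
</task>
"""


def elements_of_type(root, module_and_name):
    """Return tag names of elements whose class lists the given base type."""
    return [
        element.tag
        for element in root.iter()
        if module_and_name in element.get("class", "").split()
    ]


task = ET.fromstring(TASK_XML.strip())
list_items = elements_of_type(task, "topic/li")
bodies = elements_of_type(task, "topic/body")
print(list_items, bodies)
```

A processor that only knows how to handle a generic list item can therefore still handle a step, because the step declares its ancestry in its class value.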

Identity Attributes

Element ID

The ID attribute in DITA is used in two different contexts, depending on the type of element it is associated with. On a topic-level element (topic, concept, task, or reference), the ID is called a topic ID. It is declared as type "ID" in the DTD or schema, so it must be unique within the XML document. IDs on elements inside a topic are not declared as type "ID", because they do not need to be unique outside their resident topic. This type of ID is referred to as an element ID. The topic ID and element ID are used together to identify specific instances of elements for linking or reuse.
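
References that combine the two IDs are written as a file name, a "#", the topic ID, a "/", and the element ID. The short Python sketch below splits such a reference into its parts. The split_dita_ref() helper and the sample file and ID names are hypothetical.

```python
def split_dita_ref(ref):
    """Split 'file#topicid/elementid' into (file, topic ID, element ID).

    Parts that are absent from the reference are returned as None.
    """
    path, _, fragment = ref.partition("#")
    topic_id, _, element_id = fragment.partition("/")
    return path, topic_id or None, element_id or None


# The file name and IDs below are invented for illustration.
print(split_dita_ref("install.dita#install-overview/prereqs"))
print(split_dita_ref("install.dita#install-overview"))
print(split_dita_ref("install.dita"))
```

Because the element ID is always qualified by a topic ID, two topics can each contain an element with the ID "prereqs" without any ambiguity.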

Content Reference

The DITA conref (content reference) attribute provides a mechanism for reuse of content fragments. The conref attribute stores a reference to another element and is processed to replace the referencing element with the referenced element.
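
The replacement step can be sketched in a few lines of Python. This is a deliberately simplified resolver, not how any particular DITA processor works: it handles only references of the form file#topicid/elementid, assumes each source file has a single topic as its root, and ignores the attribute-merging rules a real processor applies. All file names, IDs, and text are invented.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical library topic that holds a reusable caution.
LIBRARY_XML = """
<topic id="warnings">
  <title>Reusable warnings</title>
  <body>
    <note id="power-off" type="caution">Disconnect power before servicing.</note>
  </body>
</topic>
"""

# Hypothetical topic that pulls the caution in by reference.
TASK_XML = """
<topic id="replace-fan">
  <title>Replacing the fan</title>
  <body>
    <note conref="warnings.dita#warnings/power-off"/>
    <p>Remove the four screws on the rear panel.</p>
  </body>
</topic>
"""


def resolve_conrefs(root, sources):
    """Replace each element that has a conref with a copy of its target.

    `sources` maps a file name to that file's parsed root topic.
    """
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            ref = child.get("conref")
            if ref is None:
                continue
            path, _, fragment = ref.partition("#")
            topic_id, _, element_id = fragment.partition("/")
            source = sources[path]
            if source.get("id") != topic_id:
                raise KeyError(ref)
            # First element in the source topic carrying the requested ID.
            target = next(e for e in source.iter() if e.get("id") == element_id)
            replacement = copy.deepcopy(target)
            replacement.tail = child.tail  # keep surrounding whitespace
            parent[index] = replacement
    return root


resolved = resolve_conrefs(
    ET.fromstring(TASK_XML.strip()),
    {"warnings.dita": ET.fromstring(LIBRARY_XML.strip())},
)
print(ET.tostring(resolved, encoding="unicode"))
```

The referencing note and the referenced note are the same element type, which reflects the general idea that a content reference pulls in content that is valid at the point of use.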

Metadata Attributes

Metadata attributes capture information about the topic or fragment, including (i) importance, (ii) status, and (iii) revision. Other attributes capture applicable user information about the topic or fragment, including (i) audience, (ii) platform, (iii) product, and (iv) otherprops. Any combination of these attributes can be used to control how content is rendered using style sheets. Spaces separate multiple values within an attribute (e.g., platform="Win2003 LINUX").

Conditional Processing

DITA tries to implement conditional processing in a semantically meaningful way. Rather than allowing arbitrary values to accumulate in a document, authors are encouraged to use specific metadata attributes on content. Processes can then use these metadata values for more than filtering conditional content, for example for flagging, search, and indexing.
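
As a rough illustration of filtering on a metadata attribute, the Python sketch below removes elements whose platform values are all excluded. The rule that an element is dropped only when every one of its values is excluded is an assumption made for this example, as are the sample content and the filter_by() helper; actual behavior depends on the processing tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical body content with space-separated platform values.
BODY_XML = """
<body>
  <p>Download the installer.</p>
  <p platform="Win2003">Run setup.exe.</p>
  <p platform="LINUX">Run ./install.sh.</p>
  <p platform="Win2003 LINUX">Restart the service.</p>
</body>
"""


def filter_by(root, attribute, excluded):
    """Remove elements whose values for `attribute` are all in `excluded`.

    Elements without the attribute are always kept. Text that follows a
    removed element (its tail) is dropped with it in this sketch.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            values = child.get(attribute, "").split()
            if values and all(value in excluded for value in values):
                parent.remove(child)
    return root


linux_build = filter_by(ET.fromstring(BODY_XML.strip()), "platform", {"Win2003"})
texts = [p.text for p in linux_build.findall("p")]
print(texts)
```

The same attribute values could just as easily drive flagging or indexing instead of removal, which is the point of keeping the metadata semantic rather than tied to one output.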


End Notes

[1] DITA Language Specification: http://docs.oasis-open.org/dita/v1.0/langspec

[2] DITA: An XML-based Technical Documentation Authoring and Publishing Architecture, Priestley/Hargis/Carpenter, Society for Technical Communication, Technical Communication, Vol. 48, No. 3, Aug 2001.

[3] Minimalism Beyond the Nurnberg Funnel, Carroll, Massachusetts Institute of Technology, 1998.

[4] OASIS DITA charter.

[5] OASIS Darwin Information Typing Architecture (DITA) Architectural Specification v1.0, OASIS Standard, 09 May 2005.

Editor's Note: Materials for this guide were developed by SiberLogic, Inc. and are based on their DITA pocket guide. The next part of "Everything you wanted to know about DITA" will appear in the next issue of DCLnews. If you're a subscriber, it will be delivered automatically to your inbox next month. If you're not a subscriber, you can request a subscription at http://www.dclab.com/request_subscription.asp. If you can't wait, or want the complete version of this guide in a handy desktop reference format, visit http://www.siberlogic.com/dita_dcl/ to request your copy.

DCLnews Editorial
July 2007

 