By: Don Bridges & Mikhail Vaysbukh, Data Conversion Laboratory

DITA is a hot topic in the 'Tech Docs' arena, and for good reason. DITA is an open standard that addresses many of the needs of technical documentation producers, most notably content reuse. The big question for many companies, once they've determined that authoring in this new standard would be beneficial for them, is what to do with the treasure trove of existing documents, known as legacy documents. Would they be useful converted into DITA, and is it worth the effort?

We found that it is indeed worth it, and that documents can be converted at far lower cost than rewriting and re-authoring, getting you moving forward faster. Converting a stack of still-valuable "older" documents to your new DITA-based system could give you a big boost in getting started. But you do need to prepare in order for this to be a smooth process.

This article discusses five common problems we've seen in the course of doing more than twenty conversions to DITA XML, where we've taken traditionally written documents, reorganized them per DITA rules, incorporated subject matter expertise to assure proper tagging, and created finished DITA XML documents. DITA documents are structured differently than traditionally written documents, incorporate a lot more tagging, and show less tolerance for the creative approaches authors sometimes use to solve last-minute problems in getting a document out.

The potential problems identified below can be dealt with either before the conversion starts, while the writers can still work with the legacy authoring tools they are familiar with, or as part of the conversion process, using a combination of software and editorial rewrites (if the requirements are clearly defined).
Issue 1: Tables That Aren't Tables

Much formatting in traditional documents is about looks: how to make the document "look right" to best express the thought the writer is projecting. Table structures are therefore often used to line up information in a particular way and to present a certain look. This is especially true in HTML documents written for the web. In the example "note" below, a table might have been the easiest mechanism to align that light bulb image with the rest of the text.
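A layout table of this kind can often be spotted mechanically before conversion. The sketch below is only an illustration under stated assumptions: the HTML fragment, the `bulb.gif` file name, and the "one row, at most two cells, exactly one image" heuristic are all hypothetical, and a real review would still need a human to confirm each match.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape

# Hypothetical legacy markup: a one-row table used only to line up
# an icon with the note text.
LEGACY_HTML = """
<table>
  <tr>
    <td><img src="bulb.gif" alt="light bulb"></td>
    <td>This is an example of a note inside a table.</td>
  </tr>
</table>
"""


class TableProfile(HTMLParser):
    """Counts rows, cells and images in the markup fed to it and collects its text."""

    def __init__(self):
        super().__init__()
        self.rows = 0
        self.cells = 0
        self.images = 0
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows += 1
        elif tag in ("td", "th"):
            self.cells += 1
        elif tag == "img":
            self.images += 1

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def profile_of(html):
    """Parse the fragment and return its TableProfile."""
    profile = TableProfile()
    profile.feed(html)
    return profile


def looks_like_note(html):
    """Assumed heuristic: one row, at most two cells, exactly one image."""
    p = profile_of(html)
    return p.rows == 1 and p.cells <= 2 and p.images == 1


def to_dita_note(html):
    """Wrap the table's text in a DITA <note> element, escaping XML specials."""
    return "<note>" + escape(" ".join(profile_of(html).text)) + "</note>"


if looks_like_note(LEGACY_HTML):
    print(to_dita_note(LEGACY_HTML))
```

Tables that fail the heuristic are left alone, so genuine data tables would still go through the normal table conversion.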
We find that it's useful to review documents in advance, looking for such ambiguities ("is this a table or a note?"), and either apply an explicit tag or rewrite the text segment as follows:

Note: This is an example of a note inside a table.

Issue 2: Multiple steps within a single task topic.

Some documents may be authored to contain multiple procedures under a single heading. Because documents are usually broken up into DITA topics at the heading level, and the <task> topic does not allow multiple procedures under the same task, such documents present additional ambiguity and challenge during conversion to DITA XML. If the original layout and structure must be preserved, the first sequence of steps would need to be tagged as a list and the last sequence of steps would be tagged as <steps>. For example, in the sample text below, the need to use a list to tag the first group of steps could have been avoided if, before conversion, this section had been broken down into two sections: 1) Chain Removal, and 2) Chain Installation.

Chain Removal and Installation

Before you can install a new chain, you'll need to remove an old one. To do that follow the procedure below:
Next step is to install the new chain rivet:
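If the section is split into Chain Removal and Chain Installation before conversion, each procedure can become its own task topic with a single <steps> sequence. The following is a minimal sketch of that outcome, assuming Python's standard `xml.etree.ElementTree`; the `build_task` helper, the topic ids, and the placeholder step text are hypothetical and do not reproduce the original procedures.

```python
import xml.etree.ElementTree as ET


def build_task(task_id, title, commands):
    """Build one DITA <task> holding a single <steps> sequence."""
    task = ET.Element("task", id=task_id)
    ET.SubElement(task, "title").text = title
    taskbody = ET.SubElement(task, "taskbody")
    steps = ET.SubElement(taskbody, "steps")
    for command in commands:
        step = ET.SubElement(steps, "step")
        ET.SubElement(step, "cmd").text = command
    return task


# Placeholder step text (hypothetical): one topic per procedure.
removal = build_task(
    "chain_removal",
    "Chain Removal",
    ["(first removal step)", "(second removal step)"],
)
installation = build_task(
    "chain_installation",
    "Chain Installation",
    ["(first installation step)"],
)

for topic in (removal, installation):
    print(ET.tostring(topic, encoding="unicode"))
```

Because each topic carries exactly one <steps> element, neither procedure has to fall back to a generic list.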
Issue 3: Task/Procedure authored as a table in the input file

Tasks and procedures authored as tables in the source document add complexity to the conversion when they need to be deconstructed into <task>s with <step>s, since it is not clear, even to a human reader, in what order the paragraphs should be read. This would be better handled if all topics being converted to tasks were authored using a simpler flow.

Example 1.
Example 2.
Issue 4: Presence of untitled tasks / topics in the source and referencing only page numbers.

In most cases legacy manuals need to be "chunked" (i.e., broken down into smaller segments). For example, a typical document might be organized in "Chapters", while DITA requires that it be broken down into smaller topics based on the heading levels. This is normally done using the existing heading titles in the input document. Often, however, there are "implied topics" that are not explicitly identified, and there are references to page numbers that exist in the print version but cease to be useful in the DITA XML output. This increases the risk that not all topics will be correctly identified, and adds ambiguity to resolving cross-references to the untitled topics. For example, in the text you may have something like:
On page 121 you would find an untitled task like below:
This case would be better handled if the task on page 121 had been titled and referenced by title rather than by page number.

Issue 5: Having more than two levels of steps.

DITA only allows two levels of steps (<step> and <substep> below it), so when the source has more step levels, it is best re-authored to keep the number of step levels to a maximum of two. The best approach to re-authoring this kind of material depends on the individual case; possible options include using bulleted lists below the second level or re-authoring text to remove one level.
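The first option, bulleted lists below the second level, can be sketched as follows. This is an illustration only: the nested `(text, children)` representation, the sample step text, and the helper names `depth`, `to_steps`, and `add_list` are assumptions made for the example, and placing the list inside the substep's <info> element is one reasonable choice among several.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-level procedure; each item is (text, [children]).
LEGACY_STEPS = [
    ("Level 1 step", [
        ("Level 2 step", [
            ("Level 3 step A", []),
            ("Level 3 step B", []),
        ]),
    ]),
]


def depth(items):
    """Return the number of nesting levels in a list of (text, children) items."""
    if not items:
        return 0
    return 1 + max(depth(children) for _, children in items)


def add_list(parent, items):
    """Append a (possibly nested) <ul> for every level below <substep>."""
    ul = ET.SubElement(parent, "ul")
    for text, children in items:
        li = ET.SubElement(ul, "li")
        li.text = text
        if children:
            add_list(li, children)


def to_steps(items):
    """Render levels 1 and 2 as <step>/<substep>; deeper levels go into <info>."""
    steps = ET.Element("steps")
    for text, subs in items:
        step = ET.SubElement(steps, "step")
        ET.SubElement(step, "cmd").text = text
        if subs:
            substeps = ET.SubElement(step, "substeps")
            for sub_text, deeper in subs:
                substep = ET.SubElement(substeps, "substep")
                ET.SubElement(substep, "cmd").text = sub_text
                if deeper:
                    add_list(ET.SubElement(substep, "info"), deeper)
    return steps


if depth(LEGACY_STEPS) > 2:
    print("More than two step levels: lower levels become bulleted lists")
print(ET.tostring(to_steps(LEGACY_STEPS), encoding="unicode"))
```

Running `depth` over a source procedure before conversion gives a quick way to list the sections that need this kind of re-authoring decision.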
Wrap-up
As is true for any standardized approach, moving to DITA XML requires change in the organization in many ways. But if you determine that it's worth it for your organization, conversion of your legacy documents can give you a big head start at a cost much lower than re-authoring. Doing it right requires that you carefully review your documents in advance to make the conversion process as smooth as possible.
DCLNews Editorial
Data Conversion Laboratory, Inc. 61-18 190th St., 2nd Floor, Fresh Meadows, NY 11365 718-357-8700 convert@dclab.com Copyright © 1997-2008 Data Conversion Laboratory, Inc. All rights reserved.