DITA Conversion: Striving for Success - A Quick Reference

By Noz Urbina, Mekon, Ltd.

This article will cover (briefly!):

  • Choosing your content set pragmatically
  • Issues with Converting to DITA (focus on FrameMaker and Word)
  • Balancing legacy with converted content


Choosing your content set pragmatically

For those companies we've worked with at Mekon, the migration to DITA has been a gradual one that has not started out with full adoption of all principles of modular authoring. Many will convert their unstructured content to large nested topics that replicate the originals. This is sometimes just 'the way it has to be'.

Success often comes from scoping your project properly. A lot of people see 'project scoping' as overhead that delays 'production', but it's a classic example of 'measure twice, cut once'. Large organisations with recent growth and acquisitions should measure several times! Select a first docset that is in every way 'middle ground', i.e. one that is a good example of the documents you are most likely to convert. Here are some do's and don'ts to guide you in choosing your first docset.

  • Do NOT use your biggest, baddest, most complex docs. Capture the majority of the content in automation; don't deal with every weird edge case.
  • Do NOT choose 'flagship' products or upcoming 'secret weapons' for your first project. If there are management microscopes on your efforts, even reasonable, acceptable setbacks could scuttle your entire initiative.
  • DO choose a representative sampling from various docs/teams to work out metadata and reuse strategies before converting. DITA implies new top-level organisational structures and metadata, which you don't want to have to apply retrospectively. These choices will also affect a new CMS, style sheets, etc.
  • DO prototype your conversions. Seeing some output always helps perfect the conversion spec.

Issues with Converting to DITA

Inconsistent formats in source documents make conversion processes fail more often than not, thus requiring more manual fixing. Here are some of the issues that my colleagues and I at Mekon have found in the work we have done for clients that might help you assess your situation and more accurately project costs and timelines.

Converting existing unstructured documents to anything other than a basic DITA topic is more difficult, because the content has to be restructured to fit the specialised DTDs. For example, companies might have something that is essentially a task, but the formatting in the unstructured documents does not match the DTD. There might be a different structural nesting of the content, or it might contain their own very specific formatting, which is meaningful for them but has no convenient DITA equivalent. Removing that formatting without mapping it to a new DITA element would mean losing metadata.

As a solution to the nesting problem, Mekon has worked on a number of projects where we have written "clean up" macros in Microsoft Word to ensure that the source documents comply with a given set of rules. Conversion success is then much higher and in some cases can be 100%. You should investigate what skills and budget are available for this.
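
To illustrate the kind of rule check such a clean-up pass might enforce, here is a minimal sketch, written in Python rather than as a Word macro. The style names, the allowed-style set and the two rules are all hypothetical; a real project would derive them from its own templates and conversion spec.

```python
# Hypothetical rule set: real style names come from your own templates.
ALLOWED_STYLES = {"Heading 1", "Heading 2", "Heading 3", "Body Text", "Step", "Note"}


def heading_level(style):
    """Return the heading level for styles like 'Heading 2', else None."""
    if style.startswith("Heading "):
        suffix = style[len("Heading "):]
        if suffix.isdigit():
            return int(suffix)
    return None


def check_paragraphs(paragraphs):
    """Check (style, text) pairs against two illustrative rules.

    Rule 1: every paragraph uses a style from ALLOWED_STYLES.
    Rule 2: headings never skip a level when nesting deeper
            (no Heading 1 followed directly by Heading 3).
    Returns a list of (paragraph_number, message) problems.
    """
    problems = []
    last_level = 0
    for number, (style, _text) in enumerate(paragraphs, start=1):
        if style not in ALLOWED_STYLES:
            problems.append((number, f"unknown style '{style}'"))
            continue
        level = heading_level(style)
        if level is not None:
            if level > last_level + 1:
                problems.append(
                    (number, f"heading jumps from level {last_level} to {level}")
                )
            last_level = level
    return problems


sample = [
    ("Heading 1", "Installing the unit"),
    ("Heading 3", "Before you begin"),
    ("Body Text", "Check the contents of the box."),
    ("MyBold", "Warning!"),
]
for number, message in check_paragraphs(sample):
    print(f"paragraph {number}: {message}")
```

Running a report like this over a sample docset before conversion gives a rough count of how much manual clean-up the sources would need.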

Adobe FrameMaker Conversion Tables have several limitations that mean each specialisation cannot be supported automatically. They do not support actual scripting logic and have inherent issues, such as the fact that Context Rules in the Conversion Tables cannot apply to more than one instance. Therefore an unstructured chapter that has a mix of task, concept and reference content could not be converted into the appropriate DITA DTDs. We have tried several approaches to resolve this without a lot of success, so you'll want to check for this. The best solution found was to map everything to the base DITA elements and apply DITA specialised attributes to those elements.
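
The fallback described above, mapping everything to base DITA elements and recording the intended type in an attribute, might look something like the following sketch. It uses Python's standard `xml.etree.ElementTree` rather than FrameMaker Conversion Tables. The style names, and the choice of the `outputclass` attribute to carry the information type, are assumptions made for illustration and are not necessarily what was used on the projects described.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: source paragraph style -> (base DITA element, type label).
STYLE_MAP = {
    "Body Text": ("p", "concept"),
    "Step": ("p", "task"),
    "Spec": ("p", "reference"),
}


def build_topic(topic_id, title, paragraphs):
    """Build a generic DITA <topic> from (style, text) pairs.

    Each paragraph becomes a base <p>; the intended information type is kept
    in outputclass (an assumed convention) so it is not lost and can be
    refined into a specialised topic later.
    """
    topic = ET.Element("topic", id=topic_id)
    ET.SubElement(topic, "title").text = title
    body = ET.SubElement(topic, "body")
    for style, text in paragraphs:
        # Unmapped styles fall back to <p>, keeping the style name as the label.
        element, label = STYLE_MAP.get(style, ("p", style))
        ET.SubElement(body, element, outputclass=label).text = text
    return topic


topic = build_topic(
    "install",
    "Installing the unit",
    [("Body Text", "The unit ships in two boxes."), ("Step", "Open the larger box.")],
)
print(ET.tostring(topic, encoding="unicode"))
```

Keeping the original style name on unmapped paragraphs means formatting with no DITA equivalent is at least preserved as a label instead of being silently dropped.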

Balancing Legacy Content with New

In most projects, you will convert only a portion of your content set, and somewhere, something will be sharing content with your shiny new DITA files or CMS, but will be in the old format. This is a very tough one to generalise, but as a guideline: the older version is king.

It seems counter-intuitive to focus on the legacy copy. However, you've got a conversion process that runs in one direction only. Therefore, updating the old source and then reconverting and tweaking as necessary will usually give better results. If the changes are minor (less than 5-10%, depending on the size of the topics), consider just making them manually in both the legacy and the DITA instance, but obviously this is more error-prone. The decision should be made based on how good your conversion process is. If you have a high level of automation, leverage it.
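
One rough way to apply the 5-10% guideline is to measure how much of a legacy topic has changed and pick a route from that. The sketch below uses Python's standard `difflib`; the 10% threshold, the line-based measure and the function names are assumptions for illustration only, and the right cut-off depends on how reliable your own conversion process is.

```python
import difflib


def changed_fraction(old_lines, new_lines):
    """Approximate share of the text that differs, from 0.0 to 1.0."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return 1.0 - matcher.ratio()


def update_route(old_lines, new_lines, threshold=0.10):
    """Suggest how to carry an update across the legacy and DITA copies."""
    if changed_fraction(old_lines, new_lines) <= threshold:
        return "edit both copies by hand"
    return "update legacy source and reconvert"


# A 20-line topic with one revised line.
old = [f"Step {n}" for n in range(1, 21)]
new = list(old)
new[4] = "Step 5 (revised)"
print(update_route(old, new))
```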

In short, the unfortunate truth is that planning your conversion is always helpful, and should be part of your overall content strategy review. If you've any questions, I'm always happy to discuss!

About the Author

Noz Urbina is Business Development Manager for Mekon, Ltd., where he provides XML solutions consultancy services to global organisations and SMEs. With five years in mark-up technology, training and services, Noz's expertise is brought into projects for requirements analysis and to address issues of human interface design. His main interest area is mastering "the magic nexus" where business goals, end-user sensitivities, and technology must all synergise.

You can learn more from Noz at the upcoming X-Pubs 2008, London, June 22-24, where he will be presenting a workshop on How to Master Taxonomy and Information Architecture.

While you're there see DCL's own Mikhail Vaysbukh present on How to get the Most out of Content Migration to DITA. Get a 15% discount using Discount Code DCLX0108.

DCLnews Editorial
May 2008

 
Copyright © 1997-2010  Data Conversion Laboratory, Inc. All rights reserved.