Quark to XML Conversion

Over the past several years, QuarkXPress has become one of the more popular desktop publishing packages.  This should be no surprise: by combining ease of use with powerful features, Quark has brought the publishing process to the desktop.

But if the previous decade belonged primarily to Quark, this one surely belongs to XML, and Quark to XML conversion has become an issue.  While Quark remains the desktop package of choice, the Internet and the world of e-commerce are already dictating that markup languages, and XML in particular, become the de facto standard.

The Web has revolutionized information delivery, and the publishing industry has had to adapt quickly.  Naturally, many people who have already published books, journals, and technical documentation in Quark are now looking to convert these documents to XML.

Converting Quark to XML presents several challenges.  While formats like Interleaf and FrameMaker each support rich 'ASCII' formats that accurately represent the entire document, Quark does not.  In fact, Quark's native file format is proprietary.  The software can export to a format Quark calls 'XPress Tags', but only one story at a time.  Several commercial plug-ins attempt to export an entire document at once, yet getting all the stories out along with accurate graphics information can still prove difficult.  Our experience has shown us that either the information comes out incorrectly, or parts of it don't come out at all.  Neither outcome is acceptable, so manual intervention of some kind will be required.
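To make the story-by-story export concrete, here is a minimal sketch in Python of how several exported stories might be stitched into one XML tree.  It is not DCL's process.  It assumes a heavily simplified subset of the tagged-text syntax, in which a paragraph style is applied with an `@StyleName:` prefix and stays in effect until the next one, and in which everything in angle brackets is formatting that can be discarded.  The style names, the `STYLE_TO_ELEMENT` mapping, and the `document`, `story`, and `unmapped` element names are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mapping from Quark paragraph style names to XML element names.
STYLE_TO_ELEMENT = {
    "Chapter Title": "title",
    "Body Text": "para",
    "Caption": "caption",
}

# Assumed simplified syntax: "@Style Name:" applies a paragraph style,
# "@Style Name=..." defines one, and "<...>" is inline formatting.
STYLE_PREFIX = re.compile(r"^@([^:=<>]*):")
STYLE_DEFINITION = re.compile(r"^@[^:=<>]*=")
INLINE_TAG = re.compile(r"<[^>]*>")


def stories_to_xml(stories):
    """Combine tagged-text stories (one string each) into one XML tree."""
    root = ET.Element("document")
    for index, story_text in enumerate(stories, start=1):
        story = ET.SubElement(root, "story", id=str(index))
        current_style = ""
        for line in story_text.splitlines():
            if STYLE_DEFINITION.match(line):
                continue  # style definitions carry no document content
            match = STYLE_PREFIX.match(line)
            if match:
                current_style = match.group(1)
                line = line[match.end():]
            text = INLINE_TAG.sub("", line).strip()
            if not text:
                continue  # header lines and empty paragraphs
            element_name = STYLE_TO_ELEMENT.get(current_style, "unmapped")
            paragraph = ET.SubElement(story, element_name)
            if element_name == "unmapped":
                # Keep the style name so a person can review it later.
                paragraph.set("style", current_style)
            paragraph.text = text
    return root


if __name__ == "__main__":
    exported = [
        "@Chapter Title:Getting Started\n@Body Text:First <B>bold<B> paragraph.",
        "@Sidebar:A note in a separate story.",
    ]
    print(ET.tostring(stories_to_xml(exported), encoding="unicode"))
```

Paragraphs whose style has no mapping are kept as `unmapped` elements rather than dropped, which reflects the point above: whatever the software cannot place still has to be resolved by hand.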

But that’s not all you have to contend with.  Perhaps the biggest problem in going from Quark to XML is converting tables.  When you convert documents to XML, you’ll ideally want to convert every table in the source document to a true table structure (such as the HTML or CALS table model).  Unfortunately, the Quark program itself does not include a table editor.  To simulate tables, many people simply use tabs and frames to achieve the look.  This gets the job done for print purposes, but it’s not a great solution, and it can cause tremendous problems when you decide to put your materials on the web and need to convert.

In terms of conversion, it means that you don't necessarily know what is actually a table and what is not.  What looks like a table on the printed page may turn out to be nothing more than a bunch of text separated by tabs, spaces and forced spans.  And if the materials were authored or formatted by multiple users at multiple locations (and they frequently are), everyone will have made their own inconsistent decisions.  What you’re left with, effectively, is a “house of cards.”  If you’re building software to help automate a large conversion project, you’re stuck attempting to 'guess' at what is a table, as well as what the structure of that table really is.  The result?  While logic is king in the world of conversion programming, you’ll end up needing to apply that logic to files that were often formatted without any logic at all!

Over the years, DCL has built a suite of filters that help deal with these issues.  By analyzing the text and tab structures in the input file, along with the use of specific Quark style names, our process can get us much of the way to where we want to be.  Our methodology works quite well for simple and medium tables in Quark, but for things like complicated tables or badly styled materials, we’ve learned to anticipate post-software manual cleanup.
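As an illustration of what analyzing tab structures can look like, the following Python sketch flags runs of consecutive lines that all split into the same number of tab-separated columns and writes each run out as an HTML table.  It is a deliberately naive heuristic and not DCL's filters: the function names and the `min_rows` and `min_columns` thresholds are assumptions made for the example, and it ignores style names, spans and space-aligned columns entirely.

```python
from html import escape


def find_table_blocks(lines, min_rows=2, min_columns=2):
    """Return (start, end) index pairs for runs of lines that look tabular.

    A run is a sequence of consecutive lines that all split into the same
    number of tab-separated columns. 'end' is exclusive.
    """
    blocks = []
    start = None
    width = 0
    # The trailing empty string is a sentinel that closes any open run.
    for index, line in enumerate(list(lines) + [""]):
        columns = len(line.split("\t")) if line.strip() else 0
        if start is not None and columns == width:
            continue  # still inside the current run
        if start is not None:
            if index - start >= min_rows:
                blocks.append((start, index))
            start = None
        if columns >= min_columns:
            start, width = index, columns
    return blocks


def to_html_table(rows):
    """Render tab-separated rows as a plain HTML table."""
    out = ["<table>"]
    for row in rows:
        cells = "".join(
            f"<td>{escape(cell.strip())}</td>" for cell in row.split("\t")
        )
        out.append(f"  <tr>{cells}</tr>")
    out.append("</table>")
    return "\n".join(out)


if __name__ == "__main__":
    text = [
        "Parts list for the assembly:",
        "Part\tQty\tPrice",
        "Bolt\t10\t0.25",
        "Nut\t12\t0.10",
        "See the appendix for suppliers.",
    ]
    for start, end in find_table_blocks(text):
        print(to_html_table(text[start:end]))
```

A single tabbed line, or rows whose column counts differ because of a forced span, will slip past a rule this simple, which is exactly the kind of case where manual cleanup is still needed.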

It should also be noted that there are plug-ins to Quark (such as Tableworks) that let you build tables within Quark as a true table structure.  The advantage is that you’ll end up with more structured Quark files.  Unfortunately, these plug-ins still don't allow you to export the table structure, so even in these cases, you’ll end up playing the 'guessing' game.

The Last Word?

Quark has recently announced a tool (called Avenue) that attempts to export from Quark to XML.  I believe that this tool will be usable for very simple documents, or for ones that are particularly well styled in the input file.  Is it the ultimate solution?  Probably not.  Based on our experience with conversion from Quark, it’s still likely that tricky conversion features, such as cross-referencing, special characters, and tables, will be hard to handle with general-purpose tools and will still need a customized conversion.

Michael Gross
Director of Research & Development
Data Conversion Laboratory
Phone: 718-357-8700 x 236
Fax: 718-357-8776
mikegross@dclab.com

Copyright © 1997-2010  Data Conversion Laboratory, Inc. All rights reserved.