Quark to XML Conversion
If the previous decade belonged primarily to Quark, this one surely belongs to XML, and Quark to XML conversion has become an issue. While Quark remains the desktop package of choice, the Internet and the world of e-commerce are already dictating that mark-up languages, and XML in particular, become the de facto standard. The Web has revolutionized information delivery, and the publishing industry has had to adapt quickly. Naturally, many people who have already published books, journals, and technical documentation in Quark are now looking to convert these documents to XML.

Converting Quark to XML presents several challenges. While formats like Interleaf and FrameMaker each support rich 'ASCII' formats that accurately represent the entire document, Quark does not. In fact, Quark's native file format is proprietary. The software can export to a format Quark calls 'XPress Tags', but only one story at a time. Several commercial plug-ins attempt to export an entire document at once, but getting all the stories out along with accurate graphics information can still prove difficult. Our experience has shown that either the information comes out incorrectly or parts of it do not come out at all. Either way, the result is not acceptable, and manual intervention of some kind will be required.

But that's not all you have to contend with. Perhaps the biggest problem with going from Quark to XML is converting tables. When you convert documents to XML, you'll ideally want to convert all tables in the source document to a table structure (such as the HTML or CALS table model). Unfortunately, the Quark program itself does not include a table editor. To simulate tables, many people simply use tabs and frames to achieve that look. This gets the job done for print purposes, but it's not really a great solution.
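
To make the tab-simulated table problem concrete, here is a minimal Python sketch of one possible heuristic: treat a run of consecutive lines with the same number of tab-separated cells as a candidate table and emit it as an HTML table. This is an illustration only, not DCL's filters; the function names and the sample story text are hypothetical, and the input is assumed to be plain text lines already extracted from a Quark story.

```python
from html import escape
from typing import List


def split_cells(line: str) -> List[str]:
    """Split one exported text line into cells on tab characters."""
    return [cell.strip() for cell in line.rstrip("\n").split("\t")]


def find_candidate_tables(lines: List[str], min_rows: int = 2) -> List[List[List[str]]]:
    """Group consecutive lines that share the same tab-separated cell count.

    A run of min_rows or more such lines is treated as a *candidate* table.
    This is only a guess: tabbed lists or indented text can match as well.
    """
    tables, run = [], []
    for line in lines:
        cells = split_cells(line)
        is_tabbed = len(cells) > 1
        if is_tabbed and (not run or len(cells) == len(run[0])):
            run.append(cells)
            continue
        # The run is broken: keep it only if it is long enough.
        if len(run) >= min_rows:
            tables.append(run)
        run = [cells] if is_tabbed else []
    if len(run) >= min_rows:
        tables.append(run)
    return tables


def to_html_table(rows: List[List[str]]) -> str:
    """Render rows of cells as a bare HTML table, escaping cell text."""
    body = "\n".join(
        "  <tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>\n{body}\n</table>"


# Hypothetical story text: a heading, three tabbed lines, and a note.
story = [
    "Shipping rates",
    "Zone\tWeight\tPrice",
    "East\t1 lb\t$4.00",
    "West\t1 lb\t$5.50",
    "Rates are subject to change.",
]
tables = find_candidate_tables(story)
print(to_html_table(tables[0]))
```

The sketch deliberately ignores spans, header rows, and cells padded with spaces instead of tabs. Those are exactly the cases where a guess like this breaks down and manual review is still needed.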

Tab-simulated tables can cause tremendous problems when you decide to put your materials on the web and need to convert. In terms of conversion, you don't necessarily know what is actually a table and what is not. What looks like one on the printed page may turn out to be nothing more than a bunch of text separated by tabs, spaces and forced spans. And if the materials were authored or formatted by multiple users at multiple locations (and they frequently are), everyone will have been making their own inconsistent decisions. What you're left with, effectively, is a "house of cards." And if you're building software to help automate a large conversion project, you're stuck attempting to 'guess' at what is a table, as well as what the structure of that table really is. The result? While logic is king in the world of conversion programming, you'll end up needing to apply that logic to files that were often formatted without any logic at all!

Over the years, DCL has built a suite of filters that help deal with these issues. By analyzing the text and tab structures in the input file, along with the use of specific Quark style names, our process gets us much of the way to where we want to be. Our methodology works quite well for simple and moderately complex tables in Quark, but for complicated tables or badly styled materials, we've learned to anticipate manual cleanup after the software has run.

It should also be noted that there are plug-ins for Quark (such as Tableworks) that let you build tables within Quark as a true table structure. The advantage is that you end up with more structured Quark files. Unfortunately, these plug-ins still don't allow you to export the table structure, so even in these cases you end up playing the 'guessing' game.

The Last Word?

Quark has recently announced a tool (called Avenue) that attempts to export from Quark to XML. I believe that this tool will be usable for very simple documents, or ones that are particularly well styled in the input file. Is it the ultimate solution? Probably not. Based on our experience with conversion from Quark, it's still likely that tricky conversion features, such as cross-references, special characters, and tables, will be hard to handle with general-purpose tools and will still need a customized conversion.

Michael Gross
Data Conversion Laboratory, Inc. 61-18 190th St., 2nd Floor, Fresh Meadows, NY 11365 718-357-8700 convert@dclab.com Copyright © 1997-2008 Data Conversion Laboratory, Inc. All rights reserved.