Mark Gross, President, Data Conversion Laboratory, appearing in Information Today
Publishers in all industries, especially those delivering scientific, technical, and medical information and products, feel pressure from customers to make content available anytime, anywhere, on any device. Publishers need to manage that pressure with a content strategy for developing materials from scratch, but they should also focus on a significant alternative: extracting extraordinary value from legacy content that exists in a variety of formats and locations.
It begins simply enough. Books, reports, papers, articles, and images are created to serve a purpose, using various tools, and stored in various places. Very little emphasis is placed on the structure and processes needed to store and maintain them. Before long, these collections grow to a mountainous size with no visible means of managing them at scale. Organizations that can make the leap to innovative technologies can mine new value from this legacy material.
However, some publishers mistakenly assume that if their content is already in an XML format, the conversion process will be easy and doable on their own. But even XML can be outdated or inconsistently produced, which complicates preparation for the conversion. A prime example: the Optical Society of America (OSA) decided it was time to convert its legacy journal content to digital formats, all the way back to volume one, issue one.
Going Back in Time for a Reusable Future
One might ask: Why convert the entire library? Is there a need to go back to journal articles from 1917? OSA wanted to offer members, researchers, and others around the world who read and cite the society's legacy journal material more robust, technologically sophisticated products. They partnered with DCL because we have years of experience working with a wide variety of XML standards and source formats, and of tackling conversion efforts with in-depth analysis, extensive planning, and strong quality control.
For OSA, DCL identified some quick wins with materials that were already in an older XML format and thus ready to map to the NLM 3.0 DTD, a publishing standard developed by the U.S. National Library of Medicine. Then we tackled the harder materials. DCL and OSA worked together to incorporate the OSA-provided rules into DCL's conversion software, which cleans up and normalizes content in the course of conversion. The process allowed OSA to build a series of derivative products such as the Optics Image Bank, which provides a visual search across all figures and images, including by figure caption, in-text reference, or related images.
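To make the idea of rule-driven mapping concrete, here is a minimal Python sketch of renaming legacy XML elements to a target tag set while normalizing whitespace. It is an illustration only, not DCL's conversion software: the legacy tag names and the `TAG_MAP` rules are hypothetical, and the target names are merely in the style of NLM tagging.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from legacy tag names to target tag names.
# Real rules would come from the publisher and the target DTD.
TAG_MAP = {
    "doc": "article",
    "heading": "article-title",
    "summary": "abstract",
    "para": "p",
}


def normalize_text(text):
    """Collapse runs of whitespace and trim; return None if nothing is left."""
    if text is None:
        return None
    collapsed = " ".join(text.split())
    return collapsed or None


def convert(element):
    """Recursively copy an element, renaming tags and normalizing text."""
    new = ET.Element(TAG_MAP.get(element.tag, element.tag), dict(element.attrib))
    new.text = normalize_text(element.text)
    for child in element:
        converted = convert(child)
        converted.tail = normalize_text(child.tail)
        new.append(converted)
    return new


# A made-up legacy fragment with inconsistent whitespace.
legacy = (
    "<doc><heading>  Optical   glass </heading>"
    "<summary><para>Early\n results.</para></summary></doc>"
)
result = ET.tostring(convert(ET.fromstring(legacy)), encoding="unicode")
print(result)
```

Unmapped tags pass through unchanged, which makes gaps in the rule set easy to spot in the output. Trimming text this way would drop spaces around inline elements, so a production converter would need more careful handling of mixed content.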
The results of this project are emblematic of why converting to XML is an efficient means to repurpose and reuse dormant content: within three years, a hundred years' worth of content had been converted into multiple well-received new offerings on the Optics InfoBase platform.
Learning from the Best: Key Benefits You Can Implement, Too
OSA implemented a highly successful content conversion effort, and it can serve as a guide to other organizations that recognize the value in legacy content, the increasing demands of users to access high-quality content on any device, and the complexities of managing large-scale conversions.
The audiences of most publishers still want to access materials from:
- databases full of citations, abstracts, and references
- printed catalogs, forms, newsletters, and journals containing images of parts or graphs, and text-based descriptions of products and services
- online servers full of videos, process documentation, financial information, and employee records
- learning management systems (LMS) containing online and instructor-led training materials in web and presentation files
Fortunately, with modern technologies, you can reengineer this content to create new assets from the same source material. Here’s a look at some of the key benefits you gain by producing innovative content products.
Deliver High-Quality Content
The goal of delivering high-quality content is not (or should not be) new to anyone, but an innovative approach brings old, yet still valuable, legacy content into modern contexts, with quality levels higher than managers expect. Automation software does much of the heavy lifting, so staff spend less time on manual quality assurance and fewer errors are introduced through manual intervention.
Replicate the Solution
A phased approach applies upfront analysis, automation, and iterative corrections to the process for converting legacy content. Then you can harness it to process, update, create, and publish all types of content across all kinds of channels. Such an implementation lets you take content from any format to another without trapping it in a specific software application or tying it to the formatting elements of one type of publication output.
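As a rough illustration of keeping content independent of any one output format, the Python sketch below renders the same structured source to two channels. The sample document, its element names, and the two renderers are assumptions made for this example, not part of any specific product or workflow.

```python
import xml.etree.ElementTree as ET
from html import escape

# Made-up structured source; it carries no formatting instructions.
SOURCE = (
    "<article><article-title>Lens design &amp; testing</article-title>"
    "<p>First paragraph.</p><p>Second paragraph.</p></article>"
)


def to_html(root):
    """Render the article as a simple HTML fragment."""
    title = escape(root.findtext("article-title", default=""))
    paragraphs = "".join(f"<p>{escape(p.text or '')}</p>" for p in root.iter("p"))
    return f"<h1>{title}</h1>{paragraphs}"


def to_plain_text(root):
    """Render the same article as plain text with an uppercase title."""
    title = root.findtext("article-title", default="")
    lines = [title.upper(), ""]
    lines.extend(p.text or "" for p in root.iter("p"))
    return "\n".join(lines)


root = ET.fromstring(SOURCE)
html_output = to_html(root)
text_output = to_plain_text(root)
print(html_output)
print(text_output)
```

Because presentation decisions live in the renderers rather than in the source, adding another channel means writing another renderer, not reconverting the content.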
Ease the Authoring Process
The strategy for converting content also filters into the content authoring environment, streamlining the processes to create high-quality content from the start. Apply standards and best practices from the initial conversion to author, store, edit, and publish content with a better understanding of workflow, consistency, and standards.
No Down Time
With the growing market for mobile devices and users demanding consistently positive experiences with content across multiple devices, organizations are challenged to make content available everywhere, all the time. Incorporating responsive web design and HTML5 into content processing speeds the compilation of legacy content into new assets. Automatic conversion of the content helps organize the collections and prepare materials to be published to multiple channels.
Mark Gross, President of Data Conversion Laboratory (DCL), is a recognized authority on XML implementation and document conversion. Prior to joining DCL in 1981, Mark was with the consulting practice of Arthur Young & Co. Mark has a BS in Engineering from Columbia University and an MBA from New York University. He has also taught at the New York University Graduate School of Business, the New School, and Pace University. He is a frequent speaker and writer on the topic of automated conversions to XML.