Professional Data and Document Management Is Not an Impossible Dream

Mark Gross, President, Data Conversion Laboratory, appearing in Aerospace Manufacturing and Design

How small and mid-size manufacturers can leverage structured content throughout the enterprise

When faced with thousands, sometimes millions, of pages of data to support manufacturing processes and operations, compliance requirements, and the needs of a complex supply chain, getting that information to the right places efficiently and accurately is difficult.

Digitizing and efficiently managing large volumes of organizational content is typically thought to be the province of large multi-nationals with massive budgets and armies of support staff, but it doesn’t have to be this way.

Converting and maintaining legacy content doesn’t have to be impossibly complex or out of reach for the average manufacturer. But it does require a realistic approach that contains four key elements:

  • Thorough analysis of existing information types, outputs, and workflows
  • Selection of the right standard and conversion framework
  • Focus on normalizing inconsistent data
  • Finding the right partners to help manage the process from end-to-end

Structured content

If your company is like the average manufacturer, you have vast quantities of legacy, unstructured content, stored in both physical and electronic forms across many locations, including file cabinets, bookshelves, servers, individual desktop computers, and mobile devices. Information stored in loose-leaf binders and manila file folders may have some sense of permanence and structure, but it’s hard to store, maintain, or reuse.

During the past decades, organizations moved from printing documentation to word processing to desktop publishing – incorporating processed text and electronic images. At the same time, the storage methods moved from magnetic tapes to floppy disks to CDs and thumb drives, optical storage, and now the cloud.

Dealing with today’s constantly changing regulations, evolving standards, and communications along a global supply chain requires digital content that can be repurposed, transmitted, and published according to very specific needs. Structured content helps meet these challenges by enforcing standardization of data separately from the presentation of that data. That means faster turnaround, fewer errors, and improved data exchange between systems.

Many forms, ways to store them

Companies need to digitize the physical forms, and in the process convert electronic files to structured content conforming to a defined standard, such as XML, independent of both formatting and storage technology, both of which change frequently.

XML (eXtensible Markup Language) provides the means to represent text and other types of content as tagged data that is independent of formatting restrictions and conforms to a predefined, standardized structure.
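To make the idea of tagged data that is independent of formatting concrete, here is a minimal sketch in Python. The `part` record and its element and attribute names are hypothetical and are not taken from DITA or S1000D; the point is only that one tagged record can feed more than one presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical part record; the element and attribute names are illustrative
# and are not drawn from DITA, S1000D, or any other published standard.
PART_XML = """
<part number="AB-1042">
  <name>Hydraulic fitting</name>
  <material>Titanium</material>
</part>
""".strip()


def to_plain_text(part):
    """Render the record as a one-line catalog entry."""
    return f"{part.get('number')}: {part.findtext('name')} ({part.findtext('material')})"


def to_html_row(part):
    """Render the same record as an HTML table row."""
    cells = [part.get("number"), part.findtext("name"), part.findtext("material")]
    return "<tr>" + "".join(f"<td>{cell}</td>" for cell in cells) + "</tr>"


part = ET.fromstring(PART_XML)
print(to_plain_text(part))
print(to_html_row(part))
```

The data is stored once, and each output format is just a different function applied to it. Changing the look of the catalog does not require touching the record.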

A number of standards have evolved in the last two decades to define the structure of this information in ways that allow it to be transformed into outputs and formats for consumption by the intended audience. Darwin Information Typing Architecture (DITA) and S1000D are just two standards that leverage XML for a variety of industries and are of particular interest for aerospace manufacturing and design.

Organizations in a wide variety of industries are benefiting from converting their legacy content to structured, modular content that can be repurposed for multiple needs, and maintained more cost effectively. Technology companies such as software developers and hardware suppliers use structured content to manage technical documentation, knowledge bases, and training materials. Publishers manage periodicals, books, and citations. Pharmaceutical and medical device manufacturers maintain critical compliance documentation, labeling, and product specifications, as do many large, and not-so-large, aircraft equipment manufacturers and suppliers.

[Chart: Challenges to Aerospace Firms]

Structured content

As complex as the challenges are in maintaining content now, the complexity and costs of digitizing and structuring content can seem even more daunting. However, even companies with limited budgets and small staffs can embark on a conversion effort, as long as they have a thorough plan and the right team to support them. Start with an understanding of the processes.

In past years, one of the difficulties was the unavailability of content management systems suitable for small and mid-size companies. Today, a number of lower-cost alternatives exist that can work for smaller installations. Further, both DITA and S1000D have matured as global standards and no longer need the same level of customization as in years past.

The other stumbling block has been the conversion of a company’s existing information into suitably structured content. So let’s focus on these conversion issues and how small- and mid-size firms can resolve them.

The process

The typical content conversion initiative has two stages: engineering (the process design) and implementation (the actual conversion process). The specifics vary by industry, business objectives, and the amount and type of content to be converted, but almost all conversion efforts will include some version of the following milestones or phases.


Deep-dive data analysis. Uncover critical information about your existing content. What types of content do you have, how much is mission-critical, and what is the state of the content? Perhaps you maintain a several-thousand-page parts catalog online with part numbers that vary by customer. Or you have volumes of work instructions and training materials that must be updated to reflect change packages or new software implementations. Compliance files may be physical copies for one agency and electronic files for another. It’s critical at this stage to identify what you have and what makes sense to convert.

Conversion specification. Remember, the specifics of your effort will differ greatly from other firms. The specification lays out the framework for how data will be captured, converted, cleaned, and tagged, including which XML standard you will use. Sometimes you may have some modifications to an existing standard to address particular needs. The conversion specification details how precisely information is to be converted to the standard.

Hand-tagged sample. Think of this as an initial proof-of-concept to ensure that the overall conversion process is workable, and that conversion specifications are properly understood. Hand tagging ensures that you address key requirements.

Software tools selection and customization. The conversion effort uses one or more software tools to extract and normalize the data according to the standard selected. Sometimes these tools need to be customized to handle specific conditions or attributes, so plan accordingly, and work with a partner with the expertise and experience that can make a difference.

Proof of production sample. Before moving into full production, you’ll need to plan a pilot that allows you to test conversion of representative content types, such as part numbers and descriptions, or work instructions. This step tests the conversion process on a larger scale and confirms that the vast majority of cases are handled properly.

Hot list creation. Based on the results of the data analysis and software development, you’ll create a Hot List of items that may require special attention during the Quality Assurance phase of the production process.
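As an illustration of how normalization and a hot list can work together, the sketch below cleans up inconsistently keyed part numbers and sets aside any value it cannot bring into a target format. The format itself (two letters, a hyphen, four digits) and the function names are assumptions made for this example, not part of any standard or of a specific conversion tool.

```python
import re

# Assumed target format for this sketch: two letters, a hyphen, four digits.
CANONICAL = re.compile(r"[A-Z]{2}-\d{4}")


def normalize_part_number(raw):
    """Uppercase, trim, and unify separators, e.g. 'ab 1042' becomes 'AB-1042'."""
    cleaned = re.sub(r"[\s_./-]+", "-", raw.strip().upper())
    # Insert the hyphen when the separator is missing entirely, e.g. 'AB1042'.
    return re.sub(r"^([A-Z]{2})(\d{4})$", r"\1-\2", cleaned)


def convert(raw_numbers):
    """Return (normalized values, hot list of raw inputs needing manual review)."""
    normalized, hot_list = [], []
    for raw in raw_numbers:
        value = normalize_part_number(raw)
        if CANONICAL.fullmatch(value):
            normalized.append(value)
        else:
            hot_list.append(raw)  # keep the original text for the reviewer
    return normalized, hot_list


normalized, hot_list = convert(["ab 1042", "AB_1042", "cd2210", "1042-X?"])
print(normalized)
print(hot_list)
```

Values that the rules cannot repair are not guessed at. They go to the hot list in their original form so that a person can review them during quality assurance.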

Production ramp-up. The project then enters the production phase and an orderly ramp-up to the planned volume level begins. Once this phase is complete, the infrastructure will be in place to convert large volumes of data in a consistent manner within a very short timeframe. Throughout this process, the vendor should actively solicit and encourage ongoing client feedback, so that needed adjustments can be made quickly and efficiently.

Implementation & production

Low-quality data that hasn’t been properly normalized defeats the whole purpose of conversion, so the move to live production should focus on quality control both in the conversion process and in post-conversion validation and review.

A well-thought-out implementation should accommodate some level of pre-conversion preparation, since the cleaner the data at the start, the easier the conversion process becomes. The implementation steps include:

  • Document workflow configuration, to set up the optimal conversion workflow for each type of content to be converted
  • Data capture (from scans, document metadata, etc.)
  • Pre-tagging content to provide instructions that improve the conversion
  • Automated conversion, which includes at least three quality check steps:
    • Validation confirms that the data conforms to the standard in use (minimizing formatting problems), that it adheres to the business rules, and that the tags applied are correct
    • Editorial review checks the converted content for both accuracy and proper tagging
    • Final quality assurance (QA) is a last once-over that combines automated checks with random testing of results
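The validation step can be sketched in a few lines of Python. This is a simplified illustration, not a substitute for validating against the schema of the chosen standard: Python's built-in `xml.etree.ElementTree` checks only that the XML is well-formed, so the structural and business rules below are hand-written assumptions, as are the `part` element and the part-number pattern.

```python
import re
import xml.etree.ElementTree as ET

PART_NUMBER = re.compile(r"[A-Z]{2}-\d{4}")  # assumed business rule
REQUIRED_CHILDREN = ("name", "material")      # assumed structural rule


def validate(xml_text):
    """Return a list of problems; an empty list means the record passed."""
    try:
        part = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    if part.tag != "part":
        problems.append(f"unexpected root element <{part.tag}>")
    if not PART_NUMBER.fullmatch(part.get("number", "")):
        problems.append("part number missing or malformed")
    for child in REQUIRED_CHILDREN:
        if not (part.findtext(child) or "").strip():
            problems.append(f"missing <{child}>")
    return problems


good = '<part number="AB-1042"><name>Fitting</name><material>Titanium</material></part>'
bad = '<part number="1042"><name>Fitting</name></part>'
print(validate(good))
print(validate(bad))
```

Records that return an empty list can move on to editorial review, while the rest can be routed back for correction with a specific reason attached.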

[Chart: DCL Software Model]

Benefits from converting

Aerospace manufacturers of all sizes have invested a lot in their content assets – from the parts list to the engineering schematic – and the benefits of converting to structured content will become evident from the earliest pilot effort:

  • Normalized, standardized data used throughout the enterprise
    • Supply chain interactions and compliance submissions are faster and less error-prone
    • Higher quality and more consistent content that increases customer satisfaction
    • Improved ability to search for relevant content
  • Single source of modular reusable content
    • One piece of content can be reused in multiple outputs
    • One piece of content can be updated once and published to multiple files
    • Logical business rules can restrict or allow display of data according to specific conditions
  • Content management is faster and less expensive
    • Automated workflows allow compiling content into documents quickly
    • Content authoring and publishing resources can be reallocated elsewhere
  • Automated quality control
    • Reduced time to market for content
    • Greater overall content production capacity
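The single-source and business-rule benefits listed above can be illustrated with a short Python sketch. It assumes a hypothetical `procedure` document in which some steps carry an `audience` attribute; the element names and the filtering rule are chosen for this example only. One source file then yields a different output for each audience.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-source procedure. Steps without an audience attribute
# are shared; steps with one appear only in the matching output.
SOURCE = """
<procedure>
  <step>Remove the access panel.</step>
  <step audience="technician">Torque the fitting to the value in the work order.</step>
  <step audience="customer">Contact an authorized service center for adjustment.</step>
  <step>Reinstall the access panel.</step>
</procedure>
""".strip()


def publish(xml_text, audience):
    """Return the steps that apply to the given audience, in document order."""
    root = ET.fromstring(xml_text)
    return [
        step.text
        for step in root.findall("step")
        if step.get("audience") in (None, audience)
    ]


for reader in ("technician", "customer"):
    print(reader.upper())
    for number, text in enumerate(publish(SOURCE, reader), start=1):
        print(f"  {number}. {text}")
```

A correction to a shared step, such as the first one, is made once in the source and appears in both outputs the next time they are published.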

Today, even smaller firms can dream big, of eliminating miles of filing cabinets, of timely paperwork submissions, and of useful, relevant content accessible to customers, employees, and supply chain partners on any device in any output. It begins with understanding your content assets, a robust plan for converting them, and the right partner to make the conversion happen.


Read the entire article on Aerospace Manufacturing and Design