Automating Complex High-Volume Technical Paper and Journal Article Page Composition with NLM XML and InDesign

Co-Authored by:
SAE International:
Becky Fadik, Business Unit Leader, Content Management
Data Conversion Laboratory:
Mark Gross, President
Brian Trombley, National Sales Director

SAE International organizes and manages industry conferences where thousands of technical papers and journal articles are presented as part of the conference programs. The technical papers and journal articles are reviewed for compliance with SAE publishing requirements, published for print, and made available online in a very short time frame. This paper describes how SAE, incorporating automation expertise from Data Conversion Laboratory, evolved the production cycle from a less-than-efficient XSL-FO based process to a highly automated process leveraging NLM XML, XSLT, and Adobe InDesign, resulting in productivity gains and higher quality output. It traces the evolution of the project to its current success and discusses future enhancements aimed at driving additional benefits.

Introduction

SAE International is the ultimate knowledge source for mobility engineering, uniting the world’s largest network of engineers, experts, and students to set the standards that move industries forward and foster a lifetime of learning and connecting.

As the vehicle-producing industry’s single source for industry-driven, voluntary, consensus-based standards development, SAE maintains a standards repository of nearly 35,000 documents.

Supporting a lifetime of learning and connecting, one of SAE’s key building blocks is its conference program. SAE organizes and/or administers more than 27 international meetings and exhibitions each year covering all aspects of technology related to design, manufacture, and service lifecycle for automotive, aerospace, off-highway, and other related mobility industries. Nearly 3,000 technical papers and journal articles are produced from these events.

As a result, SAE offers the world’s largest repository of mobility engineering information. The most up-to-date information available on the latest engineering and technology news and trends for automotive, aerospace, truck and bus, and off-highway vehicles can be accessed whenever and however it is needed. Creating this repository is the result of forward thinking and the need to embrace change.

Strategic Initiative for Change

The last decade was one of profound change, which forced our organization to think and act differently. Addressing SAE’s long-term sustainability was vital in a time when the industries we serve were facing overwhelming challenges to remain viable. These challenges, together with SAE’s maturing product portfolio, an outdated publishing model, the accelerating pace of technology, and growing competition in our market space with organizations representing for-profits and start-ups, served as a “call to action.” Short-term sustainability was achieved through cost restructuring and portfolio balancing. Long-term sustainability needed to be built on the fundamental cornerstone of SAE – supplying information, tools, and technical know-how to help today’s professionals do their jobs better. SAE embraced this time of change to capitalize on new opportunities to reposition itself for growth by strategically adopting a new approach to managing information.

In July 2008, the SAE International Board of Directors approved funding to develop a systemic approach to acquire, manage, and control the organization’s technical information and intellectual property. Development of a robust content management strategy is transforming how we approach managing and delivering information.

Business Challenges and Goals

The action in 2008 launched SAE’s transition, in 2010, from a print-based publishing model to a digital publishing environment. It became evident that for the transition to be successful, changes to the infrastructure, people, processes, procedures, and product needed to occur.

Key objectives for this initiative:

  1. Cost reduction
  2. Rapid deployment of content
  3. Improvement of output consistency and quality

Successfully implementing a new system with the goal of continually improving production efficiency and publishing quality of SAE technical papers and journal articles set the stage for the significant progress made in the last five years.

Embracing Change

2010 – A New Era

The content management initiative at SAE International was in full swing. Our content management system, custom-built for our specific requirements, propelled us into a new era of publishing. The selected XML tagging system was developed using JATS. Our first publishing tool was selected and the XSL-FO (Extensible Stylesheet Language Formatting Objects) stylesheet was introduced. A new team formed and initial workflows were defined and documented. Author resources, provided to assist authors in understanding the impact this initiative would have on how individual authors prepared and submitted manuscripts, aligned with the new process.

We learned many lessons during the first year, identifying over 100 issues. The two most significant quality issues involved excessive white space and the placement of figures. Because functionality was limited in both our publishing software and the XSL-FO, we could not manipulate the text and figures for a streamlined look. White space could, at times, fill up to half of a column. Large figures (those spanning both columns of the two-column layout) had to be placed at the back of the document, which ultimately led to placing all figures at the back. The quality of the papers suffered, and authors as well as customers were not pleased with the result. This was not the outcome we expected.

In addition, multiple footers were required based on the copyright associated with the paper. Each paper had to be reviewed at the time of layout to determine which stylesheet to use, which added time to the publishing process and opened us up to error. During 2010, all approved manuscript submissions were formatted as technical papers. Journal selection did not occur until late in the process. When a technical paper was selected for a journal, the paper had to be reformatted using the journal stylesheet, with the choice of the correct stylesheet again based on the copyright.

2011 – Incremental Change

We implemented several quality-focused improvements to the publishing process. Changes were made to the XSL-FO, and publishing software upgrades were implemented. Realizing that even an automated publishing process needs some operator intervention, we tackled the most glaring weaknesses first: white space was reduced significantly, and the placement of figures was dramatically improved. Because we maintained multiple stylesheets, every change had to be incorporated into all of them. The requirement to publish all papers as technical papers and then republish those selected for journals also remained. One positive outcome of our efforts and lessons learned was better instructions to our authors, which further improved the new publishing workflow.

It was also during this time that we recognized that transforming our publishing model from print-based to digital would take time. Although the delivery of our paper product line is mostly electronic, the appearance of the “printed page” is still important to our customers, making our challenge to improve output consistency and quality even more essential.

2012 – Incremental Change

We continued with our XSL-FO stylesheet and the issues inherent in its use. We did find relief when a change to the journal selection process was piloted in early 2012. The change in process allowed us to publish a paper once, as either a technical paper or a journal article. This improved our efficiency somewhat, but we still faced the fact that there were multiple stylesheets for each type of paper.

Continuing to recognize the limitations of our publishing software, combined with the chosen tagging of our XML, and the continued focus on a “printed page” appearance, we needed to explore options to provide us with stronger publishing capabilities and greater flexibility to improve our production efficiency and improve the quality of the published document.

Several options were explored to determine the best course of action. We quickly recognized the value of InDesign and the flexibility it would allow us to better control the output. An early lesson learned was the need to use experts for stylesheet development. Stylesheet development requires specific skills. The XSL-FO was developed using internal resources knowledgeable in stylesheet development. This proved to be beneficial initially as changes could be made within a very short period of time, allowing us to keep pace with our production volume. When the in-house resource was redeployed, this created a skill gap, which impacted future changes to our stylesheets as well as any new development needed. Based on this experience, we elected to go outside of our organization for this project. We chose a vendor to develop an XML transformation process as well as a technical paper stylesheet. The vendor selected was an ultimate expert in the field of InDesign use, but not necessarily an expert in stylesheet development.

2013 – Incremental Change → Revolutionary Change

During the first three quarters, we gradually introduced InDesign into our workflow with two of the five production specialists working exclusively with InDesign, with the plan to migrate all specialists to InDesign by year end. This phased approach resulted in maintaining a dual system – InDesign and our original publishing software. This dual system presented several challenges:

  1. Tracking and revising multiple stylesheet and XSL-FO pieces – when a change was made to one, that change had to trickle down to all – and each change required its own set of UAT (user acceptance testing) sessions.
  2. Storing multiple final deliverable source files.
  3. Only technical papers could be published using InDesign. The stylesheet for journal articles was not part of the initial project.
  4. Maintaining a consistent output – the documents published with InDesign were of a much higher quality.

Our initial efforts with InDesign did not yield the expected efficiency gains and cost savings. With our World Congress event, which produces the majority of the papers we acquire annually, our expectation was to reduce direct expenses by 10%, with 90% of our publishing completed by the six-week mark. We achieved a 4% reduction in direct costs, with 73% of our publishing completed within six weeks. A different approach was needed to maximize our use of InDesign. We understood the potential and needed to fully utilize it to meet our goals.

The last three months of 2013 were the most dynamic in our paper-publishing history when we focused on making InDesign our exclusive publishing tool and creating a new workflow. All team members received InDesign training. Learning from previous experience, our vendor search focused on finding an expert in the area of stylesheet development, InDesign, and workflow improvement. Through our vendor selection process, we were introduced to Data Conversion Laboratory, Inc. (DCL). We engaged DCL to partner with us to develop the necessary tools and workflow based on a previous experience with them, recommendations from several of their customers, and a proven track record in the industry.

In November 2013, the real work began. The SAE/DCL team, led by Mandy May on the SAE side and Ed Zeitz on the DCL side, worked closely to initially define and subsequently refine the scope of work to ensure a complete understanding of the project. This initial step proved beneficial as we quickly recognized the value each of us brought to the project. This also served as the first step toward solidifying a very strong partnership.

To kick-start the development, DCL reviewed and analyzed our stylesheet, treating the issues we had identified as potential improvement areas, with a focus on minimizing the number of manual interventions our current stylesheet required. A key goal of the project was to automate as much as possible of the transformation from NLM XML into a completed InDesign document.

Time was of the essence. The tool and improved workflow process needed to be in place in 10 weeks. To meet the tight time frame, an agile development methodology was employed. During the agile development cycle, we experienced evolutionary development and continuous improvement along with rapid and flexible response to change. On a daily basis, the SAE/DCL team cycled repeatedly through development, testing, and feedback. SAE’s dedicated resource served as the project manager, monitoring and reporting progress, tracking issues and ensuring resolution, and serving as a focal point for testing and feedback. DCL provided expertise in stylesheet development, InDesign, and workflow improvement. DCL also brought new ideas that enhanced the process beyond the capabilities the SAE team had previously envisioned. The key enhancements centered on extending automation to parts of the page building process we had thought would require manual intervention. Some examples include:

  • Managing all text formatting and manipulation within the XSL, which reduced the need to apply formatting after the InDesign XML import.
  • Consolidating link handling within the AppleScript import script, which minimized the user steps needed to create the document; imports, links, footnotes, and the "notch box" were invoked in a single step.
  • Supporting both document types with a single template, which automated the process.
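
The first enhancement above, resolving formatting before the import, can be illustrated with a minimal sketch. It is written in Python rather than the project's XSL, and it is not the SAE/DCL stylesheet. It assumes that nested `<sec>` elements in the NLM XML are flattened into one element per paragraph, and that each output tag name (`Story`, `Head1`, `Head2`, `Body`, all hypothetical) would be mapped to a paragraph style on import.

```python
import xml.etree.ElementTree as ET


def flatten(element, depth, story):
    """Walk nested <sec> elements and append one flat child per paragraph."""
    for child in element:
        if child.tag == "sec":
            # A nested section raises the heading level by one.
            flatten(child, depth + 1, story)
        elif child.tag == "title":
            # Hypothetical tag names Head1, Head2, ... encode the heading level.
            heading = ET.SubElement(story, f"Head{depth}")
            heading.text = "".join(child.itertext()).strip()
        elif child.tag == "p":
            # Inline markup such as <italic> is dropped here to keep the sketch short.
            body = ET.SubElement(story, "Body")
            body.text = "".join(child.itertext()).strip()


def nlm_to_story(nlm_xml):
    """Return a flat <Story> element built from the <body> of an NLM article."""
    article = ET.fromstring(nlm_xml)
    story = ET.Element("Story")  # hypothetical root tag for the import file
    body = article.find("body")
    if body is not None:
        flatten(body, 0, story)
    return story


SAMPLE = """<article><body>
<sec><title>Introduction</title><p>Some <italic>text</italic>.</p>
<sec><title>Background</title><p>More text.</p></sec>
</sec></body></article>"""

story = nlm_to_story(SAMPLE)
tags = [(el.tag, el.text) for el in story]
```

Because the heading level is already encoded in the tag name, no per-paragraph formatting decision would be left for the operator after import, which is the kind of saving the bullet describes.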

While this upfront design effort was somewhat larger than anticipated, it was well worth it. DCL’s expertise combined the best of both the traditional composition and desktop worlds, allowing us to automate the processing and still return an InDesign file that enables manual WYSIWYG interactions when appropriate.

This was a turning point: we realized we were well on our way to a process that was light-years ahead of where we had started.

2014 – Year Five – Transformation

The tool and workflow developed through a collaborative effort provide the flexibility needed to significantly improve the consistency and quality of the output - spacing, figure placement, hyperlinks, and heading levels. In 2014, we achieved an additional 4% reduction in direct costs, with 89% of our publishing completed by the six-week mark.

Fig. 1 is an example of the XML transformation process and Fig. 2 is an example of an InDesign File created using the new tool/workflow.

Fig. 1 - XML transformation process

Fig. 2 - InDesign file created by new process

In the process of moving to automated publishing, it’s the elements unique to you that are going to be hard to reconcile. One such component of our technical paper stylesheet is the “notch box” containing the article’s metadata components (see Fig. 3a–3c). This box, considered an essential part of SAE’s brand, has caused difficulty across the organization in maintaining compliance. The key feature we require, not so difficult when an artist drew each box, is that the angle of the notch stay constant regardless of how much content is inserted. Considering that the author area may include one, two, three, or more authors, with their affiliations, the need to expand the box without changing the angle is a technical challenge.

Fig. 3a - Notched box – one author

Fig. 3b - Notched box – three authors with two affiliations

Fig. 3c - Notched box – four authors with three affiliations

The notch box requires the creation of three cells with consistent borders whose vertical height adjusts to the text in each cell. A standard InDesign table could meet these requirements were it not for the notch, which substantially complicates the building of the box structure. The outer border must be a single continuous path to achieve proper “joins” in the PostScript printed output, and this can only be achieved by applying a custom corner script.

Since the solution includes a scripted import using the XML filter to make the user set-up consistent, the corner script is applied after the XML import, once the table and box layout are determined and the box’s vertical size is set. The script can also be re-run to reset the notch if the box height changes after import, since such a change distorts the notch.

The new process allows us to fully comply with this brand element.
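
The geometry behind the constant-angle requirement can be sketched briefly. The following Python fragment is an illustration only, not the corner script used in production. It assumes the notch is a diagonal cut across the top-right corner with fixed horizontal and vertical legs; the position, the 45-degree angle, and the `notch_leg` value are all hypothetical.

```python
import math


def notch_box_path(width, height, notch_leg=12.0):
    """Return the outline as one continuous path, clockwise from top-left.

    The notch legs are fixed lengths, so they do not grow with the box.
    """
    if notch_leg > min(width, height):
        raise ValueError("notch does not fit inside the box")
    return [
        (0.0, 0.0),                # top-left
        (width - notch_leg, 0.0),  # start of the notch on the top edge
        (width, notch_leg),        # end of the notch on the right edge
        (width, height),           # bottom-right
        (0.0, height),             # bottom-left
    ]


def notch_angle_degrees(path):
    """Angle of the notch segment measured from the horizontal."""
    (x1, y1), (x2, y2) = path[1], path[2]
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


def stretch_vertically(path, factor):
    """Naive resize: scale every y coordinate, as a simple stretch would."""
    return [(x, y * factor) for x, y in path]


one_author = notch_box_path(200.0, 60.0)
four_authors = notch_box_path(200.0, 140.0)
stretched = stretch_vertically(one_author, 2.0)

angle_rebuilt = notch_angle_degrees(four_authors)
angle_stretched = notch_angle_degrees(stretched)
```

Rebuilding the path for the new height keeps the notch at the same angle, while stretching an existing outline steepens it. This is one way to see why the notch needs to be reset whenever the box height changes after import.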

Other major outcomes of the new workflow, which address rapid deployment of content and cost reduction, include:

  1. Batch processing 30 papers at a time. Previously, it was a one-paper-at-a-time workflow.
  2. Automated inclusion of the final footer, which has six variations based on the type of copyright.
  3. Automated creation of a technical paper format or journal article format, from the output file, with no manual intervention.
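
The three outcomes above can be pictured as a small batch driver. This is a minimal Python sketch under stated assumptions, not the production AppleScript or XSL: the record fields (`id`, `copyright`, `is_journal`), the copyright keys, and the footer strings are hypothetical placeholders. Only the batch size of 30 and the count of six footer variations come from the text.

```python
from itertools import islice

BATCH_SIZE = 30  # batch size reported in the text

# Six footer variations exist; the keys and wording here are placeholders.
FOOTERS = {f"type_{n}": f"Footer variation {n}" for n in range(1, 7)}


def batches(items, size=BATCH_SIZE):
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def layout_job(paper):
    """Choose the output format and footer for one paper record."""
    # An unknown copyright type raises KeyError instead of guessing a footer.
    footer = FOOTERS[paper["copyright"]]
    fmt = "journal" if paper["is_journal"] else "technical"
    return {"id": paper["id"], "format": fmt, "footer": footer}


# Hypothetical queue of 65 papers to show the batching behaviour.
papers = [
    {"id": n, "copyright": f"type_{n % 6 + 1}", "is_journal": n % 5 == 0}
    for n in range(65)
]
batch_sizes = [len(batch) for batch in batches(papers)]
jobs = [layout_job(paper) for paper in papers]
```

Driving the footer and format from data in each record is what removes the per-paper stylesheet decision that earlier sections describe as a source of delay and error.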

Our initiative at the end of 2013 and into early 2014 to significantly improve production efficiency and quality of SAE technical papers and journal articles proved to be highly successful, with 89% of the papers completed at the 6-week mark.

Improve Production Efficiency and Output Quality

Implementation wasn’t at an ideal time. When the tool and workflow process moved from development to production,

  • We were entering the busiest production season of our year,
  • The team was down one position,
  • The team had no actual production experience with the tool and was learning it in real time,
  • The tool had never been tested in full production.

Yet, considering all of this, we significantly decreased the time required to complete the publishing cycle.

Prior to 2014, our publishing cycle was 8 weeks, with the exception of 2011, when it took 9 weeks. Introducing our new tool in 2014, our efficiency increased significantly as we produced 1,477 papers in 7 weeks. For 2015, at the second-week mark, we noticed an increase of almost 50% in production over previous years (Fig. 4).

Fig. 4 - Production efficiency

We are experiencing steady growth in the volume of papers while the publishing time frame continues to be shortened. Average production time dropped from an all-time high of 3.8 hours per paper in 2012 to 2.8 hours in 2014 (Fig. 5).
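
For readers who want the figures in relative terms, the arithmetic can be worked through in a few lines of Python. The inputs are the numbers quoted in the text; the derived percentages and rates are our own calculation, not figures reported by SAE.

```python
def percent_reduction(before, after):
    """Relative drop from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


hours_2012 = 3.8    # average hours per paper, 2012
hours_2014 = 2.8    # average hours per paper, 2014
papers_2014 = 1477  # papers produced in the 2014 cycle
weeks_2014 = 7      # length of the 2014 cycle in weeks

reduction = percent_reduction(hours_2012, hours_2014)  # roughly 26.3 percent
papers_per_week = papers_2014 / weeks_2014             # 211 papers per week

print(f"Time per paper fell by {reduction:.1f}%")
print(f"2014 throughput: {papers_per_week:.0f} papers per week")
```

On these numbers, the drop of one hour per paper is a reduction of a little over a quarter.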

Fig. 5 - Average production time per paper

Next Steps

Technical papers are the product line that we used to pioneer our journey from print to digital publishing. Our goal is to reduce production time by 15% through continued enhancements to the InDesign stylesheet and automation of additional manual steps. The success we achieved, together with the lessons learned, is enabling us to continue our digital transformation. Our next print-to-digital initiative is our industry standards. This project will include XML conversion based on the NLM DTD and use of an InDesign stylesheet. It comprises two phases. Phase 1, currently underway, involves conversion of all current standards to XML. Phase 2 involves the process for digitizing new and revised standards.

Conclusion

Table 1
Strategic Initiative for Change

2008
  • Content management strategic initiative approved by Board of Directors
2009
  • CMS developed internally
  • Tagging schema created using JATS (NLM DTD V3)
  • Team formed
2010
  • CMS launched using XSL-FO stylesheet
  • Over 100 quality issues identified - white space and figure placement being the most significant
  • Journal article selection process - 21% of the technical papers published twice
2011
  • XSL-FO and publishing software upgrades
  • Focus on quality - reduction of white space and figure placement
  • Maintaining multiple stylesheets (different footers required based on document type, legacy or new content, and peer review requirement)
2012
  • Process change to journal selections coupled with development of XSL-FO stylesheet for journal papers - publish paper once
  • InDesign stylesheet developed for technical papers - improve efficiency, reduce costs, improve quality
2013 Q1-Q3
  • Dual publishing software
  • InDesign stylesheet not developed for journal articles - original XSL-FO stylesheet required
2013 Q4
  • Team fully trained in InDesign
  • New stylesheet vendor selected (InDesign and stylesheet development experts)
  • Agile process for stylesheet development
2014
  • Testing completed
  • Tool launched
  • Quality improved, production efficiencies gained

Consistency of the output and quality throughout increased while production time decreased. To achieve this success, the SAE/DCL team demonstrated what can be achieved through collaboration, subject matter expertise, and strong project management; we accomplished a lot in a very short time period. As we look back on the experience, several learnings surface.

The most important aspect is to fully understand the scope of the project. Accurately estimating the amount of work required is especially important when design and development deadlines are very aggressive. The agile development process employed by DCL was instrumental in managing scope creep. At the end of each iteration, which was sometimes daily, new requirements were reviewed and prioritized at the discretion of the SAE/DCL team.

Vendor selection, along with understanding the skills needed, is a key to success. Our first initiative to use InDesign was somewhat successful but not to the degree we needed. We based vendor selection on InDesign experience and not stylesheet development experience. Once we fully realized the benefits to be gained from using InDesign, our vendor search focused on extensive stylesheet development capability coupled with strong InDesign experience.

Know when to start with a clean slate. We attempted to build off of our first InDesign stylesheet and although it was good having some of the basics defined, it may have delayed progress a bit. As we begin looking at additional projects, we are taking a holistic view, focusing on the entire process and not a segment of it.

Be open to new possibilities. We initially approached the project with a particular end game in mind. This quickly changed as new ideas and thoughts challenged our thinking.

Rely heavily on the recommendations from your expert. The gains achieved are ten-fold.

Strong project management is a prerequisite. This provides an excellent development opportunity for one or more team members to broaden skills associated with project management such as team dynamics, risk assessment, planning, and prioritization.

Teamwork pays dividends. The combined knowledge of the team, SAE and DCL, helped produce a high-quality, efficient tool and workflow. Both organizations brought significant skills to the table, which enhanced the overall results.

Success was achieved through a focused effort with each partner coming to the table representing the best of the best. The partnership between SAE and DCL is one where both parties are fully committed to the overall goal of improving production efficiency and output quality.