Transforming Analog Data into Discoverable Structured Data
Fresh Meadows, NY, January 29, 2019 – DCL, an industry leader in structured data and content transformations, and The New York Public Library (NYPL) have completed the initial phase of a project to extract data and organize historical records of the United States Copyright Office, making those records searchable and increasingly accessible.
The first phase of the project, which supports the Library’s mission to make information accessible to all, calls for extracting and parsing the data contained in hundreds of thousands of pages of mid-20th-century federal copyright records spanning 1923 to 1964. While the data is available on GitHub, later phases will make it accessible to all members of the public via a web-based platform, transforming the laborious, manual process of searching copyright records into a much simpler task.
Each year, millions of people interact with the Library’s content, including databases, online classes and programs, digitized collection items (including manuscripts and photographs), and more. The data from the U.S. Copyright Office will add another element, giving the public the ability to discover content, narrow search results, identify relevant records, and view both machine-readable text and an image of the printed record.
From the Library’s perspective, extracting this data also helps the organization track and find copyright data on printed works, giving it the tools to quickly determine how to digitize more books and other creative works and make them widely accessible.
“Extracting data from the copyright records is of vital importance to the public and to the copyright industries that make up a significant part of the U.S. economy,” explains Sean Redmond, Senior Product Manager at NYPL. “Creating a searchable and accessible database also benefits the scholarly community interested in various aspects of the creation, production, and ownership of creative works.”
“We’re proud of DCL’s ability to harness leading-edge technology to meet challenging information projects like these,” adds Mark Gross, President at DCL. “By extracting and structuring over a hundred years of unstructured copyright records into an accurate unified database, NYPL will unlock previously buried information to create an important resource for researchers and the public, world-wide, to further facilitate NYPL’s mission.”
DCL is an active member of industry organizations that support the management and effective interchange of data and content.
This project was made possible with funding from Arcadia, a charitable fund of Lisbet Rausing and Peter Baldwin, and the Ford Foundation.
About DCL
DCL (www.dclab.com) provides data and content transformation services and solutions. Using the latest innovations in artificial intelligence, including machine learning and natural language processing, DCL helps businesses organize and structure data and content for modern technologies and platforms. With expertise across many industries including publishing, life sciences, government, manufacturing, technology and professional organizations, DCL uses its advanced technology and U.S.-based project management teams to solve the most complex conversion challenges securely, accurately and on time. Founded in 1981, DCL was named one of EContent's Top 100 Companies in the Digital Content Industry.
About The New York Public Library
The New York Public Library (www.nypl.org) is a free provider of education and information for the people of New York and beyond. With 92 locations—including research and branch libraries—throughout the Bronx, Manhattan, and Staten Island, the Library offers free materials, computer access, classes, exhibitions, programming, and more to everyone from toddlers to scholars, and has seen record numbers of attendance and circulation in recent years. The New York Public Library serves nearly 17 million patrons who come through its doors annually and millions more around the globe who use its resources at nypl.org. To offer this wide array of free programming, The New York Public Library relies on both public and private funding. Learn more about how to support the Library at nypl.org/support.