INSIGHT INTO XML
NEW SERIES!
DCLnews talked to leading figures from the world of Scientific,
Technical, & Medical publishing about why STM publishers should
be embracing XML. The answers were illuminating...
This is the first in a series of important interviews.
XML Angel
XML not only makes life easier and more profitable for
STM publishers, it can also be a life-saver, reveals
mark-up expert Debbie Lapeyre in the first of our "Insight into
XML" interviews.
DEBBIE LAPEYRE (pictured) is Vice President of Mulberry
Technologies, Inc., an XML and SGML consultancy that specializes
in information and document analysis, DTD and schema construction, and
design and training. A veteran of the mark-up trenches, she's a recognized
authority on XML, and has developed and taught classes on the business
applications of XML and XML for print production, amongst other subjects.
DCLnews caught up with her during a brief lull in her busy schedule and asked
why she believes STM publishers should be embracing XML.
"I think the primary reason is that their mission has changed. In the old days,
STM publishers were print. They used Quark, PageMaker, or internal publishing
systems. They were page driven. But with the advent of PCs and the Internet,
they realized they had to go electronic. And they thought, fine, we
still only have two media to worry about. But the world has changed.
They now have to publish journals and books in multiple electronic formats.
And possibly several versions of print, not to mention CDs..."
DCLnews: On top of this, when it comes to journals, STM publishers not only have
to publish on their own websites, but also post back issues on archival
websites...
Debbie Lapeyre: "Exactly. For journals, there are a number of public
archives. PubMed Central, for example, is archiving life sciences journal
literature. A coalition of universities and foundations is working on a very
large project, which will archive thousands of journals for posterity, because
they don't want the information lost. So transformation has become the
name of the game - far more than it ever was. One thing that makes a
big difference with this is XSLT
(Extensible Stylesheet Language Transformations). It basically makes
it possible to transform documents from one XML DTD to another. Content tagged
to Wiley InterScience's DTD, for example, can go into HighWire's, Ovid's,
PMC's, or Cadmus'. So that really opens up the field."
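To make the idea of DTD-to-DTD transformation concrete, here is a minimal sketch. XSLT is not part of Python's standard library, so the sketch imitates the simplest kind of stylesheet, a set of element renames, using xml.etree.ElementTree instead. The tag names on both sides are invented for illustration and are not taken from any real publisher's DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a "house" tag set to an "archive" tag set.
# Real DTD-to-DTD conversions also reorder and restructure content.
RENAMES = {
    "art": "article",
    "ttl": "article-title",
    "au": "contrib",
    "jn": "journal-title",
}

def convert(source_xml: str) -> str:
    """Rename every element that has an entry in RENAMES; keep the rest."""
    root = ET.fromstring(source_xml)
    for elem in root.iter():
        elem.tag = RENAMES.get(elem.tag, elem.tag)
    return ET.tostring(root, encoding="unicode")

house = ("<art><ttl>Drug interactions</ttl><au>Smith J</au>"
         "<jn>J Hypothetical Med</jn></art>")
archive = convert(house)
print(archive)
```

A real conversion would normally be written as an XSLT stylesheet and would handle attributes, ordering, and content the target DTD does not allow; the point here is only that the same stored content can be re-expressed in another tag set mechanically.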
DCLnews: Another aspect of XML is that it allows you to publish time-critical
information quickly and efficiently. What's your take on this?
Debbie Lapeyre: "The key example is medical reference titles, which are held
in emergency rooms all over the country. Most are dog-eared and covered
with sticky notes from people continually looking things up. Medical
and drug reference publishers are keenly aware that if they can offer their titles in electronic form for PCs and handheld devices, they will save medical staff thirty to sixty seconds
a lookup. That could be the difference between life and death. If an
emergency room has a patient who has taken X and Y drugs together, they
can run a search on the XY combination and will have all the available
information within seconds. That could really make a difference if the
patient has stopped breathing."
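The combination search described above can be sketched as a lookup keyed on the set of drugs involved, so the order in which the names are typed does not matter. This is only an illustration under stated assumptions: the element names, the drug names, and the advice text are all placeholders, not real reference data.

```python
import xml.etree.ElementTree as ET

# Placeholder reference content; the markup and entries are hypothetical.
REFERENCE = """
<interactions>
  <interaction>
    <drug>drug-x</drug><drug>drug-y</drug>
    <advice>Placeholder advice for X with Y.</advice>
  </interaction>
  <interaction>
    <drug>drug-x</drug><drug>drug-z</drug>
    <advice>Placeholder advice for X with Z.</advice>
  </interaction>
</interactions>
"""

def build_index(xml_text):
    """Map each unordered set of drug names to its advice text."""
    index = {}
    for item in ET.fromstring(xml_text).iter("interaction"):
        key = frozenset(d.text.strip().lower() for d in item.findall("drug"))
        index[key] = item.findtext("advice")
    return index

INDEX = build_index(REFERENCE)

def lookup(*drugs):
    """Return the advice for a drug combination, or None if not listed."""
    return INDEX.get(frozenset(d.strip().lower() for d in drugs))

found = lookup("Drug-Y", "drug-x")
print(found)
```

Because the drug names are tagged rather than buried in running text, the index can be built directly from the markup.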
DCLnews: That's very powerful - what other benefits are STM publishers getting
from XML?
Debbie Lapeyre: "Well, a lot of the big reference publishers are using XML
as a quality assurance tool. It's not saving them money or time. But
it's making it possible to find errors that were either really expensive
to find before, or simply couldn't be found before. Interestingly, they're
using some fairly primitive techniques to do this. In XML and SGML you
can compile a list out of any content that has an element [or tag] surrounding
it. You can list out all the names of the journals in your citation
list, for instance. Then run a program to check that against your authority
file. Or list out the names of all human beings, look down the list,
and if you see something like 'major'... well, it might be a person's
name... but the odds are it's an error. Biology
and medical publishers tag the names of diseases and drugs, or genus
and species. That allows them to run lists of these things. And it doesn't
take a human being much time to scan a list of genus and species names
and spot an error like
a journal title."
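The list-and-check technique described above can be sketched in a few lines. This is a minimal illustration, assuming a made-up citation markup (ref-list, ref, name, source, year) and a tiny made-up authority file; none of these names come from a real DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical citation list; the third entry contains two planted errors.
CITATIONS = """
<ref-list>
  <ref><name>Smith J</name><source>J Example Biol</source><year>2001</year></ref>
  <ref><name>Jones A</name><source>Ann Sample Med</source><year>1999</year></ref>
  <ref><name>major</name><source>J Exmaple Biol</source><year>2002</year></ref>
</ref-list>
"""

# Hypothetical authority file of approved journal names.
AUTHORITY = {"J Example Biol", "Ann Sample Med"}

def list_element(xml_text, tag):
    """Return the sorted, de-duplicated text of every element named `tag`."""
    root = ET.fromstring(xml_text)
    return sorted({(e.text or "").strip() for e in root.iter(tag)})

# Automatic check: journal names that are missing from the authority file.
journals = list_element(CITATIONS, "source")
unknown = [j for j in journals if j not in AUTHORITY]

# Manual check: a short list a person can scan for oddities such as "major".
names = list_element(CITATIONS, "name")

print("Not in authority file:", unknown)
print("Names to eyeball:", names)
```

The journal check runs without human help, while the name list still relies on a person spotting that "major" is unlikely to be an author.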
DCLnews: Somebody recently told me that publishers using XML have a checking
system called False Color Proofs - what exactly is that?
Debbie Lapeyre: "Again, it's pretty basic and involves adding color to your
text to make it even easier for a person to scan for errors. In the
past, if you were checking reference citations, you'd have to do it
on appearance. Look for the last name, followed by a comma, followed
by initials, followed by a semicolon, followed by the short name of
the journal. A real eye-strainer! But bring each of those things up
on screen in different colors - put the surnames in green, journal titles
in pink, and the year in obnoxious purple - and life is a whole lot
easier! It's astonishing how fast you can scan for errors that way."
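One way to picture a false color proof is as a rendering step that wraps each tagged piece of a citation in its own color. The sketch below is an assumption about how such a proof might be generated, not a description of any publisher's actual tool; the tag names and the color choices are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical tag-to-color table, echoing the colors mentioned above.
COLORS = {"surname": "green", "source": "deeppink", "year": "purple"}

def false_color(ref_xml):
    """Render one citation as HTML, coloring each element listed in COLORS."""
    ref = ET.fromstring(ref_xml)
    parts = [escape(ref.text or "")]
    for child in ref:
        text = escape(child.text or "")
        color = COLORS.get(child.tag)
        if color:
            parts.append(f'<span style="color:{color}">{text}</span>')
        else:
            parts.append(text)
        # The tail holds punctuation that follows the element, e.g. ", ".
        parts.append(escape(child.tail or ""))
    return "".join(parts)

REF = ("<ref><surname>Smith</surname>, <initials>J</initials>; "
       "<source>J Example Biol</source> <year>2001</year></ref>")
proof = false_color(REF)
print(proof)
```

If a journal title were mistakenly tagged as a surname, it would show up in the wrong color, which is what makes the error easy to spot.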
DCLnews: I was reading somewhere that many of the classification systems in science
are changing, due to genome and other developments, and that this has
had a real impact on STM publishing. Could you tell us more about that?
Debbie Lapeyre: "Take the field of biology, for example, which is currently
going through a massive classification revolution. There's the old classification
system that works on things like what a creature eats, where it lives,
what it looks like, and its behavior. And then there's the new system
based on either cladistic or genomic methods. So
there are now alternative ways of classifying things, which is a serious
issue for publishers.
Let's say you publish a multi-volume reference encyclopedia and you have an
article on mountain lions, detailing where they live, what their behavior
is, and so forth. All that will still be true - they still live on mountains
and look a certain way. The reference material hasn't changed in that
respect. But the creatures they're related to in the fossil record and
what they are related to now - that's all changed due to genome research.
Then we've got the cladists telling us that reptiles don't exist; that there is
no way to define a reptile that doesn't take in birds. Because of this,
a lot of journals today talk about 'avian dinosaurs', meaning birds.
But the average encyclopedia still uses the term 'reptile' to talk about
dinosaurs and doesn't include birds in the definition.
So what has changed is the relationships, not the base information.
Therefore we need multiple ways into the same information in modern
reference works - multiple entrance points. With XML, and a good
DTD or schema, we can show several different virtual Tables
of Contents, so that it looks to the users like the content
is organized in several dynamically different ways, while being
stored only once."
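The "stored once, organized several ways" idea can be sketched by tagging each article with more than one classification and building each table of contents on demand. The markup, the attribute names, and the group labels below are simplified assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Each article is stored once and carries two hypothetical classifications.
ENCYCLOPEDIA = """
<encyclopedia>
  <article title="Mountain lion" traditional="Mammals" clade="Synapsida"/>
  <article title="Sparrow" traditional="Birds" clade="Sauropsida"/>
  <article title="Crocodile" traditional="Reptiles" clade="Sauropsida"/>
</encyclopedia>
"""

def table_of_contents(xml_text, scheme):
    """Group article titles by the attribute named `scheme`."""
    root = ET.fromstring(xml_text)
    toc = defaultdict(list)
    for article in root.iter("article"):
        toc[article.get(scheme)].append(article.get("title"))
    return {group: sorted(titles) for group, titles in sorted(toc.items())}

traditional = table_of_contents(ENCYCLOPEDIA, "traditional")
cladistic = table_of_contents(ENCYCLOPEDIA, "clade")

print(traditional)
print(cladistic)
```

In the traditional view the sparrow and the crocodile sit in separate sections; in the cladistic view they appear together, yet neither article has been duplicated or rewritten.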
NEXT MONTH we interview another leading figure in the world of XML and
STM publishing - don't miss it!
DCLnews Editorial
Comments and correspondence to: DCLnews@dclab.com