INSIGHT INTO XML
XML
Angel
DCLnews caught up with her during a brief lull in her busy schedule and asked why she believes STM publishers should be embracing XML.

"I think the primary reason is that their mission has changed. In the old days, STM publishers were print. They used Quark, PageMaker, or internal publishing systems. They were page driven. But with the advent of PCs and the Internet, they realized they had to go electronic. And they thought, fine, we still only have two media to worry about. But the world has changed. They now have to publish journals and books in multiple electronic formats. And possibly several versions of print, not to mention CDs..."

DCLnews: On top of this, when it comes to journals, STM publishers not only have to publish on their own websites, but also post back issues on archival websites...

Debbie Lapeyre: "Exactly. For journals, there are a number of public archives. PubMed Central, for example, is archiving life sciences' journal literature. A coalition of universities and foundations is working on a very large project, which will archive thousands of journals for posterity, because they don't want the information lost. So transformation has become the name of the game - far more than it ever was. One thing that makes a big difference with this is ..."

DCLnews: Another aspect of XML is that it allows you to publish time-critical information fast and efficiently. What's your take on this?

Debbie Lapeyre: "The key example is medical reference titles, which are held in emergency rooms all over the country. Most are dog-eared and covered with sticky notes from people continually looking things up. Medical and drug reference publishers are keenly aware that if they can offer an electronic form for PCs and handheld devices, they will save medical staff thirty to sixty seconds a lookup. That could be the difference between life and death. If an emergency room has a patient who has taken X and Y drugs together, they can run a search on the XY combination and will have all the available information within seconds. That could really make a difference if the patient has stopped breathing."

DCLnews: That's very powerful - what other benefits are STM publishers getting from XML?

Debbie Lapeyre: "Well, a lot of the big reference publishers are using XML as a quality assurance tool. It's not saving them money or time. But it's making it possible to find errors that were either really expensive to find before, or simply couldn't be found before. Interestingly, they're using some fairly primitive techniques to do this. In XML and SGML you can compile a list out of any content that has an element [or tag] surrounding it. You can list out all the names of the journals in your citation list, for instance. Then run a program to check that against your authority file. Or list out the names of all human beings, look down the list, and if you see something like 'major'... well, it might be a person's name... but the odds are it's an error. Biology and medical publishers tag the names of diseases and drugs, or genus and species. That allows them to run lists of these things. And it doesn't take a human being much time to scan a list of genus and species names and spot an error like a journal title."

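The list-and-check technique described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (`ref-list`, `ref`, `surname`, `source`), the sample citations, and the authority list are all hypothetical and not taken from any publisher's DTD or from the interview.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the tag names and the citations are
# illustrative assumptions, not drawn from a real DTD or journal.
SAMPLE = """
<ref-list>
  <ref><surname>Smith</surname><source>J Biol Chem</source></ref>
  <ref><surname>major</surname><source>Nature</source></ref>
  <ref><surname>Jones</surname><source>J Boil Chem</source></ref>
</ref-list>
"""

# Hypothetical authority file, reduced to a set of approved journal names.
AUTHORITY_JOURNALS = {"J Biol Chem", "Nature"}


def list_element_text(root, tag):
    """Return the text of every element named `tag`, in document order."""
    return [(el.text or "").strip() for el in root.iter(tag)]


def not_in_authority(values, authority):
    """Return the sorted distinct values missing from the authority set."""
    return sorted(set(values) - authority)


root = ET.fromstring(SAMPLE)

# List every tagged journal name, then check the list by program.
journals = list_element_text(root, "source")
suspect_journals = not_in_authority(journals, AUTHORITY_JOURNALS)
print("Journals not in authority file:", suspect_journals)

# List every tagged surname for a person to scan by eye.
surnames = list_element_text(root, "surname")
print("Surnames to review:", sorted(surnames))
```

Here the misspelled journal name is caught by the program, while an odd surname such as 'major' is left for a human to spot in the printed list, matching the division of labor described in the interview.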
DCLnews: Somebody recently told me that publishers using XML have a checking system called False Color Proofs - what exactly is that?

Debbie Lapeyre: "Again, it's pretty basic and involves adding color to your text to make it even easier for a person to scan for errors. In the past, if you were checking reference citations, you'd have to do it on appearance. Look for the last name, followed by a comma, followed by initials, followed by a semicolon, followed by the short name of the journal. A real eye-strainer! But bring each of those things up on screen in different colors - put the surnames in green, journal titles in pink, and the year in obnoxious purple - and life is a whole lot easier! It's astonishing how fast you can scan for errors that way."

DCLnews: I was reading somewhere that many of the classification systems in science are changing, due to genome and other developments, and that this has had a real impact on STM publishing. Could you tell us more about that?

Debbie Lapeyre: "Take the field of biology, for example, which is currently going through a massive classification revolution. There's the old classification system that works on things like what a creature eats, where it lives, what it looks like, and its behavior. And then there's the new system based on either Cladist or genome disciplines. So there are now alternate ways of classifying things, which is a serious issue for publishers. Let's say you publish a multi-volume reference encyclopedia and you have an article on mountain lions, detailing where they live, what their behavior is, and so forth. All that will still be true - they still live on mountains and look a certain way. The reference material hasn't changed in that respect. But the creatures they're related to in the fossil record and what they are related to now - that's all changed due to genome research.

Then we've got the Cladists telling us that reptiles don't exist; that there is no way to define a reptile that doesn't take in birds. Because of this, a lot of journals today talk about 'avian dinosaurs', meaning birds. But the average encyclopedia still uses the term 'reptile' to talk about dinosaurs and doesn't include birds in the definition. So what has changed is the relationships, not the base information. Therefore we need multiple ways into the same information in modern reference works - multiple entrance points. With XML, and a good DTD or schema, we can show several different virtual Tables of Contents, so that it looks to the users like the content is organized in several dynamically different ways, while being stored only once."

NEXT MONTH we interview another leading figure in the world of XML and STM publishing - don't miss it!

DCLnews Editorial