

Karen Vagts: XML Portfolio


Assignment 7

Write a paper detailing XML applications in some aspect of libraries.   Details in class, but the paper must be extremely well-written, no longer than 5 pages.  See the class management page. Due 4 April 2007.

For this assignment, we are to write a "white paper" describing the benefits of XML as applied to the focus we first addressed in Assignment 1 (see the XML Resources page). Accordingly, I discussed geo-spatial applications of XML, as shown here.

The Benefits of XML for Geospatial Data Management in LIS Applications

The digital era has moved maps and related geospatial data from dusty, often neglected flat files and tubes in the basements of libraries to the forefront of information collections.  Digital libraries and archives, geographical information systems (GIS), and web sites such as Google Earth, Google Maps Mashups, and Mapquest have made maps accessible, informative, and fun to a large audience.  New technologies enable dynamic linking of previously separate geospatial sources – maps, gazetteers, geospatial taxonomies and vocabularies, statistics, and histories of places. At the same time, many of the global challenges we face – environmental changes, natural disasters, political and military conflicts, among others – have made us more aware of, and curious about, the planet we inhabit; maps help us understand and analyze these challenges.

Reflecting its pre-digital roots, much geospatial data has evolved within separate containers or “silos,” limiting its ability to be shared or combined with other sources. Such silos include static web pages, graphic files, databases, GIS projects, geographically-oriented thesauri and authorities, and text documents.1

 

Enabling data exchange, or interoperability, among these sources requires an exchange mechanism.  One of the best and most easily accessible such mechanisms is Extensible Markup Language (XML).

Over the past decade, various XML-based geospatial applications have emerged and, with XML vocabularies or schemas such as Google’s Keyhole Markup Language (KML) and the Open Geospatial Consortium’s Geography Markup Language (GML), more are being created daily. This paper discusses three uses of geospatial XML for LIS applications: data exchange, data transformation, and links between data and graphics.

XML-based Data Exchange

A primary purpose of XML-based applications is to exchange data among multiple sources. Such interoperability can be achieved by using an XML data file in conjunction with other technologies: XML-based standards such as Extensible Stylesheet Language Transformations (XSLT) and schemas; general-purpose (non-XML) tools such as Java, JavaScript, and PHP; and industry-focused applications such as MARC-based integrated library systems (ILS) or GIS software.

For data exchange to work, the data sources involved need an exchange “hook,” a set of common elements that can serve as a bridge in the same manner that a book’s International Standard Book Number (ISBN) can link a MARC-based catalog record to a publisher’s website. In geospatial applications, possible common hooks include Earth coordinate systems, such as latitude and longitude or the Universal Transverse Mercator (UTM) system. Provided the sources in question reference the same coordinate system, an application can draw data from a variety of sources such as the Getty Thesaurus of Geographic Names, the United States Geological Survey (USGS), the U.S. Census, and bibliographic records created in the MARC or Dublin Core formats. Other possible hooks include place names (although these are more variable), USGS geographical feature IDs, and geographic country codes developed by such authorities as the International Organization for Standardization (ISO).

Geospatial XML-based data exchange occurs in various ways. One approach is to populate a new or empty field in an existing data set with the contents of another. Another is consolidation: creating an entirely new data set composed of data from multiple sources. Such exchanges can be one-time affairs, as when updating bibliographic records in a library catalog to reflect a permanent change of country name, or continual, dynamic processes.
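The first approach, filling an empty field by way of a shared hook, can be sketched in a few lines of Python using the standard library's XML parser. This is only an illustration under stated assumptions: the element names, the `featureId` attribute, and the sample records are hypothetical and do not come from any real catalog or gazetteer schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog records with empty coordinate fields.
CATALOG_XML = """
<catalog>
  <record featureId="F001"><title>Harbor chart of Boston</title><latitude/><longitude/></record>
  <record featureId="F002"><title>Street map of Austin</title><latitude/><longitude/></record>
</catalog>
"""

# Hypothetical gazetteer entries that share the same feature ID (the "hook").
GAZETTEER_XML = """
<gazetteer>
  <place featureId="F001"><name>Boston</name><lat>42.3500</lat><lon>-71.0500</lon></place>
</gazetteer>
"""


def fill_coordinates(catalog_xml, gazetteer_xml):
    """Copy coordinates from gazetteer places into catalog records with a matching featureId."""
    catalog = ET.fromstring(catalog_xml)
    gazetteer = ET.fromstring(gazetteer_xml)
    # Index the gazetteer by the shared hook so each record is a single lookup.
    places = {place.get("featureId"): place for place in gazetteer.iter("place")}
    for record in catalog.iter("record"):
        place = places.get(record.get("featureId"))
        if place is None:
            continue  # No matching place: leave the record's fields empty.
        record.find("latitude").text = place.findtext("lat")
        record.find("longitude").text = place.findtext("lon")
    return catalog


if __name__ == "__main__":
    merged = fill_coordinates(CATALOG_XML, GAZETTEER_XML)
    print(ET.tostring(merged, encoding="unicode"))
```

Records without a match in the second source are left untouched, which keeps the exchange safe to repeat as the gazetteer grows.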

XML-Based Data Transformation

In addition to basic data exchange among geospatial sources, XML can be used to transform data, that is, to convert data values to reflect different needs and requirements. A typical use, building on the above example of geographic coordinates, is to convert location data stored in one format into another, such as switching location coordinates from degrees-minutes-seconds (Lat: 42 21 00 N, Long: 071 03 00 W) to decimal degrees (Lat: 42.3500, Long: -71.0500) and vice versa. Transformations can also be used to crosswalk between maps that use the now-predominant prime meridian (0° longitude at Greenwich, England) and those that use other coordinate systems, as is required with location points in older maps and gazetteers. XML’s transformative capabilities can be applied to reconciling different map projection systems (Mercator, Peters, etc.), map scales, and units of measurement (metric versus imperial units). Conversions and other calculations on data stored within an XML data file can be done either with other XML-based technologies such as XSLT and XPath (and, eventually, XQuery) or with non-XML technologies that can handle mathematical calculations. XML can assist as a transfer mechanism for these processes or serve as a final repository where converted data is stored.
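The coordinate conversion in the example above is simple arithmetic: decimal degrees equal degrees plus minutes/60 plus seconds/3600, with south and west values made negative. A minimal Python sketch follows, standing in for the "non-XML technologies" mentioned above. The `<location>`, `<lat>`, and `<long>` element names are hypothetical, and the reverse conversion rounds to whole seconds.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment holding the coordinates from the example above.
LOCATION_XML = "<location><lat>42 21 00 N</lat><long>071 03 00 W</long></location>"


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees-minutes-seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -value if hemisphere.upper() in ("S", "W") else value


def decimal_to_dms(value, is_latitude):
    """Convert signed decimal degrees to (degrees, minutes, seconds, hemisphere)."""
    if is_latitude:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    # Work in whole seconds to avoid floating-point drift such as 2 min 60 sec.
    total_seconds = round(abs(value) * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds, hemisphere


def parse_dms(text):
    """Parse a string such as '071 03 00 W' into signed decimal degrees."""
    degrees, minutes, seconds, hemisphere = text.split()
    return dms_to_decimal(int(degrees), int(minutes), float(seconds), hemisphere)


def convert_location(xml_text):
    """Rewrite the lat/long values of a location element as decimal degrees."""
    location = ET.fromstring(xml_text)
    for tag in ("lat", "long"):
        element = location.find(tag)
        element.text = f"{parse_dms(element.text):.4f}"
    return location


if __name__ == "__main__":
    print(ET.tostring(convert_location(LOCATION_XML), encoding="unicode"))
```

Run as a script, this prints `<location><lat>42.3500</lat><long>-71.0500</long></location>`, matching the decimal values given in the text.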

Another, somewhat thornier, example of transformation involves place names. These are more subjective than mathematical data, reflecting factors ranging from political controversies and historical developments (Burma versus Myanmar, Taiwan versus Republic of China or even Formosa, Persia versus Iran) to variations in language and spelling. XML transformations can address these disputes in various ways. One would be to do an absolute replacement of one term with another, for example, replacing Ceylon with Sri Lanka in older OPAC records. Another, frequently more diplomatic, approach is to link, group, and cross-reference competing names and their variants. Examples of this approach are seen in the Getty Thesaurus of Geographic Names and the Alexandria Digital Library Gazetteer, both of which employ XML.
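The cross-referencing approach can be illustrated with a small sketch in which every name for a place, current or historical, resolves to the same grouped record. The XML structure, the `type` attribute values, and the choice of which name is marked "preferred" are assumptions made for illustration only; they do not reproduce the actual schema of either gazetteer named above.

```python
import xml.etree.ElementTree as ET

# Hypothetical gazetteer fragment grouping the variant names of each place.
GAZETTEER_XML = """
<gazetteer>
  <place id="P1">
    <name type="preferred">Sri Lanka</name>
    <name type="historical">Ceylon</name>
  </place>
  <place id="P2">
    <name type="preferred">Iran</name>
    <name type="historical">Persia</name>
  </place>
</gazetteer>
"""


def build_name_index(gazetteer_xml):
    """Map every name variant (case-insensitively) to its place's full name group."""
    index = {}
    for place in ET.fromstring(gazetteer_xml).iter("place"):
        names = [name.text for name in place.iter("name")]
        preferred = place.findtext("name[@type='preferred']")
        for name in names:
            index[name.casefold()] = {"preferred": preferred, "all_names": names}
    return index


def lookup(index, name):
    """Return the name group for a place name, or None if the name is unknown."""
    return index.get(name.casefold())


if __name__ == "__main__":
    index = build_name_index(GAZETTEER_XML)
    print(lookup(index, "Ceylon"))
```

Because no name is overwritten, a search on either "Ceylon" or "Sri Lanka" reaches the same record, which is what makes this approach more diplomatic than outright replacement.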

Data to graphics links

XML-based geospatial applications needn’t be confined to text-based data.  Graphics – the essence of cartography – can also be involved and in several ways.
One approach is to use XML’s own graphic vocabulary, Scalable Vector Graphics (SVG), a W3C standard for vector-based images. SVG is a non-proprietary, text-based file format containing not only definitions for graphic shapes created within the file but also elements that other XML-based data sources can connect to. A map of the United States of America, for example, can contain separate SVG elements for all 50 states, each of which carries attributes or child elements holding geographic location coordinates, state names or abbreviations, U.S. Census TIGER data, or another “hook” that can latch onto an external data source and be programmed to permit interactivity. The National Cancer Institute Interactive Mortality Charts and Graphs (http://www3.cancer.gov/atlasplus/charts508.html) and the Clare County Library (Ireland) Historical Maps (http://www.clarelibrary.ie/eolas/coclare/maps/index.htm) websites show examples of SVG graphics linked to data tables.
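To make the idea of a hook inside an SVG file concrete, the sketch below treats each shape's `id` attribute as a state abbreviation and uses it to pull a value from an external data table, then shades the shape and attaches a tooltip. The two-rectangle "map," the sample values, and the threshold are all hypothetical; a real state map would use path outlines rather than rectangles.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialize SVG elements without a namespace prefix.
ET.register_namespace("", SVG_NS)

# Hypothetical stand-in for a state map: one shape per state, keyed by id.
SVG_MAP = """
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">
  <rect id="TX" x="10" y="10" width="80" height="80"/>
  <rect id="MA" x="110" y="10" width="80" height="80"/>
</svg>
"""

# Hypothetical external data table keyed by the same state abbreviations.
SAMPLE_VALUES = {"TX": 12.0, "MA": 30.0}


def shade_states(svg_text, values, threshold):
    """Color each shape by its linked data value and add the value as a tooltip."""
    root = ET.fromstring(svg_text)
    # Materialize the list first because child elements are added while looping.
    for shape in list(root.iter(f"{{{SVG_NS}}}rect")):
        value = values.get(shape.get("id"))
        if value is None:
            continue  # No data for this shape: leave it unstyled.
        shape.set("fill", "darkred" if value >= threshold else "lightgray")
        # An SVG <title> child is shown by browsers as a hover tooltip.
        title = ET.SubElement(shape, f"{{{SVG_NS}}}title")
        title.text = f"{shape.get('id')}: {value}"
    return root


if __name__ == "__main__":
    shaded = shade_states(SVG_MAP, SAMPLE_VALUES, threshold=20.0)
    print(ET.tostring(shaded, encoding="unicode"))
```

The resulting file is still plain SVG, so it can be opened directly in a browser or passed on to further XML processing.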

Although SVG, being XML-based, is a logical choice for linking graphics to other data, other vector-based tools, such as GIS and CAD software (for example, ESRI ArcInfo) and Adobe (formerly Macromedia) Flash, can also enable links between graphic elements and XML-based data.

Summary

The three XML applications described above are just a sampling of the possibilities that XML offers LIS-oriented geospatial work, especially as the XML family of standards is constantly evolving. For LIS professionals managing geospatial data, XML provides the means to extract spatial information from isolated data silos and deploy it in countless applications, in the process providing new ways to study our planet.

1 In the past decade, terms like geospatial data and geospatial data management have begun to replace the traditional LIS terms of maps and map librarianship. Geospatial data reflects contemporary map production, which is achieved both with relatively simple online tools such as Google Maps and with complicated proprietary GIS software such as ESRI ArcInfo. Such tools create maps on the fly, combining disparate sources such as vector shapes, data tables, and satellite images to produce multiple output formats, from classic choropleth maps printed at a set scale and paper size to resizable and interactive maps for display in web pages, GPS-enabled handheld devices, multimedia, and many other forms of digital output.