Friday, November 30, 2007

Digital Libraries in the Semantic Age

Brian Matthews of CCLRC Appleton Laboratory offers some interesting insights in Semantic Web Technologies. In particular, he argues that libraries are increasingly converting themselves to digital libraries. A key aspect for the Digital library is the provision of shared catalogues which can be published and browsed. This requires the use of common metadata to describe the fields of the catalogue (such as author, title, date, and publisher), and common controlled vocabularies to allow subject identifiers to be assigned to publications.

As Matthew proposes, by publishing controlled vocabularies in one place, which can then be accessed by all users across the Web, library catalogues can use the same Web-accessible vocabularies for cataloguing, marking up items with the most relevant terms for the domain of interest. Therefore, search engines can use the same vocabularies in their search to ensure that the most relevant items of information are returned.

The Semantic Web opens up the possibility to take such an approach. It offers open standards that can enable vendor-neutral solutions, with a useful flexibility (allowing structured and semi-structured data, formal and informal descriptions, and an open and extensible architecture) and it helps to support decentralized solutions where that is appropriate. In essence, RDF can be used as this common interchange for catalogue metadata and shared vocabulary, which can then be used by all libraries and search engines across the Web.

But in order to use the Semantic Web to its best effect, metadata needs to be published in RDF formats. There are several initiatives involved with defining metadata standards, and some of them are well known to librarians:

(1) Dublin Core Metadata Initiative

(2) MARC

(3) ONIX

(3) PRISM

No comments: