Thursday, May 29, 2008

Day 4 of TEI/XML Bootcamp

Day 4 has come and gone. What did I learn? XML is not easy. Programming is an even tougher business, not for the faint of heart or mind. The main challenge, the one that made my head spin, was learning the complexities behind XHTML and XSLT. XHTML is a powerful tool for the construction of the Semantic Web. Most people are acquainted with the "meta" tags, which can be used to embed metadata about the document as a whole. Yet there are more powerful, granular techniques available too. Although largely unused by web authors, XHTML and XSLT offer numerous facilities for introducing semantic hints into markup, allowing machines to infer more about the web page content than just the text. These tools include the "class" attribute, used most often with CSS stylesheets. Applied strictly and consistently, these hints allow a machine to extract data from a document intended for human consumption.
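
To make the "class" attribute idea concrete for myself, here is a minimal Python sketch of a machine pulling data out of a human-readable page. It is not something from the bootcamp: the sample page, the class names "book", "title" and "author", and the function name extract_books are all my own assumptions, chosen only for illustration. It uses Python's standard xml.etree.ElementTree module, which works here because XHTML is well-formed XML.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so tag lookups must include it.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample page; the class names are assumptions for illustration.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Reading list</title></head>
  <body>
    <div class="book">
      <span class="title">An Example Book</span> by
      <span class="author">A. Author</span>
    </div>
  </body>
</html>"""


def extract_books(xhtml_text):
    """Return one dict per element marked class="book"."""
    root = ET.fromstring(xhtml_text)
    books = []
    for div in root.iter(f"{{{XHTML_NS}}}div"):
        # The class attribute is a space-separated list of names.
        if "book" not in div.get("class", "").split():
            continue
        record = {}
        for span in div.iter(f"{{{XHTML_NS}}}span"):
            for name in span.get("class", "").split():
                if name in ("title", "author"):
                    record[name] = (span.text or "").strip()
        books.append(record)
    return books


print(extract_books(PAGE))
```

The page still reads normally in a browser, and the same class names could be styled with CSS. The extraction only works if the author applies the class names consistently, which is the "strict application" part.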

Although there have been several proposals for embedding RDF inside HTML pages, the technique of using XSLT transformations has much broader appeal. Not everyone is keen to learn RDF, and that presents a barrier to the creation of semantically rich web pages. Using XSLT gives web developers a way to add semantic information with minimal extra effort. Dan Connolly of the W3C has conducted quite a number of experiments in this area, including HyperRDF, which extracts RDF statements from suitably marked-up XHTML pages.

What can librarians do?
Resource Description and Access (RDA) is just around the corner, and there is much buzz (good and bad) that it's going to change the way librarians and catalogers think about information science and librarianship. I encourage information professionals to be aware of the changes to come. Although most are not going to be involved directly with the Semantic Web, they can keep abreast of developments, particularly the exciting ones in information organization and classification. Workshops and presentations about RDA are out in droves. Pay attention. Stay tuned. There could be relevance in these new developments that spills over into the SemWeb.

