Monday, November 12, 2007

New York City In a Semantic Web

Tim Krichel, in The Semantic Web and an Introduction to Resource Description Framework, makes a very astute analogy for understanding the technology behind the Semantic Web, particularly the nuances of XML and RDF. The goal is to move away from the present Web - where pages are essentially constructed for human consumption - to a Web where more information can be understood and processed by machines. The analogy goes like this:
We fit each car in New York City with a device that lets a reverse geographical position system read its movements. Suppose, in addition, that another machine can predict the weather or some other phenomenon that impacts traffic. Assume that a third kind of device has the public transport timetables. Then, data from a collaborative knowledge picture of these machines can be used to advise on the best means of transportation for reaching a certain destination within the next few hours.
The computer systems doing the calculations required for the traffic advisory are likely to be controlled by different bodies, such as the city authority or the national weather service. Therefore, there must be a way for software agents to process the information from the machine where it resides, to proceed with further processing of that information to a form in which a software agent of the final user can be used to query the dataset.
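To make the analogy a little more concrete, here is a minimal Python sketch of the idea behind RDF: each body publishes its facts as simple subject-predicate-object statements, and a software agent merges them and queries the combined picture. The route names, predicates, and values are all made up for illustration, and plain tuples stand in for real RDF syntax.

```python
# Hypothetical data: each organization publishes (subject, predicate, object) triples.
city_traffic = {
    ("route:fdr_drive", "congestion", "heavy"),
    ("route:broadway", "congestion", "light"),
}
weather_service = {
    ("route:fdr_drive", "weather", "rain"),
    ("route:broadway", "weather", "rain"),
}
transit_timetable = {
    ("route:subway_a", "next_departure_min", 4),
}


def merge(*sources):
    """Combine triples from independent sources into one graph (a set of triples)."""
    graph = set()
    for source in sources:
        graph |= source
    return graph


def query(graph, subject=None, predicate=None, obj=None):
    """Return all triples matching the given pattern; None acts as a wildcard."""
    matches = [
        t for t in graph
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]
    # Sort on string forms so mixed object types (str, int) do not clash.
    return sorted(matches, key=lambda t: (t[0], t[1], str(t[2])))


graph = merge(city_traffic, weather_service, transit_timetable)

# An end user's agent asks: which routes currently have light congestion?
light_routes = query(graph, predicate="congestion", obj="light")
print(light_routes)
```

The point of the sketch is that no single source needs to know about the others: because they share the same statement shape, the merge is just a union.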

Wednesday, November 07, 2007

Genre Searching

At today's SLAIS colloquium, Dr. Luanne Freund gave a presentation on Genre Searching: A Pragmatic Approach to Information Retrieval. Freund argues for taking a pragmatic approach to genre searching and genre classification, noting that there are two perspectives on pragmatics: socio-pragmatic and cognitive-pragmatic. Using a high-tech firm as a case study, Freund and her colleagues built a search engine called X-Cite, which pulls together documents from the corporate intranet (anything from FAQs to specialized manuals) along with tags. By ranking documents based on title, abstract, and keywords, the algorithm cuts down on the ambiguity and guesswork of searching. With the software engineering workplace as its starting domain, Freund believes that genre searching has the potential to make a significant contribution to the effectiveness of workplace search systems by incorporating genre weights into the ranking algorithm.
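As a rough illustration of what "incorporating genre weights into the ranking algorithm" could look like, here is a minimal Python sketch. It is not X-Cite's actual algorithm, which the talk did not spell out at this level; the genre names, the weights, and the multiplicative scoring rule are all assumptions of mine.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    genre: str
    base_score: float  # assumed relevance from title, abstract, and keyword matching


# Hypothetical genre weights for a software engineering workplace.
GENRE_WEIGHTS = {
    "faq": 1.5,
    "manual": 1.2,
    "presentation": 0.8,
}


def rank(docs, genre_weights, default_weight=1.0):
    """Order documents by base relevance multiplied by their genre weight."""
    return sorted(
        docs,
        key=lambda d: d.base_score * genre_weights.get(d.genre, default_weight),
        reverse=True,
    )


docs = [
    Document("Roadmap deck", "presentation", 0.70),
    Document("Install guide", "manual", 0.60),
    Document("Common errors", "faq", 0.55),
]

ranked_titles = [d.title for d in rank(docs, GENRE_WEIGHTS)]
print(ranked_titles)
```

In this toy example the presentation has the best raw match, but the FAQ ends up first once genre is taken into account, which is the kind of reordering genre weighting is meant to produce.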

In genre analysis, three steps must be taken:

(1) Identify - The core genre repertoire of the work domain

(2) Develop - A standard taxonomy to represent it

(3) Develop - Operational definitions of the genre classes in the taxonomy, including identifying features in terms of form, function and content to facilitate manual and automatic genre classification.

Throughout the entire presentation, my mind kept returning to the question: is this not another specialized form of social searching? A tailored search engine which narrows its search to a specific genre? Although the two are entirely different things, I keep thinking that creating your own search engine is certainly much easier.

Simple Knowledge Organization System (SKOS) & Librarians

Miles and Perez-Aguera's SKOS: Simple Knowledge Organization for the Web introduces SKOS, a Semantic Web language for representing structured vocabularies, including thesauri, classification schemes, subject heading systems, and taxonomies -- tools that cataloguers and librarians use every day in their line of work.

It's interesting that the very essence of librarianship and cataloging will play a vital role in the upcoming version of the Web. It's hard to fathom how this works: how can MARC records and the DDC have anything to do with the intelligent agents which form the layers of architecture of the Semantic Web and Web 3.0? The answer: metadata.

And even more importantly: the messiness and disorganization of the Web will require information professionals with the techniques and methods to reorganize everything coherently. Web 1.0 and 2.0 were about creating -- but the Semantic Web will be about order and regulation. Take a closer look at Miles & Perez-Aguera's article -- it's well worth a read. SKOS is designed to represent the following kinds of controlled, structured vocabularies:

(1) Thesauri - Broadly conforming to the ISO 2788:1986 guidelines, such as the UK Archival Thesaurus (UKAT, 2004), the General Multilingual Environmental Thesaurus (GEMET), and the Art and Architecture Thesaurus

(2) Classification Schemes - Such as the Dewey Decimal Classification (DDC), the Universal Decimal Classification (UDC), and the Bliss Classification (BC2)

(3) Subject Heading Systems - The Library of Congress Subject Headings (LCSH) and the Medical Subject Headings (MeSH)
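For a feel of what such a vocabulary looks like to a machine, here is a small Python sketch that models concepts with a preferred label, alternative labels, and "broader" links, loosely mirroring the SKOS properties skos:prefLabel, skos:altLabel, and skos:broader. The Concept class, the helper functions, and the sample terms are my own illustration, not anything taken from the article or from a real thesaurus.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare concepts by identity, not by field values
class Concept:
    pref_label: str                                  # cf. skos:prefLabel
    alt_labels: list = field(default_factory=list)   # cf. skos:altLabel
    broader: list = field(default_factory=list)      # cf. skos:broader (list of Concept)


def ancestors(concept):
    """Follow 'broader' links upward and return the concepts found, nearest first."""
    found = []
    queue = list(concept.broader)
    while queue:
        current = queue.pop(0)
        if current not in found:
            found.append(current)
            queue.extend(current.broader)
    return found


def find_by_label(concepts, label):
    """Look up a concept by preferred or alternative label, ignoring case."""
    wanted = label.lower()
    for concept in concepts:
        labels = [concept.pref_label] + concept.alt_labels
        if wanted in (text.lower() for text in labels):
            return concept
    return None


# Hypothetical mini-thesaurus.
info_science = Concept("Information science")
cataloguing = Concept("Cataloguing", alt_labels=["Cataloging"], broader=[info_science])
subject_cat = Concept("Subject cataloguing", broader=[cataloguing])
vocabulary = [info_science, cataloguing, subject_cat]

hit = find_by_label(vocabulary, "cataloging")
broader_labels = [c.pref_label for c in ancestors(subject_cat)]
print(hit.pref_label, broader_labels)
```

Real SKOS data would be expressed in RDF, but the structure is the same: labels for humans, explicit relationships for machines.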

Friday, November 02, 2007

New Librarians, New Possibilities?

Are newer, incoming librarians changing the profession? Maybe. But not yet. University Affairs has published an article called The New Librarians, which highlights some of the new ideas that newer librarians are bringing into academic libraries. Everyone's favourite University Librarian (at least for me), Jeff Trzeciak, who has his own blog, is featured in the piece. In it, he describes how he has swiftly hired new Library 2.0-ready librarians and overturned the traditional decor and culture of McMaster Library: a "café, diner-style booths, stand-up workstations, oversized ottomans, and even coffee tables with pillows on the floor will take their place, all equipped for online access. Interactive touch-screen monitors will line the wall."

University of Guelph Chief Librarian Michael Ridley similarly sees a future where the university library serves as an “academic town square,” a place that "brings people and ideas together in an ever-bigger and more diffuse campus. Services in the future will include concerts, lectures, art shows – anything that trumpets the joy of learning."

Is this the future of libraries? Yes. That's where we're heading -- that's where we'll end up. It is only a matter of time. Change is difficult, particularly in larger academic institutions where bureaucracy and politics play an essential role in all aspects of operations. There is great skepticism towards Jeff Trzeciak's drastic changes to McMaster Library -- he's either a pioneer if he succeeds, or an opportunist if he fails. A lot is riding on Jeff's shoulders.

Tuesday, October 30, 2007

Introducing Semantic Searching

Just as we had Google and Web 2.0 nearly figured out, the Semantic Web is just around the corner. Introducing hakia, one of the first truly Semantic Web search engines. As we have argued, the Semantic Web is a digital catalogue, and one of its key components is an understanding of ontologies and taxonomies. Built on Semantic Web technologies, hakia is a new "meaning-based" (semantic) search engine with the purpose of improving search relevancy and interactivity -- the potential benefits for end users are search efficiency, richness of information, and time savings. Will the hakia team be the next Brin and Page? Why don't you try it? Here are the elements that make up hakia:

(1) Ontological Semantics (OntoSem) - A formal and comprehensive linguistic theory of meaning in natural language. As such, it bears significantly on philosophy of language, mathematical logic, and cognitive science

(2) Query Detection and Extraction (QDEX) - A system invented to bypass the limitations of the inverted index approach when dealing with semantically rich data

(3) SemanticRank algorithm - Deploys a collection of methods to score and rank paragraphs that are retrieved from the QDEX system for a given query. The process includes query analysis, best sentence analysis, and other pertinent operations

(4) Dialogue - In order to establish a human-like dialogue with the user, the dialogue algorithm's goal is to convert the search engine's role into a computerized assistant with advanced communication skills while utilizing the largest amount of information resources in the world.

(5) Search mission - Google's mission was to organize the world's information and make it universally accessible and useful. hakia's mission is to search for better search.
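To see what the "inverted index approach" mentioned under QDEX refers to, here is a minimal Python sketch of a classic keyword inverted index. To be clear, this is the conventional technique that QDEX is said to bypass, not hakia's own code, and the sample documents and function names are invented for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents.
docs = {
    1: "jaguar speed on land",
    2: "jaguar car top speed",
    3: "fast land animals",
}
index = build_inverted_index(docs)

keyword_hits = search(index, "jaguar speed")
meaning_miss = search(index, "fast cat")
print(keyword_hits, meaning_miss)
```

The first query matches both the animal and the car because the index only sees words, and the second finds nothing because no document contains the literal word "cat". Those are the kinds of limitations a meaning-based engine sets out to address.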