Wednesday, November 21, 2007
Postmodern Librarian - Part Two
In my opinion, perhaps this is where Web 2.0 comes in. Although the postmodern information order is not clear to us, it seems to be the dynamic behind Web 2.0, in which interactive tools such as blogs, wikis, and RSS facilitate social networking and the anarchic storage and unrestrained distribution of content. According to Joint, many of our professional efforts to impose a realist-modernist model on our libraries will fail. The old LIS model needs to be re-theorized, just as Newtonian Physics had to evolve into Quantum Theory, in recognition of the fact that super-small particles simply were not physically located where Newtonian Physics said they should be. In this light, perhaps this is where we can start to understand what exactly Web 2.0 is, and what lies beyond it.
Friday, November 16, 2007
Semantic Web: A McCool Way of Explaining It
Reason? Knowledge representation is a technique with mathematical roots in the work of Edgar Codd, widely known as the one whose original paper using set theory and predicate calculus led to the relational database revolution in the 1980s. Knowledge representation uses the fundamental mathematics of Codd's theory to translate information, which humans represent with natural language, into sets of tables that use well-defined schemas to define what can be entered in the rows and columns.
The problem is that this creates a fundamental barrier, in terms of richness of representation as well as creation and maintenance, compared to the written language that people use. Logic, which forms the basis of OWL, suffers from an inability to represent exceptions to rules and the contexts in which they're valid.
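To make the point about exceptions concrete, here is a toy sketch in Python. It is not taken from McCool's article, and every name in it (the `IS_A` table, the `flies_*` functions, the birds themselves) is made up for illustration. It only shows that a plain rule cannot carry its own exceptions; someone has to list and maintain them by hand.

```python
# Toy knowledge base; every name here is hypothetical, for illustration only.
IS_A = {"penguin": "bird", "sparrow": "bird"}   # subclass -> superclass
FLIES = {"bird"}                                # rule: every bird flies
CANNOT_FLY = {"penguin"}                        # exceptions, listed by hand


def ancestors(kind):
    """Return the kind itself followed by all of its superclasses."""
    chain = [kind]
    while kind in IS_A:
        kind = IS_A[kind]
        chain.append(kind)
    return chain


def flies_by_rule(kind):
    """Pure rule-based inference: there is no way to express an exception."""
    return any(k in FLIES for k in ancestors(kind))


def flies_with_exceptions(kind):
    """Same rule, but the exceptions must be enumerated and maintained."""
    if any(k in CANNOT_FLY for k in ancestors(kind)):
        return False
    return flies_by_rule(kind)
```

With the rule alone, `flies_by_rule("penguin")` says a penguin flies. The second function gets it right, but only because a person added the penguin to the exception list.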
Databases are deployed only by corporations whose information-management needs require them or by hobbyists who believe they can make some money from creating and sharing their databases. Because information theory removes nearly all context from information, both knowledge representation and relational databases represent only facts. Complex relationships, exceptions to rules, and ideas that resist simplistic classifications pose significant design challenges to information bases. Adding semantics only increases the burden exponentially.
Because it's a complex format that requires users to sacrifice expressiveness and pay enormous costs in translation and maintenance, McCool believes the Semantic Web will not achieve widespread support. Never? Not until another Edgar Codd comes along. So we wait.
Wednesday, November 14, 2007
The Postmodern Librarian?
According to Joint, the idea of the postmodern digital library is clearly very different from the interim digital library. In the summer of 2006, at a workshop at the eLit conference in Loughborough on the cultural impact of mobile communication technologies, there emerged the Five Theses of Loughborough. Here they are:
(1) There are no traditional information objects on the internet with determinate formats or determinate qualities: the only information object and information format on the internet is "ephemera"
(2) The only map of the internet is the internet itself, it cannot be described
(3) A hypertext collection cannot be selectively collected because each information object is infinite and infinity cannot be contained
(4) The problem of digital preservation is like climate change; it is man-made and irreversible, and means that much digital data is ephemeral; but unlike climate change, it is not necessarily catastrophic
(5) Thus, there is no such thing as a traditional library in a postmodern world. Postmodern information sets are just as accessible as traditional libraries: there are no formats, no descriptions, no hope of collection management, no realistic possibility of preservation. And they work fine.
Monday, November 12, 2007
New York City In a Semantic Web
We fit each car in New York City with a device that lets a reverse geographical position system read its movements. Suppose, in addition, that another machine can predict the weather or some other phenomenon that impacts traffic. Assume that a third kind of device has the public transport timetables. Then, data from a collaborative knowledge picture of these machines can be used to advise on the best means of transportation for reaching a certain destination within the next few hours. The computer systems doing the calculations required for the traffic advisory are likely to be controlled by different bodies, such as the city authority or the national weather service. Therefore, there must be a way for software agents to take the information from the machine where it resides and process it further into a form that a software agent of the final user can use to query the dataset.
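A minimal sketch of the scenario above, in Python. Everything here is an assumption made for illustration: the three inputs stand in for the three separately controlled feeds (vehicle positions, weather, timetables), and the 25% rain slowdown is an invented figure, not a measured one.

```python
from dataclasses import dataclass


@dataclass
class Advice:
    mode: str        # "car" or "transit"
    minutes: float   # estimated door-to-door time


def advise(road_speed_kmh, rain_expected, next_train_in_min,
           train_ride_min, distance_km):
    """Combine three independent feeds into one travel recommendation."""
    # Feed 1 (hypothetical): average road speed derived from vehicle positions.
    # Feed 2 (hypothetical): forecast; assume rain cuts road speed by 25%.
    effective_speed = road_speed_kmh * (0.75 if rain_expected else 1.0)
    drive_min = distance_km * 60 / effective_speed
    # Feed 3 (hypothetical): transit timetable (wait time plus ride time).
    transit_min = next_train_in_min + train_ride_min
    if drive_min <= transit_min:
        return Advice("car", drive_min)
    return Advice("transit", transit_min)
```

The hard part the post describes is not this arithmetic. It is getting three different organizations to publish their data in a form that the end user's agent can combine at all.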
Wednesday, November 07, 2007
Genre Searching
In genre analysis, three steps must be taken:
(1) Identify - The core genre repertoire of the work domain
(2) Develop - A standard taxonomy to represent it
(3) Develop - Operational definitions of the genre classes in the taxonomy, including identifying features in terms of form, function and content to facilitate manual and automatic genre classification.
Throughout the entire presentation, my mind kept returning to the question: is this not another specialized form of social searching? A tailored search engine that narrows its search to a specific genre? Although the two are entirely different things, I keep thinking that creating your own search engine is certainly much easier.
Simple Knowledge Organization System (SKOS) & Librarians
It's interesting that the very essence of librarianship and cataloging will play a vital role in the upcoming version of the Web. It's hard to fathom how this works: how can MARC records and the DDC have anything to do with the intelligent agents which form the layers of architecture of the Semantic Web and Web 3.0? The answer: metadata.
And even more importantly: the messiness and disorganization of the Web will require information professionals with the techniques and methods to reorganize everything coherently. Web 1.0 and 2.0 were about creating -- but the Semantic Web will be about orderliness and regulating. Take a closer look at Miles & Perez-Aguera's article -- it's well worth a read. SKOS is designed to represent controlled structured vocabularies of the following kinds:
(1) Thesauri - Broadly conforming to the ISO 2788:1986 guidelines, such as the UK Archival Thesaurus (UKAT, 2004), the General Multilingual Environmental Thesaurus (GEMET), and the Art and Architecture Thesaurus
(2) Classification Schemes - Such as the Dewey Decimal Classification (DDC), the Universal Decimal Classification (UDC), and the Bliss Classification (BC2)
(3) Subject Heading Systems - The Library of Congress Subject Headings (LCSH) and the Medical Subject Headings (MeSH)
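What all three kinds of vocabulary share is a set of concepts linked by broader and narrower relations, which SKOS expresses with properties such as skos:broader and skos:narrower. Here is a rough Python sketch of that idea using a plain dictionary rather than real RDF. The four concepts and the function names are made up for illustration and are not drawn from any actual thesaurus.

```python
# Hypothetical mini-thesaurus: each concept maps to its broader concept,
# mirroring what skos:broader would state in a real SKOS vocabulary.
BROADER = {
    "Siamese cats": "Cats",
    "Cats": "Mammals",
    "Mammals": "Animals",
}


def broader_transitive(concept):
    """Walk up the hierarchy and return every broader concept in order."""
    chain = []
    while concept in BROADER:
        concept = BROADER[concept]
        chain.append(concept)
    return chain


def narrower(concept):
    """Return the concepts whose immediate broader concept is this one."""
    return sorted(c for c, b in BROADER.items() if b == concept)
```

A search tool could use `broader_transitive("Siamese cats")` to widen a query that returns too little, which is the same move a cataloguer makes when stepping up a subject hierarchy.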
Friday, November 02, 2007
New Librarians, New Possibilities?
University of Guelph Chief Librarian Michael Ridley similarly sees a future where the university library serves as an “academic town square,” a place that "brings people and ideas together in an ever-bigger and more diffuse campus. Services in the future will include concerts, lectures, art shows – anything that trumpets the joy of learning."
Is this the future of libraries? Yes. That's where we're heading -- that's where we'll end up. It is a matter of time. Change is difficult, particularly in larger academic institutions where bureaucracy and politics play an essential role in all aspects of operations. There is great skepticism towards Jeff Trzeciak's drastic changes to McMaster Library -- he's either a pioneer if he succeeds, or an opportunist if he fails. A lot is riding on Jeff's shoulders.
Tuesday, October 30, 2007
Introducing Semantic Searching
(1) Ontological Semantics (OntoSem) - A formal and comprehensive linguistic theory of meaning in natural language. As such, it bears significantly on philosophy of language, mathematical logic, and cognitive science
(2) Query Detection and Extraction (QDEX) - A system invented to bypass the limitations of the inverted index approach when dealing with semantically rich data
(3) SemanticRank algorithm - Deploys a collection of methods to score and rank paragraphs that are retrieved from the QDEX system for a given query. The process includes query analysis, best sentence analysis, and other pertinent operations
(4) Dialogue - In order to establish a human-like dialogue with the user, the dialogue algorithm's goal is to convert the search engine's role into a computerized assistant with advanced communication skills while utilizing the largest amount of information resources in the world.
(5) Search mission - Google's mission was to organize the world's information and make it universally accessible and useful. hakia's mission is to search for better search.
Monday, October 22, 2007
A Definition of the Semantic Web
Today's web pages are designed for human use, and human interpretation is required to understand the content. Because the content is not machine-interpretable, any type of automation is difficult. The Semantic Web augments today's web to eliminate the need for human reasoning in determining the meaning of web-based data. The Semantic Web is based on the concept that documents can be annotated in such a way that their semantic content will be optimally accessible and comprehensible to automated software agents and other computerized tools that function without human guidance. Thus, the Semantic Web might have a more significant impact in integrating resources that are not in a traditional catalog system than in changing bibliographic databases.
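To give a feel for what "annotated for software agents" means, here is a small Python sketch. It stores annotations as subject-predicate-object triples, the basic shape of RDF data, and lets a tiny "agent" answer a question without a human reading the page. The identifiers `doc:1` and `person:1` and both function names are invented for this example; `dc:title`, `dc:creator`, and `foaf:name` are borrowed from the Dublin Core and FOAF vocabularies.

```python
# Annotations as (subject, predicate, object) triples.
# The identifiers doc:1 and person:1 are hypothetical.
TRIPLES = [
    ("doc:1", "dc:title", "Weaving the Web"),
    ("doc:1", "dc:creator", "person:1"),
    ("person:1", "foaf:name", "Tim Berners-Lee"),
]


def match(s=None, p=None, o=None):
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [t for t in TRIPLES
            if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])]


def creator_names(title):
    """Follow title -> document -> creator -> name with no human in the loop."""
    names = []
    for doc, _, _ in match(p="dc:title", o=title):
        for _, _, person in match(s=doc, p="dc:creator"):
            names += [name for _, _, name in match(s=person, p="foaf:name")]
    return names
```

Calling `creator_names("Weaving the Web")` follows the links from title to document to person to name, which is the kind of step a human reader performs without thinking and a machine cannot perform on unannotated HTML.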
Thursday, October 11, 2007
Three Perspectives of the Semantic Web
(1) A Universal Library - Readily accessed and used by humans in a variety of information uses and contexts. This perspective arose as a reaction to the disorder of the Web, which had no ordered categorization until search engines came along. Metadata, cataloguing, and schemas were seen as the answer.
(2) Computational Agents - Completing sophisticated activities on behalf of their human counterparts. Tim Berners-Lee envisioned an infrastructure for knowledge acquisition, representation, and utilization across diverse use contexts. This global knowledge base will be used by personal agents to collect and reason about information, assisting people with tasks common to everyday life.
(3) Federated Data and Knowledge Base - In this vision, federated components are developed with some knowledge of one another or at least with a shared anticipation of the type of applications that will use the data. In essence, this Web encompasses languages used for syntactically sharing data rather than having to write specialized converters for each pair of languages.
Wednesday, October 10, 2007
Knowledge Management 3.0
Stage 1 - Internet of Intellectual Capital - this initial stage of KM was driven primarily by IT. In this stage, organizations realized that their stock in trade was information and knowledge -- yet the left hand rarely knew what the right hand did. When the Internet emerged, KM was about how to deploy the new technology to accomplish those goals.
Stage 2 - Human & Cultural dimensions - the hallmark phrase is communities of practice. KM during this stage was about knowledge creation as well as knowledge sharing and communication.
Stage 3 - Content & Retrievability - consists of structuring content and assigning descriptors (index terms). In content management and taxonomies, KM is about the arrangement, description, and structure of that content. Interestingly, taxonomies are perceived by the KM community as emanating from natural scientists, when in fact they are the domain of librarians and information scientists. To take this one step further, the Semantic Web is also built on taxonomies and ontologies. Anyone see a trend? Perhaps a convergence?
Monday, October 08, 2007
When is an Apple, an Apple?
I argue that we can go one step further because with the advent of Web 2.0, social search is actually the closest that we have to gathering input from all of the world’s users. How? Why? Let me explain with an analogy.
It’s not a matter of how, but a matter of when. Web 2.0 is very much like an apple. An apple can be food, a paperweight, a target, or a weapon if needed. It can be whatever you want it to be when you want it to be. The same goes for social searching. It is not search engines.
Del.icio.us is a social bookmarking web service. But it can be a powerful search tool if used properly; essentially, it taps into the social preferences of other users. The same goes for YouTube: it’s a video sharing website, but what’s to say that it can’t be used for searching videos on relevant topics? What’s to say that you can’t search related videos based on videos bookmarked by others? Social search is not based on a program; it is a mindset, a metaphorical sweet fruit, if you will.
In many ways, social searching is not unlike what librarians did (and still do) in the print-based world, where an elegant craft of creativity and perseverance was required to find the right materials and put them into the hands of the patron; the only difference is that the search has become digital.
Friday, October 05, 2007
Youtube University
Wednesday, October 03, 2007
Of Ontologies + Taxonomies
(1) Taxonomies: An Important Part of the Semantic Web - The new Web entails adding an extra layer of infrastructure to the current HTML Web - metadata in the form of vocabularies and the relationships that exist between selected terms will make this possible for machines to understand conceptual relationships as humans do.
(2) Defining Ontologies and Taxonomies - Ontologies and taxonomies are used synonymously -- Computer Scientists refer to hierarchies of structured vocabularies as "ontology" while librarians call them "taxonomy."
(3) Standardized Language and Conceptual Relationships - Both taxonomies and ontologies consist of a structured vocabulary that identifies a single key term to represent a concept that could be described using several words.
(4) Different Points of Emphasis - Computer Science is concerned with how software and associated machines interact with ontologies; librarians are concerned with how patrons retrieve information with the aid of taxonomies. However, they're essentially different sides of the same coin.
(5) Topic Maps As New Web Infrastructure - Topic maps will ultimately point the way to the next stage of the Web's development. They represent a new international standard (ISO 13250). In fact, even the OCLC is looking to topic maps in its Dublin Core Initiative to organize the Web by subject.
Monday, October 01, 2007
Web 3.0 Librarian
It's not unlike the library before Melvil Dewey introduced the idea of organizing and cataloguing books in a classification system. In many ways, we see the parallels here 130 years later. It's not surprising at all to see the OCLC at the forefront in developing Semantic Web technologies. Many of the same techniques of bibliographic control apply to the possibilities of the Semantic Web. It was the computer scientists and computer engineers who had created Web 1.0 and 2.0, but it will ultimately be individuals from library science and information science who will play a prominent role in the evolution of organizing the messiness into a coherent whole for users. Are we saying that Web 2.0 is irrelevant? Of course not. Web 2.0 is an intermediary stage. Folksonomies, social tagging, wikis, blogs, podcasts, mashups, etc -- all of these things are essential basic building blocks to the Semantic Web.