Wednesday, January 02, 2008
11 Ways to the Library of 2012
Don't blink. It's only five years away. Being inundated with the day-to-day duties of working in a large academic library has sometimes removed me from the "larger" picture of what libraries look like to users now, and what they will look like in the future. I've written a great deal about the Semantic Web and Web 2.0; but how do they fit libraries, physically and conceptually? Visions: The Academic Library in 2012 offers a meta-glimpse of how libraries might look in 2012. As you'll notice, some of the features are suspiciously Web 2.0 and Library 2.0. Let's take a look, shall we?

Tuesday, December 25, 2007
Happy Holidays and Seasons Greetings
Season's Greetings to all. It has indeed been a wonderful holiday, as The Google Scholar has published an important piece for the Semantic Web literature. He's done it again, writing a concise and cogent piece on the key elements that differentiate Web 3.0 from Web 2.0. In other news, a reader recently made a comment on a previous entry which I found very interesting. Here's what he said:

I (as a librarian) found the article and the whole topic very important. I especially enjoyed the conclusion. You wrote that "Web 3.0 is about bringing the miscellaneous back together meaningfully after it's been fragmented into a billion pieces." I was wondering if, in your opinion, this means that the semantic web may turn a folksonomy into some kind of structured taxonomy. We all know the advantages and disadvantages of a folksonomy. Is it possible for Web 3.0 to minimize those disadvantages and maybe even make good use of them?
(3) Such a use of folksonomies could help overcome some of the inherent difficulties of ontology construction, thus potentially bridging Web 2.0 and the Semantic Web. Treating a folksonomy's collective categorization scheme as an initial knowledge base, the ontology author could take the most common tags in the tagging distribution as candidate concepts, relations, or instances. Folksonomies do not a Semantic Web make -- but it's a good start.
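As a rough illustration of that last idea, here is a minimal Python sketch (the tag data is invented) of how the most common tags in a folksonomy might be promoted to candidate concepts for an ontology draft:

```python
from collections import Counter

# Hypothetical folksonomy: each resource carries user-assigned tags.
tagged_resources = {
    "doc1": ["semantic-web", "rdf", "web3.0"],
    "doc2": ["rdf", "ontology", "semantic-web"],
    "doc3": ["web2.0", "folksonomy", "semantic-web"],
}

# Flatten the tagging distribution and count tag frequency.
distribution = Counter(tag for tags in tagged_resources.values() for tag in tags)

# Promote the most common tags to candidate ontology concepts.
candidate_concepts = [tag for tag, count in distribution.most_common(3)]
print(candidate_concepts)  # e.g. ['semantic-web', 'rdf', ...]
```

The point is not that counting tags yields an ontology, but that the collective distribution gives an ontology author a defensible starting vocabulary.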
Thursday, December 20, 2007
Information Science As Web 3.0?
In the early and mid-1950s, scientists, engineers, librarians, and entrepreneurs started working enthusiastically on the problem and solution defined by Vannevar Bush. There were heated debates about the "best" solution, technique, or system. What ultimately ensued became information retrieval (IR), a major subfield of Information Science. In his article Information Science, Tefko Saracevic makes a bold prediction:

Fame awaits the researcher(s) who devises a formal theoretical work, bolstered by experimental evidence, that connects the two largely separated clusters -- that is, connecting the basic phenomena (information seeking behaviour) with the retrieval world (information retrieval). A best seller awaits the author who produces an integrative text in information science. Information Science will not become a full-fledged discipline until the two ends are connected successfully.
As Saracevic puts it, IR is one of the most widely spread applications of any information system worldwide. So how come Information Science has yet to produce a Nobel Prize winner?
As I've opined before, LIS will play a prominent role in the next stage of the Web. So who's it gonna be?
Tuesday, December 18, 2007
The Semantic Solution - A Browser?
Semantic Web browser—an end user application that automatically locates metadata and assembles point-and-click interfaces from a combination of relevant information, ontological specifications, and presentation knowledge, all described in RDF and retrieved dynamically from the Semantic Web. With such a tool, naïve users can begin to discover, explore, and utilize Semantic Web data and services. Because data and services are accessed directly through a standalone client and not through a central point of access . . . new content and services can be consumed as soon as they become available. In this way we take advantage of an important sociological force that encourages the production of new Semantic Web content by remaining faithful to the decentralized nature of the Web.
I like this idea of a browser serving as a portal. Getting everyone to agree on how to implement W3C standards - RDF, SPARQL, OWL - is unrealistic. Not everyone will accept the extra work for no real, sustainable incentive. That is perhaps why companies and private investors currently show little real interest in channelling funding to Semantic Web research. However, the Semantic Web browser is one method to combat the malaise. In many ways, it resembles the birth of Web 1.0, before Yahoo!'s remarkable directory and the search engines. All we need is one Jim Clark and one Marc Andreessen, I guess.
(Maybe a librarian and an information scientist, or two?)
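To make the idea concrete, here is a minimal sketch of the dynamic-discovery step such a browser would perform, using Python's rdflib (the URL is a placeholder; a real client would also fetch ontological and presentation knowledge):

```python
from rdflib import Graph

# Hypothetical RDF document; a Semantic Web browser fetches such data
# dynamically rather than relying on a central point of access.
g = Graph()
g.parse("http://example.org/people/alice.rdf")  # placeholder URL

# Assemble a crude "point-and-click" view: list every property and
# value the metadata offers for each resource it describes.
for subject, predicate, obj in g:
    print(f"{subject} -> {predicate} -> {obj}")
```

Because each document is fetched and parsed on demand, new content becomes browsable the moment it is published, which is exactly the decentralizing force the quote describes.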
Friday, December 14, 2007
"Web 3.0" AND OR the "Semantic Web"
Although I have worked in health research centres and medical libraries, I have never worked professionally as a librarian in a health setting. That is why I have great admiration for health librarians such as The Google Scholar, who can multitask, working as a top-notch librarian while at the same time keeping up with cutting-edge technology. The Google Scholar recently made a wonderful entry about Web 3.0 and the Semantic Web:

In medicine, there is virtually no discussion about web 3.0 (see this PubMed search for web 3.0 - zero results), and most of the discussion on the semantic web (see this PubMed search - ~100 results) is from the perspective of biology/bioinformatics.
The dichotomy in the literature is both perplexing and unsurprising. On the one hand, semanticists are looking at a new, intelligent web that has 'added meaning' to documents, plus machine interoperability. On the other, web 3.0 advocates use '3.0' to be trendy and hip, or to market themselves or their websites. That said, I prefer the web 3.0 label to the semantic web because it follows web 2.0 and suggests continuity.
It is important that medical librarians -- all librarians, for that matter -- join in (and even lead) the discourse, particularly since the Semantic Web and Web 3.0 will be based heavily on the principles of knowledge and information organization. Whereas Web 1.0 and 2.0 could not recognize that Acetaminophen, Paracetamol, and Tylenol name the same drug -- Web 3.0 will.
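As a toy sketch of how that equivalence could be expressed (in Python with rdflib, using an invented namespace), Web 3.0-style data would assert the relationship explicitly so that machines can act on it:

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL

# Invented namespace, for illustration only.
DRUG = Namespace("http://example.org/drugs/")

g = Graph()
# Assert that the three names refer to the same compound.
g.add((DRUG.Acetaminophen, OWL.sameAs, DRUG.Paracetamol))
g.add((DRUG.Tylenol, OWL.sameAs, DRUG.Acetaminophen))

# A semantic agent can now treat a search for any one name as a
# search for all of them (transitive closure is left to a reasoner).
for s, p, o in g.triples((None, OWL.sameAs, None)):
    print(s, "is the same as", o)
```

This is precisely the kind of vocabulary work -- naming things and relating names -- that librarians already do.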
Tuesday, December 11, 2007
Google and the End of Web 2.0
Google Scholar recently celebrated its third birthday. Some old friends showed up at the party (the older brother, Google, arrived a bit late, though) -- but overall, it was a fairly quiet evening atop Mountain View. So where are we now with Google Scholar? Has the tool lived up to its early hype? What improvements have been made to Scholar in the past year? In a series of fascinating postings, my colleague, The Google Scholar, made some insightful comments, particularly when he argues:

What Google Scholar has done is bring scholars and academics onto the web for their work in a way that Google alone did not. This has led to a greater use of social software and the rise of Web 2.0. For all its benefits, Web 2.0 has given us extreme info-glut which, in turn, will make Web 3.0 (and the semantic web) necessary.
I agree. Google Scholar (and Google) are very much Web 2.0 products. As I elaborated in a previous entry, AJAX (which is Web 2.0-based) produced many remarkable programs such as Gmail and Google Earth.
Was this destiny? Not really. As Yihong Ding proposes, Web 2.0 did not choose Google; rather, it was Google that decided to follow Web 2.0. Had Yahoo! understood the politics of the Web a little earlier, it might have precluded Google (but that's for historians to analyze). Yahoo! realized the potential of Web 2.0 too late; it purchased Flickr without really understanding how to fit it into Yahoo!'s Web 1.0 universe.
Back to Dean's point. Google's strength might ultimately lead to its own demise. The PageRank algorithm might have a drawback similar to Yahoo!'s once-dominant directory. Just as Yahoo! failed to catch up with the explosion of the Web, Google's PageRank will slowly lose its dominance due to the explosion of content caused by Web 2.0. Even with richer semantics available, Google might not be willing to drastically alter its algorithm, since PageRank is Google's bread and butter. That is why Google and Web 2.0 might be feeling the weight of the future falling too heavily on their shoulders.
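For readers unfamiliar with the algorithm under discussion, here is a minimal sketch of PageRank's core idea -- iteratively redistributing each page's score across its outgoing links -- on a toy three-page web (the link graph is invented):

```python
# Toy link graph: each page lists the pages it links to.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
damping = 0.85
ranks = {page: 1 / len(links) for page in links}

# Power iteration: a page's rank is the damped sum of the ranks
# flowing in from every page that links to it.
for _ in range(50):
    new_ranks = {}
    for page in links:
        incoming = sum(ranks[p] / len(out)
                       for p, out in links.items() if page in out)
        new_ranks[page] = (1 - damping) / len(links) + damping * incoming
    ranks = new_ranks

print(ranks)  # "C" accumulates the most rank in this toy graph
```

Notice that the score depends entirely on link structure, not on what the pages mean -- which is exactly the property that richer semantics would pressure Google to rethink.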
Sunday, December 09, 2007
AJAX'ing our way to Web 2.0
Part of my day job entails analyzing technologies and how they can better serve users. But one of the things we seem to forget when promoting Web 2.0 is the flaws it brings with it. Because one of the core technologies of Web 2.0 is AJAX, I've been looking around for a good analysis of it. David Best's Web 2.0: Next Big Thing or Next Big Internet Bubble seems to do the job. AJAX is a core component of Web 2.0, as it introduces an engine that runs on the client side - the Web browser. Certain actions can be carried out in the engine with no data transfer to the server; they run only on the client's computer and are thus quite fast, comparable to desktop applications. In the HTML world of Web 1.0, a Web page has to completely reload after a user action, such as clicking on a link or entering data in a form.
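The contrast is easy to see in sketch form. Below is a deliberately simplified Python illustration (all names invented) of the two interaction models: Web 1.0 rebuilds the whole page per action, while the AJAX engine requests only the data that changed and patches it into page state held on the client:

```python
# Invented stand-ins for a remote server's two response styles.
def fetch_full_page(query):
    # Web 1.0: every action returns an entire new document.
    return f"<html><body>Results for {query}: ...</body></html>"

def fetch_data_only(query):
    # AJAX: the server returns just the delta, as data.
    return {"results": [f"{query} hit {i}" for i in range(3)]}

# Web 1.0 model: the whole page is replaced on each click.
page = fetch_full_page("rdf")
print(len(page), "characters re-rendered")

# AJAX model: the client-side engine keeps the page state and merges
# in deltas, so the action feels as fast as a desktop application.
page_state = {"title": "Search", "results": []}
page_state.update(fetch_data_only("rdf"))
print(page_state)
```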
Thursday, December 06, 2007
Are You Ready For Library 3.0?
Are you ready for Library 2.0? We might just be too late, because according to some observers Library 3.0 is just around the corner. How can libraries learn from other service industries? How will librarians keep up with subject-specific skills (evidence-based medicine, law, problem-based learning)? Are librarian skills out of alignment with these trends? As Saw and Todd point out in Library 3.0: Where Are Our Skills?, the future of academic libraries will be a digital one, where the successful librarian will be flexible, adaptable, and multi-skilled in order to survive in an environment of constant and rapid change. Drivers for change will require this new generation of librarians to navigate not only new technologies but also their users' behaviour, and ultimately themselves (Generations X and Y). So what are some attributes of Librarian 3.0?

It is not the strongest of the species that survive, nor the most intelligent, but the one most responsive to change.
Tuesday, December 04, 2007
I See No Forests But the Trees . . .
The transition from Web 1.0 to Web 2.0 was not supervised. The W3C never launched a special group to plot out Web 2.0; neither did Tim O'Reilly, though he was one of the most insightful observers to catch and name the transition, and one of its most anxious advocates. In comparison, the W3C did launch a special group on the Semantic Web, which engaged hundreds of brilliant web researchers all over the world. The progress of the WWW in the past several years, however, shows that the effort lacking supervision (Web 2.0) advanced faster than the one with lots of supervision (the Semantic Web). This phenomenon suggests the existence of laws of web evolution that operate independently of individual willingness.

Even Tim O'Reilly pointed out that Web 2.0 largely came out of a conference, when exhausted software engineers and computer programmers from the dot-com disaster saw common trends happening on the Web. Nothing is scripted in Web 2.0. Perhaps that's why there can never be a definitive agreement on what it constitutes. As I give instructional sessions and presentations on Web 2.0 tools, sometimes I wonder: what will wikis, blogs, social bookmarking, and RSS feeds look like two years from now? Will they still be relevant? Will they transmute into something entirely different? Or will we continue on with the status quo?
Is Web 2.0 merely an interim to the next planned stage of the Web? Are we seeing trees, but missing the forest?
Friday, November 30, 2007
Digital Libraries in the Semantic Age
Brian Matthews of CCLRC Appleton Laboratory offers some interesting insights in Semantic Web Technologies. In particular, he argues that libraries are increasingly converting themselves into digital libraries. A key aspect of the digital library is the provision of shared catalogues which can be published and browsed. This requires the use of common metadata to describe the fields of the catalogue (such as author, title, date, and publisher), and common controlled vocabularies to allow subject identifiers to be assigned to publications. As Matthews proposes, by publishing controlled vocabularies in one place, where they can be accessed by all users across the Web, library catalogues can use the same Web-accessible vocabularies for cataloguing, marking up items with the most relevant terms for the domain of interest. Search engines can then use the same vocabularies in their searches to ensure that the most relevant items of information are returned.
The Semantic Web opens up the possibility of taking such an approach. It offers open standards that can enable vendor-neutral solutions, with useful flexibility (allowing structured and semi-structured data, formal and informal descriptions, and an open and extensible architecture), and it helps to support decentralized solutions where appropriate. In essence, RDF can serve as this common interchange format for catalogue metadata and shared vocabularies, which can then be used by all libraries and search engines across the Web.
But to use the Semantic Web to best effect, metadata needs to be published in RDF formats. There are several initiatives involved in defining metadata standards, and some of them are well known to librarians (a brief Dublin Core sketch follows the list):
(1) Dublin Core Metadata Initiative
(2) MARC
(3) ONIX
(4) PRISM
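As a minimal illustration of the first of these, here is how a single catalogue record might be expressed as RDF using the Dublin Core vocabulary via Python's rdflib (the item URI is a placeholder):

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DC

g = Graph()
book = URIRef("http://example.org/catalogue/item/123")  # placeholder identifier

# Describe the item with shared, Web-accessible Dublin Core terms.
g.add((book, DC.title, Literal("Weaving the Web")))
g.add((book, DC.creator, Literal("Tim Berners-Lee")))
g.add((book, DC.date, Literal("1999")))
g.add((book, DC.publisher, Literal("HarperCollins")))

# Serialize for exchange: any library or search engine that
# understands Dublin Core can consume this record.
print(g.serialize(format="turtle"))
```

Because author, title, date, and publisher are drawn from one shared vocabulary, two catalogues that have never coordinated can still merge and search each other's records.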
Wednesday, November 21, 2007
Postmodern Librarian - Part Two
To continue where we left off. True, Digital Libraries and the Future of the Library Profession intimates that libraries, and perhaps librarianship, have entered the postmodern age. But Joint isn't the first to author such an argument; many others have argued likewise. In fact, I have written about it before, too. But I believe stopping at the modernist-postmodernist dichotomy misses the point.

In my opinion, perhaps this is where Web 2.0 comes in. Although the postmodern information order is not clear to us, it seems to be the dynamic behind Web 2.0, in which interactive tools such as blogs, wikis, and RSS facilitate social networking and the anarchic storage and unrestrained distribution of content. According to Joint, much of our professional effort to impose a realist-modernist model on our libraries will fail. The old LIS model needs to be re-theorized, just as Newtonian physics had to evolve into quantum theory in recognition of the fact that super-small particles simply were not physically located where Newtonian physics said they should be. In this light, perhaps this is where we can start to understand what exactly Web 2.0 is. And beyond.
Friday, November 16, 2007
Semantic Web: A McCool Way of Explaining It
Yahoo's Rob McCool argues in Rethinking the Semantic Web, Part 1 that the Semantic Web will never happen. Why? Because the Semantic Web has three fundamental parts, and they just don't fit together based on current technologies. Here is what we have. The foundation is the set of data models and formats that provide semantics to applications that use them (RDF, RDF Schema, OWL). The second layer is composed of services - purely machine-accessible programs that answer Web requests and perform actions in response. At the top are the intelligent agents, or applications.

The reason? Knowledge representation is a technique with mathematical roots in the work of Edgar Codd, widely known as the one whose original paper using set theory and predicate calculus led to the relational database revolution in the 1980s. Knowledge representation uses the fundamental mathematics of Codd's theory to translate information, which humans represent with natural language, into sets of tables that use well-defined schema to define what can be entered in the rows and columns.
The problem is that this creates a fundamental barrier, in terms of richness of representation as well as creation and maintenance, compared to the written language that people use. Logic, which forms the basis of OWL, suffers from an inability to represent exceptions to rules and the contexts in which they're valid.
Databases are deployed only by corporations whose information-management needs require them or by hobbyists who believe they can make some money from creating and sharing their databases. Because information theory removes nearly all context from information, both knowledge representation and relational databases represent only facts. Complex relationships, exceptions to rules, and ideas that resist simplistic classifications pose significant design challenges to information bases. Adding semantics only increases the burden exponentially.
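McCool's point about exceptions and context is easy to illustrate. In the invented example below, a nuanced sentence collapses into a bare fact once forced through a fixed schema, and the exception has nowhere to live:

```python
# Natural language: "Aspirin usually relieves headaches, except in
# patients already taking anticoagulants, for whom it may be unsafe."

# Forced into a well-defined schema, only the bare fact survives:
drug_treats = [
    {"drug": "Aspirin", "condition": "headache"},
]
# The "usually", the exception, and the clinical context are all lost;
# adding a column for every possible qualifier is exactly where the
# representational burden starts to explode.
```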
Because it's a complex format that requires users to sacrifice expressivity and pay enormous costs in translation and maintenance, McCool believes the Semantic Web will not achieve widespread support. Never? Not until another Edgar Codd comes along. So we wait.
Wednesday, November 14, 2007
The Postmodern Librarian?
Are we in the postmodern era? Nicholas Joint's Digital Libraries and the Future of the Library Profession seems to think so. In it, he argues that unique contemporary cultural shifts are leading to a new form of librarianship that can be characterized as "postmodern" in nature, and that this form of professional specialism will be increasingly influential in the decades to come.

According to Joint, the idea of the postmodern digital library is clearly very different from the interim digital library. In the summer of 2006, out of a workshop at the eLit conference in Loughborough on the cultural impact of mobile communication technologies, there emerged the Five Theses of Loughborough. Here they are:
(1) There are no traditional information objects on the internet with determinate formats or determinate qualities: the only information object and information format on the internet is "ephemera"
(2) The only map of the internet is the internet itself; it cannot be described
(3) A hypertext collection cannot be selectively collected because each information object is infinite and infinity cannot be contained
(4) The problem of digital preservation is like climate change; it is man-made and irreversible, and means that much digital data is ephemeral; but unlike climate change, it is not necessarily catastrophic
(5) Thus, there is no such thing as a traditional library in a postmodern world. Postmodern information sets are just as accessible as traditional libraries: there are no formats, no descriptions, no hope of collection management, no realistic possibility of preservation. And they work fine.
Monday, November 12, 2007
New York City In a Semantic Web
Tim Krichel, in The Semantic Web and an Introduction to Resource Description Framework, makes a very astute analogy for understanding the technology behind the Semantic Web, particularly the nuances of XML and RDF, where the goal is to move away from the present Web - where pages are essentially constructed for human consumption - to a Web where more information can be understood and processed by machines. The analogy goes like this:

We fit each car in New York City with a device that lets a geographical positioning system read its movements. Suppose, in addition, that another machine can predict the weather or some other phenomenon that impacts traffic. Assume that a third kind of device has the public transport timetables. Then a collaborative knowledge picture drawn from these machines can be used to advise on the best means of transportation for reaching a certain destination within the next few hours. The computer systems doing the calculations required for the traffic advisory are likely to be controlled by different bodies, such as the city authority or the national weather service. Therefore, there must be a way for software agents to process the information on the machines where it resides, and to pass it along in a form that a software agent acting for the final user can query.
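RDF's contribution to this scenario is that data published by independently controlled sources can be merged without prior coordination, so long as they share identifiers. A minimal rdflib sketch (all namespaces and data invented):

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/nyc/")  # invented vocabulary

# Each graph is published by a different body.
traffic, weather = Graph(), Graph()
traffic.add((EX.FifthAvenue, EX.congestion, Literal("heavy")))
weather.add((EX.Manhattan, EX.forecast, Literal("snow")))

# Merging RDF graphs is just a union of triples; no schema
# negotiation between the two publishers is needed.
advisory = traffic + weather

# A user's software agent can now query the combined picture.
for s, p, o in advisory:
    print(s, p, o)
```

This schema-free merging is what lets the city authority and the weather service stay independent while still feeding one traffic advisory.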
Wednesday, November 07, 2007
Genre Searching
At today's SLAIS colloquium, Dr. Luanne Freund gave a presentation on Genre Searching: A Pragmatic Approach to Information Retrieval. Freund argues for taking a pragmatic approach to genre searching and genre classification, noting two perspectives on pragmatics: socio-pragmatic and cognitive-pragmatic. In a case study of a high-tech firm, Freund and her colleagues built a unique search engine called X-Cite, which culls together tagged documents from the corporate intranet (anything from FAQs to specialized manuals). By ranking documents based on title, abstract, and keywords, the algorithm cuts down on the ambiguity and guesswork of searching. Using a software-engineering workplace domain as its starting point, Freund believes that genre searching has the potential to make a significant contribution to the effectiveness of workplace search systems by incorporating genre weights into the ranking algorithm (a toy sketch of that idea follows the list below).

In genre analysis, three steps must be taken:
(1) Identify - The core genre repertoire of the work domain
(2) Develop - A standard taxonomy to represent it
(3) Develop - Operational definitions of the genre classes in the taxonomy, including identifying features in terms of form, function and content to facilitate manual and automatic genre classification.
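Here is a deliberately simple sketch of what folding genre weights into a ranking function could look like. The scores, weights, and genres are all invented for illustration; this is not Freund's actual algorithm:

```python
# Invented genre weights for a software-engineering workplace:
# say an FAQ deserves extra weight for troubleshooting queries.
genre_weights = {"faq": 1.5, "manual": 1.2, "memo": 0.8}

documents = [
    {"title": "Installing the build server", "genre": "manual", "base_score": 0.61},
    {"title": "Build server FAQ",            "genre": "faq",    "base_score": 0.55},
    {"title": "Q3 planning memo",            "genre": "memo",   "base_score": 0.70},
]

# Final score = text-match relevance scaled by the genre's weight,
# so a slightly weaker match in a favoured genre can rise to the top.
ranked = sorted(documents,
                key=lambda d: d["base_score"] * genre_weights[d["genre"]],
                reverse=True)
for doc in ranked:
    print(doc["title"])
```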
Throughout the entire presentation, my mind kept returning to the question: is this not another specialized form of social searching? A tailored search engine which narrows its search to a specific genre? Although the two are entirely different things, I keep thinking that creating your own search engine is certainly much easier.