Allan's Library

Thursday, March 27, 2008

The Social Web Into the Semantic Web

"What can happen if we combine the best ideas from the Social Web and the Semantic Web?" - Tom Gruber

In other words, can we channel folksonomies, tagging, and user-created knowledge into one coherent, structured Web? A Semantic Web? Tom Gruber seems to think so. In Collective Knowledge Systems, he proposes that the Semantic Web vision points to a representation of the entity itself - for example, a city - rather than its surface manifestation. This speaks to one of the problems we've always had in accessing the Web's content: the difficulty of differentiating the city of Paris from the celebrity Paris Hilton when using a search engine.

In many ways, harnessing Web 2.0 technologies and refining them for the Semantic Web has been the subject of a great deal of speculation. How do we move from collected intelligence to collective intelligence? There are three approaches to realizing the Semantic Web. Here they are:

(1) Expose structured data that already underlies unstructured web pages - Site builders would generate unstructured web pages from a database and expose the data using standard formats (think FOAF)

(2) Extract structured data from unstructured user contributions - Manually identify people, companies, and other entities with proper names, products, instances of relations

(3) Capture structured data on the way into the system - A "snap to grid" system that gives the data a structure and helps users enter data within that structure. (Think of automatic spell check.)
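To make the third approach a little more concrete, here is a minimal sketch in Python of what "snap to grid" entry might look like. Everything in it is an assumption for illustration: the `ex:` identifiers, the tiny entity table, and the function names are hypothetical and do not come from Gruber's paper. The point is only that free text such as "Paris" gets resolved to a distinct entity at entry time, rather than being stored as an ambiguous string.

```python
# Hypothetical entity table; the "ex:" identifiers are made up for illustration.
ENTITIES = {
    "ex:Paris_city": {"label": "Paris", "type": "City"},
    "ex:Paris_Hilton": {"label": "Paris Hilton", "type": "Person"},
}


def candidates(text):
    """Return ids of entities whose label starts with the typed text."""
    typed = text.strip().lower()
    return sorted(
        eid for eid, entity in ENTITIES.items()
        if entity["label"].lower().startswith(typed)
    )


def snap_to_grid(text, wanted_type):
    """Resolve free text to exactly one entity of the wanted type, else None."""
    matches = [
        eid for eid in candidates(text)
        if ENTITIES[eid]["type"] == wanted_type
    ]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    print(candidates("paris"))
    print(snap_to_grid("Paris", "City"))
```

Typing "paris" alone matches both entities; asking for a city narrows it to one. A real system would of course draw on a far larger vocabulary than this two-entry table.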

Where do librarians come in? We have always used our training to structure content, package it, and disseminate it to our users. In our article, Dean and I argue that the catalogue is very much an analogy for how the Semantic Web can organize information in a way that the current Web is unable to do. Recent developments in RDA from the library side offer a promising glimpse into the possibilities for Web 3.0. True, we are only surmising. But let's not let that prevent us from creating.

Tuesday, March 25, 2008

Quantum Information Science?

Have you heard of quantum information science? Eventually, it might solve the problems of information-mess and access. Although quantum physics, information theory, and computer science were among the crowning intellectual achievements of the 20th century, they were often framed as separate entities. Currently, a new synthesis of these themes is quietly emerging. The emerging field of quantum information science is offering important insights into fundamental issues at the interface of computation and physical science, and may guide the way to revolutionary technological advances.

John Preskill, Director of the Institute for Quantum Information, proposes in his lecture that quantum bits (“qubits”), the indivisible units of quantum information, will be central for “quantum cryptography,” wherein the privacy of secret information can be founded on principles of fundamental physics. The quantum laws that govern atoms and other tiny objects differ radically from the classical laws that govern our ordinary experience. Physicists are beginning to recognize that we can put the weirdness to work. That is, there are tasks involving the acquisition, transmission, and processing of information that are achievable in principle because Nature is quantum mechanical, but that would be impossible in a "less weird" classical world.

What does this ultimately mean? A “quantum computer” operating on just a few hundred qubits could perform tasks that ordinary digital computers could not possibly emulate. Although constructing practical quantum computers will be tremendously challenging, particularly because quantum computers are far more susceptible to making errors than conventional digital computers, newly developed principles of fault-tolerant quantum computation may enable a properly designed quantum computer with imperfect components to achieve reliability. How long will it take before we achieve quantum computing? Please be patient. These folks are working on it.

Friday, March 21, 2008

Free on CBC

The Canadian Broadcasting Corporation, long known for its traditional family-style programs (Road to Avonlea and Coronation Street) and NHL hockey, is actually making a splash in technology. A huge one at that. It's decided to apply the 1% principle and open up its content for anyone to freely download. That's right. Free.

In doing so, CBC becomes the first major broadcaster in North America to release a high quality, DRM-free copy of a primetime show using BitTorrent technology. On top of that, CBC will also be distributing a version that can be put on iPods. The show, Canada’s Next Great Prime Minister, will be completely free (and legal) for anyone to download, share & burn to the heart’s desire. For many, BitTorrent has meant illegal, downright dirty business. In the future, however, it might actually be a better means of access to information and entertainment. CBC is attempting to prove that there are other means beyond the "box." It's trying to move past physical barriers and into the virtual. Shouldn't libraries be doing the same?

Sunday, March 16, 2008

5 Essences to Librarianship 3.0

What will the future of librarianship look like? Traditional cataloging, collection development, and reference will look very different, even five years from now. Changes are in motion. Don't you get the feeling that things are going to be fast and furious? There seems to be a lot of anxiety and uncertainty among librarians about what the future holds. But change is inevitable in life. From the card catalog to OPACs to the Internet, librarians and information professionals have had to adjust and adapt to new technologies. Unlike other professions that rely on technology, though, ours has always had to catch up rather than take the lead. We might not have that choice in the new Web. Here are 5 opportunities for us to look ahead to.

(1) Resource Description and Access - With the Anglo-American Cataloging Rules 2 (AACR2) making way for its successor, RDA will play an essential role in how information is to be classified and held in libraries and information organizations. However, RDA will move beyond just the physical and include Web resources as well. You may ask, how can we catalog something that changes constantly? That's where the Semantic Web comes in.

(2) Information Architecture - Librarians have always had to organize information. It's their job. As the Web becomes more integrated into their work (as if it weren't already?), librarians will rely ever more on the Web to conduct their work with patrons. Digital outreach is the key to survival. In order to achieve this, building accessible and user-centred websites will be essential.

(3) Virtual Worlds - Everywhere gate counts are going down in libraries. Patrons are frequenting libraries less and less for information seeking, and more for products and spaces. This means that reference librarianship is changing, too. To a certain extent, we've experimented with virtual reference. In the future, we will need to embrace the possibilities of bringing our expertise to the user through other means, whether it's Facebook, MySpace, Second Life, or Meebo. Think beyond the walls.

(4) Open Access - Traditional publishing is on its last legs. Things fall apart; the centre cannot hold. Textbook publishers are churning out new editions of the same text in order to prevent re-selling; journal publishers are forcing the print copies to be sold as a package with their electronic versions. Why? Fear. Publishers are scrambling to stay in business. Open access will open up new opportunities for how students and users buy books. Why not build your own textbook?

(5) "Free-conomics" - Everything that users want will be "free." To understand this principle, just look at the things that you are using without paying. It's based on the 1% principle, where 99% of users get access to the basics of a product while the other 1% pay for the full premium version. The spirit of librarianship has always been about the public good and collaboration. It's only natural that we find ways to apply the 1% principle to its full extent.

Sunday, March 09, 2008

Bill Gates Retires from Microsoft

Recently, Forbes revealed that Bill Gates has slipped to number three on the list of the world's wealthiest people. On top of that, Bill Gates is also stepping back from Microsoft to devote more time to the Bill & Melinda Gates Foundation. But that doesn't mean that Bill left with a whimper. Take a look at this video, particularly his going-away comedy skit. Nice job, Bill. Good-bye, but not farewell.

Friday, March 07, 2008

Librarians and Web 3.0

For better or worse, Web 3.0 is around the corner. Okay, maybe the technology is lagging; but we must admit that the third generation (third decade) Web is coming. In response to a post I made back in September, Paul Miller of Talis left an insightful comment, one which is relevant for today's discussion.

Although I'm slightly surprised at the sector's lack of overt engagement with this obviously synergistic area too, there are certainly examples in which librarians are grasping the Semantic Web and in which Semantic Web developers are recognising the rich potential offered by libraries' structured data...

Ed Summers over at Library of Congress would be one person I'd pick out to mention. Also, the work OCLC and Zepheira are doing on PURL, and our own focus on the Talis Platform within Talis; that's Semantic Web through and through, and we have significant products in the final stages of beta that put semantic technologies such as RDF and SKOS to work in delivering richer, better, more flexible applications to libraries and their users. Things really begin to get interesting, though, when you take the next step from enabling existing product areas with semantic technologies to actually beginning to leverage the resulting connections by joining data up, and reusing those links, inferences and contexts to cross boundaries between libraries, systems, and application areas.

There's also library-directed research at institutes such as DERI here in Europe, and even conferences like the International Conference on Semantic Web and Digital Libraries, which was in India this year.

Finally - for now - there's also a special issue of Library Review in preparation; Digital Libraries and the Semantic Web: context, applications and research, and I'll be speaking on The Semantic Web and libraries - a perfect fit? at the Talis Insight conference in November. It's funny that you mention Jane in your post, because I'll also be doing something for her later in November that encompasses some of these themes...

Sometimes moving forward doesn't necessarily mean progress. Sometimes we need to take one step back before we can move two steps in the right direction. But it appears as if the infrastructure is there for us to move in the direction of Web 3.0. What does this mean for librarians? I suspect it means we should stop the bickering about Web versions, and start reflecting on the reasons why patrons are physically relying on library collections and coming to the libraries for information. Googlization of information has resulted in fears for the future of librarianship. But what are we to do? Standing idly by and playing the trumpets as the ship sinks isn't the right way to take it. What to do? Let's try to move in the right direction.

Saturday, March 01, 2008

The Business of Free-conomics

He's done it again. Fresh off the press is Chris Anderson's "Free" in Wired Magazine. In 2004, Anderson changed the way business and the Web were conducted through his visionary Long Tail. Four years later, Anderson's back with the idea of "free." While the long tail proved the staple for Web 2.0, please put "free" into your lexicon for the upcoming Web 3.0.

Giving away things for free has been around for a long time. Think Gillette. In fact, the open source software movement is not unlike the shareware movement a decade earlier. (Remember that first game of Wolfenstein?) Like the long tail, Anderson synthesizes "Free" according to six principles:

(1) "Freemium" - Another percent principle: the 1% rule. For every user who pays for the premium version of the site, 99 others get the basic free version.

(2) Advertising - What's free? How about content, services, and software, just to name a few. Who's it free to? How about everyone.

(3) Cross-subsidies - It's not piracy even though it appears like piracy. In fact, it's any product that entices you to pay for something else. In the end, everyone willing to pay will eventually pay, one way or another.

(4) Zero Marginal Cost - Anything that can be distributed without an appreciable cost to anyone.

(5) Labour Exchange - The act of using sites and services actually creates something of value, either improving the service itself or creating information that can be useful somewhere else.

(6) Gift Economy - Money isn't everything in the new Web. In the monetary economy, this free-ness looks like madness; but that's only shortsightedness about how to measure the worth of what's created.
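The "freemium" arithmetic from the first principle is easy to work through. Here is a back-of-the-envelope sketch in Python; the user count and the price are hypothetical numbers picked for illustration, not figures from Anderson's article, and the function name is my own.

```python
def freemium_revenue(total_users, paying_share=0.01, price_per_payer=10.0):
    """Split users under the 1% rule and total up what the payers bring in.

    paying_share and price_per_payer are illustrative assumptions.
    """
    payers = round(total_users * paying_share)
    free_users = total_users - payers
    revenue = payers * price_per_payer
    # Revenue spread over every user, free or paying.
    revenue_per_user = revenue / total_users
    return payers, free_users, revenue, revenue_per_user


if __name__ == "__main__":
    payers, free_users, revenue, per_user = freemium_revenue(100_000)
    print(payers, free_users, revenue, per_user)
```

With 100,000 users, 1,000 pay and 99,000 ride free. At the assumed price, the payers together cover ten cents for every user of the service, which only works if serving each additional free user costs close to nothing (the fourth principle).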

Tuesday, February 26, 2008

Collection Management 2.0

Librarianship sometimes feels (and sounds) as if it's in disarray. The library discourse is often fractured and fragmented, with so many different viewpoints. Perhaps this is a result of being in our postmodern information age. Bodi and Maier-O'Shea's The Library of Babel: Making Sense of Collection Management in a Postmodern World asserts that libraries have to invest in and prepare for a digital future while maintaining collections and services based on a predominantly print world.

How is it that we're in a postmodern world of academic library collection management? Collections are no longer limited to a physical collection in one location; rather, they are a mixture of local and remote, paper and electronic. Hence, in their experiments with collection development at two research and liberal arts college libraries, the authors arrive at three principles. We aren't reinventing the wheel here; but sometimes, amidst our heavy work days and busy lives, we forget to step back to reassess how things can be done better. The authors offer an interesting viewpoint in this light:

(1) Break down assessment by subject or smaller sub-topics when necessary

(2) Blend a variety of assessment tools appropriate to the discipline

(3) Match print and electronic collections to departmental learning outcomes through communication with faculty members

Wednesday, February 20, 2008

Top 25 Web 2.0 Tools

Jessica Hupp from College Degree has written some insightful articles about information technology. 25 Useful Social Networking Tools for Librarians might be one of the best. She profiles 25 of the best Web 2.0 tools available that librarians should consider using for their professional work. I'm just going to introduce the list. I encourage you to read her actual entry.

1. Communication - Keep in touch with staff, patrons, and more with these tools

MySpace

Facebook

Ning

Blog

Meebo

LinkedIn

Twitter

2. Distribution - These tools make it easy to share information from anywhere

Flickr

YouTube

TeacherTube

Second Life

Wikipedia

PBwiki

Footnote

Community Walk

SlideShare

Digg

StumbleUpon

Daft Doggy

3. Organization - Keep all of your information handy and accessible with these tools

aNobii

Del.icio.us

Netvibes

Connotea

LibraryThing

lib.rario.us

Thursday, February 14, 2008

The Googling Librarian

An article from the Chronicle of Higher Education popped up which once again highlighted the information needs (or lack thereof) of college students. It has been a recent phenomenon -- this argument and counter-argument over the necessity of libraries and librarians in the face of Google-ization. For every viewpoint that the Internet has replaced the information services of libraries, there is the stance that users are even more confused about information overload and the mess that is the Web.

I tend to agree with what Dennis Dillon says in a new article, Google, Libraries, and Knowledge Management: From the Navajo to the National Security Agency. Libraries and the 'Net are different entities: libraries play the library game, not the information game. Google is the same for everyone. It is not tailored for different user groups, and it does not change as local users' needs shift. Google's very nature is different from that of libraries.

Here's the kicker, folks: We could wake up tomorrow to the news that a banking conglomerate has purchased Google and intends to turn it into a private corporate information tool, and wants to convert the content to French. Although it's just a silly hypothetical situation, Dillon makes a good point that people and organizations such as Google are not playing the same game as libraries.

Perhaps this is what libraries with foresight such as McMaster University Libraries are doing. They're integrating new technologies to supplement and complement existing facilities. Before it's too late. I personally talk a great deal about emergent technologies, particularly Web 3.0 and the Semantic Web, but in the end, I believe that these are mere tools that serve the growing organism of libraries. In the end, interior design is every bit as relevant to how users perceive the physical spaces of the library as Facebook is for increasing outreach to students. But put the two together, and we pack a powerful punch. Dillon leaves us with a fresh yet somewhat disconcerting comment:

Libraries have become so enamoured of technology that we sometimes cannot see what is in front of our faces, which is that there are still people in our buildings and they are there for a reason.

Wednesday, February 06, 2008

The Future of Digital Librarians

My colleague and mentor The Google Scholar discussed a bit about the Semantic Web and Web 2.0. Is it relevant to the profession of librarianship? Absolutely. How do we achieve it? Edie Rasmussen and Youngok Choi released a study in 2006 that surveys the skills that practitioners lack in What is Needed to Educate Future Digital Librarians. In this study, the two authors found that while many librarians are young and fresh out of graduate LIS school, they often lack the skills that are necessary for them to thrive in the increasingly digital world of libraries. LIS curricula are often limited to introductory classification and rudimentary information technology courses. There appears to be a real disjunct between the actual job descriptions that are required for newer positions and the actual skills that librarians receive in LIS school. Rasmussen and Choi's study finds that respondents are often frustrated over the "training gaps" during their studies for the following:

(1) Overall understanding of the complex interplay of software

(2) Lack of vocabulary to communicate to technical staff

(3) Knowledge of Web-related languages and technologies

(4) Web design

(5) Digital imaging and formatting

(6) Digital technology

(7) Programming and scripting languages

(8) XML standards and technologies

(9) Basic systems administration

In my own experience as an information professional, I find that these skills were sorely lacking in my own education. I'm finding that it increasingly falls to my own initiative to get caught up on the literature and the technologies. Who really has time to learn OAI-PMH metadata standards, XML, EAD, and TEI? Many librarians keep abreast of their field -- but on top of their current duties. But the problem remains that LIS schools are not meant to train technicians, even though that is in effect what they're doing - their mandate is to nurture scholars. Which I can understand. Yet, we can't fit a square peg into a round hole. There lies the conundrum: something's got to give. But what? That has remained the intense tension in the field of LIS since its inception. With the advent of the Web and newer technologies, this gap will only widen.

Thursday, January 31, 2008

Web 3.0 as in Automation?

I often wonder what kind of automation will make the Semantic Web possible. I know there needs to be an automated web browser (or something similar), but what would it look like? The solution could look something like Automatic Character Switch (ACtS), which is a strategy and a philosophy rather than a standard, meaning community moderators can independently implement their own ACtS methods. Similar to AJAX, ACtS is invoked only when it is necessary; that is, only when a web space is connected to a community.

So what is ACtS? According to Yihong Ding, ACtS only allows different communities to recognize whatever they can identify from a web space. A web user can set up a local web space that stores his web resources. When he subscribes to a new web community, he uploads his local web space to the site while the site customizes its resources based on the community specifications. ACtS begins with a user's subscribing a web space to a community. The community server then performs a community-sensitive resource identification procedure to categorize (information retrieval) and annotate (semantic annotation) public resources stored in the web space. Thus, the local web space creates a community-specific view over its resources, which composes a community-specific sub-space. But ACtS is only a theory. For it to be realized, two premises need to hold:

(1) A uniform representation - Web spaces similar to what is on Web 1.0. This requires advancement on HTML encoding. In particular, this means independent HTML encoding of individual web resources.

(2) Character recognition and casting technology - A combination of information retrieval and semantic annotation methods.

Wednesday, January 30, 2008

Public Library 2.0?

Much has been discussed about the role of public libraries as they increasingly face budget cuts alongside greater needs for technological innovation. Some have argued that this is natural, as we have entered Library 2.0, which is all about rethinking library services in the light of re-evaluating user needs and the opportunities produced by new technologies. Although there have been great resources written about Library 2.0, there hasn't been one as thorough in its analysis of public libraries as Public Library 2.0: Towards a new mission for public libraries as a "network of community knowledge". Chowdhury, Poulter, and McMenemy propose Public Library 2.0, inspired by Ranganathan's famous five principles. They make great fodder for further discussion, don't they?

(1) Community knowledge is for use - Since the value of a community is the knowledge it possesses, people who leave a community will have memories. Yet, little has been carried out in public libraries to digitize local resources.

(2) Every user should have access to his or her community knowledge - Knowledge is for sharing; community knowledge becomes valuable only when it can be accessed and used by others. Facilitating the creation and wider use of this knowledge should be the new role of public libraries.

(3) All community knowledge should be made available to its users - No community knowledge should be allowed to be wasted. Rather, public libraries should facilitate the creation of such knowledge so that it is recorded and preserved. Nothing should be lost.

(4) Save the time of the user in creating and finding community knowledge - Just like the paper records of past lives, the digital records of current lives are accumulating in an ad hoc manner but in a much greater quantity and variety. Hence, public library staff should fill the role of advisors on local content creation, management, and implementation of controlled description, as well as access schemes.

(5) Local community knowledge grows continually - Because community knowledge creation is a continual process, public libraries should act as local knowledge hubs that use existing standards and technology for digitization, as well as metadata for the management of, and access to, the digitized resources.

Sunday, January 27, 2008

The Semantic Catalogue

It's important that librarians keep at the back of their minds how to integrate the Semantic Web into the catalogue, which is ultimately the bridge that users cross to access the library's resources. But it's easy to forget about it, particularly since many libraries have difficulty keeping up with Web 2.0 technologies. But regardless of how far we've come along, it's necessary to peer into the future and see what kinds of changes we'll need to embrace. It could be ten years down the road before we hit the Semantic Web . . . or five . . . or even less. Take a look at Campbell and Fast's Academic Libraries and the Semantic Web: What the Future May Hold for Research Supporting Library Catalogues. They make an excellent case for integrating existing web resources into a dynamic, information-rich, and user-centred catalogue.

By meshing services such as IMDb, Amazon, and AFI's Catalogue, the authors suggest, academic libraries could use the Semantic Web as a source of rich metadata that can be retrieved and inserted into bibliographic records to enhance the user's information searches and to expand the role of the library catalogue as a research tool rather than a mere locating device. (Something along the lines of the Pipl search engine technology.) In doing so, the cataloguer acts as an information intermediary, using a combination of subject knowledge and information expertise to facilitate the growth of semantically encoded metadata. In a Web 3.0 world, the cataloguer's new responsibilities would include the following:

(1) Locate - RDF-encoded information on specific subjects, scrutinizing its reliability, and assessing its usefulness in meeting cataloguing objectives

(2) Select - RDF resources for the specific item being catalogued

(3) Participate - In markup projects within a specific knowledge domain, thus promoting the growth of open-access domain-specific metadata
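The locate and select steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the `ex:` identifiers, the predicate names, the list of trusted sources, and the function names are all hypothetical, and plain tuples stand in for real RDF. It is not the workflow Campbell and Fast describe in detail, just one way to picture metadata being pulled from outside sources into a bibliographic record.

```python
# Sample (subject, predicate, object, source) statements; identifiers are made up.
TRIPLES = [
    ("ex:film/casablanca", "ex:director", "Michael Curtiz", "ex:source/film-db"),
    ("ex:film/casablanca", "ex:releaseYear", "1942", "ex:source/film-db"),
    ("ex:film/casablanca", "ex:rating", "five stars", "ex:source/unknown-blog"),
    ("ex:film/vertigo", "ex:director", "Alfred Hitchcock", "ex:source/film-db"),
]

# Sources the cataloguer has judged reliable (an assumption for this sketch).
TRUSTED_SOURCES = {"ex:source/film-db"}


def locate(subject):
    """Locate: find statements about the subject that come from trusted sources."""
    return [t for t in TRIPLES if t[0] == subject and t[3] in TRUSTED_SOURCES]


def select(triples, wanted_predicates):
    """Select: keep only the statements useful for the item being catalogued."""
    return [t for t in triples if t[1] in wanted_predicates]


def enrich(record, subject, wanted_predicates):
    """Return a copy of the record with the selected metadata added."""
    enriched = dict(record)
    for _, predicate, obj, _ in select(locate(subject), wanted_predicates):
        # Never overwrite what the cataloguer already entered.
        enriched.setdefault(predicate, obj)
    return enriched


if __name__ == "__main__":
    record = {"title": "Casablanca"}
    print(enrich(record, "ex:film/casablanca", {"ex:director", "ex:releaseYear"}))
```

The untrusted rating never reaches the record, which is the cataloguer's "scrutinizing its reliability" step in miniature.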

Thursday, January 24, 2008

Google Scholar, Windows Live Academic Search, and LIS 2.0

That School of Information and Library Science at the University of North Carolina at Chapel Hill sure churns out some great theses. The latest one, Josiah Drewey's Google Scholar, Windows Live Academic Search, and Beyond: A Study of New Tools and Changing Habits in ARL Libraries, offers remarkable insight into these two academic search engines. Little has been written about Windows Live Academic Search, so much so that it appears most people have forgotten about it. (Including its own creators.) Drewey's paper reveals that such is not the case. It's worth a read. Here are my favourite points that Drewey makes about GS and WLAS. I'll share them with you all, as they deserve some attention here:

(1) Citation Ranking - Search results are largely influenced by citation counts generated by Google's link-analysis, which means that users see the most highly cited (and therefore, the most influential) articles

(2) Citation Linking - GS rivals Web of Science and Scopus with its ability to link to each article through a "cited by" feature that allows users to see which other authors have cited that particular article. GS is superior in this aspect as it stretches into the Humanities as well.

(3) Versioning - GS compiles each different version of a particular article or other work in one place. Different versions can come from publisher's databases, preprint repositories or even faculty homepages.

(4) Open Access - GS increasingly brings previously unknown or unpublicized content to users.

(5) Ability to link to libraries - GS has the ability to link to content already paid for by libraries. Thus, search results from GS can lead directly to the libraries' databases.

(6) Federated Search Engine - Instead of searching many databases as a query is made, GS' resources are compiled prior to the search and return very quickly.

In contrast, Drewey offers some great insights into Windows Live Academic Search. Here are the main strengths of WLAS:

(1) Better interface - WLAS uses a "preview pane" to display initial search results, in which the user can mouse over a citation to show the abstract in another pane to the right, whereas GS is inflexible

(2) Names of authors are hyperlinked - Search results take the user to other works by each author

(3) Citations Export - Although GS allows this, WLAS makes the options to export to BibTeX, RefWorks, and EndNote much more visible

(4) User-friendly - In many ways, WLAS offers more features tailored for users. Not only does it offer RSS feeds, it enables users to store their preferences and save search parameters. GS surprisingly does not have such features.
