Sunday, March 29, 2009

Michael Stephens in Vancouver, BC

Michael Stephens is one of my favourite librarians. One of the most enjoyable things to hear is how libraries affect a person's memories and shape a person's life. This is a very honest, intimate discussion of Stephens' love of libraries. He's coming to Vancouver for the upcoming British Columbia Library Association 2009 conference, and I'm looking forward to it.

Monday, March 23, 2009

A Time To Be An Information Professional

An apothecary is a historical name for a medical professional who formulated and dispensed medicine to physicians, surgeons and patients; today we would call them pharmacists. Health professions are in high demand, and pharmaceutical sciences is one of the most sought-after fields among college graduates.

But it wasn't always this way. Industrialization affected every aspect of the apothecary's work. Advances in medical technology led to new drugs that the individual pharmacist's own resources could not produce, and many drugs that the individual pharmacist was able to produce could be manufactured more economically, and in superior quality, by industry.

Not only did proprietary medicines take over the role that apothecaries had been responsible for, they also forced the pharmacist to become a vendor of questionable merchandise. This ultimately opened the way to much broader competition from merchants, grocers and pitchmen than the pharmacist had previously encountered, thus marginalizing the profession. Eventually, the "art of compounding" gave way to the pharmacist's increasingly important role as a health care provider, in which the science of pharmacy turned to tailoring patients' medications to meet their specific needs. The remaining pharmacists who continue compounding do so for the love of the science and out of interest in their patients' well-being. The same holds for the changing nature of the librarian's work: the essential love for our users and the art of searching will not change.

Librarians aren't going anywhere, and they never will, even though the name might. Librarians will adapt, change, and modify, just like the apothecary, but the profession won't disappear. Librarianship is undergoing a change, and nowhere is this more apparent than at the Special Libraries Association, which is celebrating its centennial year. The SLA is a reflection of the profession, as it has often had to question its own place within it. In 2003, the SLA came to a standstill, and almost became the Information Professionals International, but decided otherwise as SLA represents a century-old tradition and brand name that is too cherished to change.

And such is the profession of librarianship. Perhaps we will be known by another title, another name, as some of us are already known as metadata managers, taxonomists, information architects, and knowledge managers. Library schools have evolved into I-Schools. Who knows, LIS might evolve to the point where it is no longer recognizable to us, just as the apothecary is no longer recognizable to the pharmacist. But the art of searching, sharing knowledge, and collecting, organizing, and disseminating information, in whatever shape and form it may take, will never change. And hence, whatever we may become, we will never change.

Saturday, March 14, 2009

The Search Continues . . . .

New Approach to Search is a must-read for those interested in search technology. Joe Weinman goes into the nitty-gritty of search algorithms, but boils it down into easily understandable (and fun) analogies for the layperson. As Weinman argues,

Search algorithms today are largely based on a common paradigm: link
analysis. But they've ignored a mother lode of data: The network.

Nicely said. Although there are a multitude of variations of search algorithms, architectures and tweaks, search technology has been based largely on three canonical approaches. In a nutshell, here they are:

1) Human-powered directories -
Hierarchically organized into taxonomies (e.g. Yahoo!)

2) Crawler-based index -
Generates results largely prioritized by link analysis. (e.g. Google)

3) Collaborative tagging -
Users tag pages with keywords so that future searchers can find
those pages by entering those tags (e.g. Technorati and

However, these three options still fail to prevent click fraud, and they leave content in the Deep Web unreachable. Weinman proposes Network Service Providers as a fourth option, which uses data and metadata associated with the actual network transport of Web content (including HTML pages, documents, spreadsheets, almost anything) to replace and/or augment traditional Web crawlers, improve the relevance and currency of search result rankings, and reduce click fraud. A network service provider could better determine aggregate surfing behavior and hold times at sites or pages, in a way sensitive to the peculiarities of browser preferences and regardless of whether a search engine is used.
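
To make the hold-time idea concrete, here is a minimal sketch in Python of ranking pages by how long visitors stay on them. It is only an illustration, not Weinman's method: the observation data, the example.org URLs, and the function name rank_by_hold_time are all invented, and using the median as the summary is my own assumption.

```python
from collections import defaultdict
from statistics import median

# Hypothetical transport-level observations: (page URL, seconds a visitor stayed).
# A real provider would derive these from network traffic; these values are invented.
observations = [
    ("example.org/a", 120),
    ("example.org/a", 95),
    ("example.org/b", 3),
    ("example.org/b", 5),
    ("example.org/b", 4),
    ("example.org/c", 40),
]

def rank_by_hold_time(observations):
    """Order pages by median hold time, longest first."""
    times = defaultdict(list)
    for url, seconds in observations:
        times[url].append(seconds)
    return sorted(times, key=lambda url: median(times[url]), reverse=True)

print(rank_by_hold_time(observations))
```

The median is used rather than the mean so that one visitor who leaves a tab open for hours does not dominate a page's score. Page "b" gets many visits but is abandoned within seconds, so it ranks last.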

Weinman's proposal is an interesting departure from the thinking of Semantic Web enthusiasts. It does add a twist to speculation about the future of Web search technology. And so the search continues . . .

Monday, March 09, 2009

Searching Search Like a Yandex

Let me introduce Yandex. It's an interesting search engine because it precedes Google. In fact, Yandex's roots go back to the late 1980s, before the advent of the Web. Yandex is also a classic case study showing that Google is not the be-all and end-all of search. Google may be good in English, but how does it fare in multilingual searching? (Remember: English is only a fraction of the Internet's languages.)

What is also interesting is that Yandex's search algorithm is rooted in the highly inflected and very peculiar Russian language. Words can take on some 20 different endings to indicate their relationship to one another. As in many other non-English languages, this inflection makes Russian precise, but makes search extremely difficult. Google fetches the exact word combination you enter into the search bar, leaving out the slightly different forms that mean similar things. Yandex, however, is unique in that it does catch the inflection. Fortune has written an interesting article on Yandex, and my favourite part is its examination of the unique features of this Russian search giant:

While some of its services are similar to offerings available in the U.S. (blog rankings, online banking), it also has developed some applications that only Russians can enjoy, such as an image search engine that eliminates repeated images, a portrait filter that ferrets out faces in an image search, and a real-time traffic report that taps into users' roving cellphone signals to monitor how quickly people are moving through crowded roads in more than a dozen Russian cities.
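
To illustrate why inflection makes search hard, here is a toy sketch in Python comparing exact matching with matching on word stems. It is not how Yandex (or Google) actually works: the short list of endings and the function names are my own assumptions, and real engines rely on full morphological dictionaries rather than suffix stripping.

```python
# Toy list of Russian noun endings; an assumption for illustration only.
# Longer endings come first so that they are tried before shorter ones.
ENDINGS = ("ой", "а", "и", "у", "е")

def stem(word):
    """Strip the first matching ending, keeping at least two letters of stem."""
    for ending in ENDINGS:
        if word.endswith(ending) and len(word) > len(ending) + 1:
            return word[: -len(ending)]
    return word

def exact_match(query, text):
    """True only if the query appears in the text in exactly the same form."""
    return query in text.split()

def inflection_aware_match(query, text):
    """True if any word in the text shares a stem with the query."""
    return stem(query) in {stem(token) for token in text.split()}

text = "я читаю книгу"  # "I am reading a book" (accusative: книгу)
print(exact_match("книга", text))             # query in nominative form
print(inflection_aware_match("книга", text))
```

A search for "книга" (book) misses the sentence under exact matching, because the text uses the form "книгу", but finds it once both are reduced to the shared stem "книг".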

Thursday, March 05, 2009

BBC's Semantic Web

BBC gets it. In the latest issue of Nodalities magazine (one of my favourite reads), the BBC reveals how it is applying a bottom-up approach in its contribution to realizing the SemWeb. To make this happen, web programmers broke with BBC tradition by designing from the domain model up rather than the interface down. The domain model provided us with a set of objects (brands, series, episodes, versions, ondemands, broadcasts, etc.) and their sometimes tangled interrelationships.

This is exciting stuff.  Without ever explicitly talking RDF we’d built a site that complied with Tim Berners-Lee’s four principles for Linked Data:

(1)  Use URIs as names for things.

(2)  Use HTTP URIs so that people can look up those names.

(3)  When someone looks up a URI, provide useful information.

(4)  Include links to other URIs.
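
The four principles can be sketched in a few lines of Python. This is only a toy illustration: the example.org URIs, the fields, and the function names are invented, and they are not the BBC's actual URIs or domain model.

```python
# Principles 1 and 2: every thing is named by an HTTP URI.
# This in-memory dictionary stands in for the web; all entries are invented.
RESOURCES = {
    "http://example.org/brands/b1": {
        "type": "brand",
        "title": "Example Brand",
        "links": ["http://example.org/episodes/e1"],
    },
    "http://example.org/episodes/e1": {
        "type": "episode",
        "title": "Example Episode",
        "links": ["http://example.org/brands/b1"],
    },
}

def look_up(uri):
    """Principle 3: looking up a URI returns useful information, or None."""
    return RESOURCES.get(uri)

def follow_links(start_uri):
    """Principle 4: walk the links between URIs to discover related things."""
    seen, queue = [], [start_uri]
    while queue:
        uri = queue.pop(0)
        if uri in seen or look_up(uri) is None:
            continue  # skip URIs already visited or not resolvable
        seen.append(uri)
        queue.extend(look_up(uri)["links"])
    return seen

print(follow_links("http://example.org/brands/b1"))
```

Starting from the brand's URI, following links reaches the episode as well, which is the whole point of the fourth principle: data that links outward can be discovered from anywhere it is referenced.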

In fact, as the BBC web developers argue, 
considering how best to build websites we’d recommend you throw out the Photoshop and embrace Domain Driven Design and the Linked Data approach every time. Even if you never intend to publish RDF it just works.   The longer term aim of this work is to not only expose BBC data but to ensure that it is contextually linked to the wider web.  
The idea is to free the web of data.

BBC Gets It.