Saturday, March 14, 2009

The Search Continues . . . .

New Approach to Search is a must-read for those interested in search technology. Joe Weinman goes into the nitty-gritty of search algorithms but boils it down into easily understandable (and fun) analogies for the layman. As Weinman argues,

Search algorithms today are largely based on a common paradigm: link
analysis. But they've ignored a mother lode of data: The network.

Nicely said. Although there are a multitude of variations of search algorithms, architectures and tweaks, search technology has been based largely on three canonical approaches. In a nutshell, here they are:

1) Human-powered directories -
Hierarchically organized into taxonomies (e.g. Yahoo!)

2) Crawler-based index -
Generates results largely prioritized by link analysis. (e.g. Google)

3) Collaborative tagging -
Users tag pages with keywords so that future searchers can find
those pages by entering those tags (e.g. Technorati and Del.icio.us)
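To make the third approach concrete, here is a minimal sketch of a tag index in Python. The function names and the sample URLs are made up for illustration; they are not how Technorati or Del.icio.us actually store tags.

```python
from collections import defaultdict

def build_tag_index(taggings):
    """Map each (lowercased) tag to the set of URLs users labeled with it."""
    index = defaultdict(set)
    for url, tag in taggings:
        index[tag.lower()].add(url)
    return index

def search_by_tags(index, query_tags):
    """Return the URLs that carry every queried tag, sorted for stable output."""
    url_sets = [index.get(tag.lower(), set()) for tag in query_tags]
    if not url_sets:
        return []
    return sorted(set.intersection(*url_sets))

# Hypothetical (url, tag) pairs submitted by users.
taggings = [
    ("http://example.com/a", "search"),
    ("http://example.com/a", "networks"),
    ("http://example.com/b", "search"),
]
index = build_tag_index(taggings)
print(search_by_tags(index, ["search", "networks"]))
```

A searcher who enters both tags gets back only the pages that someone has labeled with both, which is the whole idea behind finding pages "by entering those tags."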

However, these three approaches still fail to prevent click fraud, and they leave content in the Deep Web unreachable. Weinman proposes network service providers as a fourth option: they would use data and metadata associated with the actual network transport of Web content (including HTML pages, documents, spreadsheets, almost anything) to replace and/or augment traditional Web crawlers, improve the relevance and currency of search result rankings, and reduce click fraud. A network service provider could better determine aggregate surfing behavior and hold times at sites or pages, in a way sensitive to the peculiarities of browser preferences and regardless of whether a search engine is used.
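As a rough illustration of the "hold times" idea, the sketch below ranks pages by average time spent on them, using visit records of the kind a network provider might observe. Weinman's article does not give a scoring formula, so the averaging, the `min_visits` threshold, and the record layout are all my own assumptions.

```python
from collections import defaultdict

def rank_by_hold_time(visits, min_visits=2):
    """Rank URLs by mean hold time in seconds, highest first.

    visits: iterable of (url, seconds_on_page) pairs. These are hypothetical
    records; no search-engine referrer is required.
    min_visits: assumed threshold so that one long visit cannot dominate.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for url, seconds in visits:
        totals[url] += seconds
        counts[url] += 1
    scores = {
        url: totals[url] / counts[url]
        for url in totals
        if counts[url] >= min_visits
    }
    # Sort by descending score, breaking ties by URL for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))

visits = [
    ("http://example.com/a", 120.0),
    ("http://example.com/a", 80.0),
    ("http://example.com/b", 5.0),
    ("http://example.com/b", 3.0),
    ("http://example.com/c", 300.0),  # single visit, filtered out below
]
print(rank_by_hold_time(visits))
```

Pages that many people linger on float to the top, while a page that visitors leave within seconds sinks, whether or not anyone ever linked to it.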

Weinman's proposal is an interesting departure from the thinking of Semantic Web enthusiasts. It does add a twist to speculation about the future of Web search technology. And so the search continues . . .
