Wednesday, June 03, 2009

Gates Versus Jobs



I enjoy watching these two giants go at it. Can you feel the tension and the cutting competition? This is just part two. Watch the whole series. This is a session from the All Things Digital Web 3.0 conference.

Monday, June 01, 2009

The Semantic Way

PricewaterhouseCoopers has just come out with an important document forecasting Semantic Web technologies. PwC has usually churned out fairly solid best-practice research of the business knowledge management type, but this particular publication is worthy of a close reading. Its feature article in particular, "Spinning a Data Web," offers an in-depth yet concise look into the technologies behind the SemWeb, one of which LIS professionals should take heed, as many of the concepts are relevant to our profession. Why? Here are the main points that I find most important for us as we move ahead in the race to the Semantic Web.

(1) Linked Data Initiative - In order for the Web to move beyond a messy, siloed, and unregulated frontier, the SemWeb will require a standards-based approach, one in which data on the Web is published in interchangeable formats. By linking data together, one could find and take pieces of data sets from different places, aggregate them, and use them freely and accessibly. Because of this linking of data, the Web won't be limited to just web-based information, but will ultimately extend to the non-Web-based world. To a certain extent, we are already experiencing this with smart technologies. Semantic technologies will help us extend this to the next version of the Web, often ambiguously dubbed Web 3.0.

(2) Resource Description Framework - RDF is key to the SemWeb as it allows for the federation of Web data and standards, using XML to solve problems that a two-dimensional relational database world cannot. RDF provides a global and persistent way to link data together. RDF isn't a programming language, but a method (a metaphorical "container") for organizing the mass of data on the Web, while paving the way for a fluid exchange of different standards on the Web. In this model, data is not in cubes or tables; rather, it is in triples - subject-predicate-object combinations that provide for a multidimensional representation and linking of the Web, connecting nodes across otherwise disparate, siloed networks.

(3) Ontologies and Taxonomies - LIS and cataloguing professionals are familiar with these concepts, as they often form the core of their work. The SemWeb moves from a taxonomic to an ontological world. While ontologies describe relationships in an n-dimensional manner, easily allowing information to be viewed from multiple perspectives, taxonomies are limited to hierarchical relationships. In an RDF environment, ontologies provide a capability that extends the utility of taxonomies. The beauty of an ontology is that it can be linked to another ontology to take advantage of that ontology's data in conjunction with your own. Taxonomies lack this linkability and are clearly more limited, as they are classification schemes that primarily describe part-whole relationships between terms. Ontologies are the organizing, sense-making complement to graphs and metadata, and mapping among ontologies is how domain-level data become interconnected over the data Web.

(4) SPARQL and SQL - SPARQL overcomes the limits of SQL because graphs can receive and be converted into a number of different data formats. In contrast, SQL is confined by the rigidity of table structures. In constructing a SQL query, one has to have knowledge of the database schema; with the abstraction of SPARQL, this problem is eased, as developers can move from one resource to another. As long as the data queried with SPARQL is expressed in RDF, tapping into many data sources at once becomes possible. De-siloing data was previously not possible without a huge investment of time and resources; semantic technologies bring it within reach.

(5) De-siloing the Web - This means that we would need to give up some degree of control over our own data if we wish to have a global SemWeb. This new iteration of the Web takes the page-to-page relationships of the linked document Web and augments them with linked relationships between and among individual data elements. By using ontologies, we can link to data we never included in the data set before, thus really "opening" up the Web as one large global database.
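
To make points (2), (4), and (5) a little more concrete, here is a minimal sketch in plain Python. It is not real RDF or SPARQL, just an illustration of the ideas: data held as subject-predicate-object triples, a query expressed as a pattern with wildcards rather than against a table schema, and two separately published data sets merged simply by taking their union. The `ex:` identifiers, the two data sets, and the helper names `match` and `creator_names` are all made up for this example; the `dc:` and `foaf:` prefixes are written in the style of the Dublin Core and FOAF vocabularies, but here they are only plain strings.

```python
from typing import Iterable, Iterator, List, Optional, Set, Tuple

# A triple is (subject, predicate, object), all plain strings in this sketch.
Triple = Tuple[str, str, str]

# Two hypothetical data sets, imagined as published by different parties.
library_data: Set[Triple] = {
    ("ex:book/1", "dc:title", "Weaving the Web"),
    ("ex:book/1", "dc:creator", "ex:person/tbl"),
}
people_data: Set[Triple] = {
    ("ex:person/tbl", "foaf:name", "Tim Berners-Lee"),
    ("ex:person/tbl", "ex:founded", "ex:org/w3c"),
}


def match(
    graph: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for ts, tp, to in graph:
        if s in (None, ts) and p in (None, tp) and o in (None, to):
            yield (ts, tp, to)


def creator_names(graph: Iterable[Triple], book: str) -> List[str]:
    """Follow book -> dc:creator -> foaf:name, wherever those triples live."""
    triples = list(graph)  # allow several passes over the same data
    names = []
    for _, _, person in match(triples, s=book, p="dc:creator"):
        for _, _, name in match(triples, s=person, p="foaf:name"):
            names.append(name)
    return sorted(names)


# De-siloing, in miniature: merging graphs is just a set union.
merged = library_data | people_data

print(creator_names(library_data, "ex:book/1"))  # the silo alone has no names
print(creator_names(merged, "ex:book/1"))        # the merged graph does
```

Notice that nothing had to be redesigned to combine the two data sets: because both use the same identifier for the person, the link exists as soon as the triples sit in the same graph. A real system would use full URIs and an RDF library with a SPARQL engine rather than string tuples.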

Thursday, May 28, 2009

The Industrial Web

"Web 2.0 is social: many hands make light work. In stark contrast, Web 3.0 is industrial."

In the Journal of Social Computing, Peter Sweeney argues that whatever we call Web 3.0, it is going to be about the automation of tasks which displaces human work. Our information economy is ultimately in the midst of an Industrial Revolution. He makes another excellent point:

Billions are being spent worldwide on semantic technologies to create the factories and specialized machinery for manufacturing content. Railways of linked data and standards are being laid to allow these factories to trade and co-operate. And the most productive information services in the world are those that leverage Web 3.0 industrial processes and technologies.

Web 3.0 is a controversial term, as it confuses both those who are only just beginning to feel comfortable with the concept of Web 2.0 and those who are embracing the Semantic Web. Web 3.0 disrupts these traditional, safe thoughts. It not only blurs the terminology, it also offers business advocates an opportunity to cash in.

But I see Sweeney's case as a multidimensional argument that transcends nickels and dimes. He makes an excellent point when he argues that many dismiss Web 3.0 as a fad; however, when we think of the Web as a manufacturing process, that is, as a disruptive technology very much like the Industrial Revolution, then we can begin to understand what Web 3.0 represents.

Monday, May 25, 2009

Kumos to you MSN

I'm going to hold off on adding to the Wolfram Alpha debate, as I've yet to digest everything from the last week or so. But hold on. We might need to pen new articles -- all of us. Microsoft has added its two cents with an upcoming new search engine called Bing (codenamed Kumo).

Bing is a combination of Microsoft's Live Search search engine and semantic Web technology (which Microsoft quietly acquired with Powerset in July 2008). It is said that Kumo was designed with a "Google killer" in mind. However, it has not come without a cost.

It's been reported that the amount of resources Microsoft has spent on Kumo has caused deep divisions within the vendor's management. Many within the hierarchical monolith are arguing for staying put with the company's money-making ways rather than spending elsewhere on a fruitless quest for the holy grail of search.

These are important new developments for information professionals - especially librarians - to take note of. While the Semantic Web adds structure to Web searches in the backend technology, what users will see in the front end is increased structure, such as the search results in the center of the page and a hierarchical organization of concepts or attributes in the left (or right)-hand column. This could be what Bing ultimately looks like.

What this implies is that with so much of the spotlight currently on "practical" social media and Web 2.0 applications, much is happening underneath the surface among the information giants. Google itself is quietly conducting much research into the SemWeb. Who will be the first to achieve Web sainthood? Until last week, we thought it was these guys.

Wednesday, May 20, 2009

The Web 3.0 Hoopla

Web 3.0-ites beware. As information professionals, it's our job (and hobby, to a certain extent) to pick out discrepancies and the latest trends on the web. A Web 3.0 conference took place in New York City, May 19-20. The conference featured speakers such as Christine Connors, along with a fairly large list of technology evangelists and business experts. The conference packages Web 3.0 as a group of technologies that make the organization of information radically more fluid and allow for new types of analysis based on things like text semantics, machine learning, and what we call serendipity: the stumbling upon insights based on just having better organized and connected information. Its website presents the following:

In turbulent economic times, it is critically important to understand what opportunities exist to make our businesses run better. The emergence of a new era of technologies, collectively known as Web 3.0, provides this kind of strategically significant opportunity.

The core idea behind web 3.0 is to extract much more meaningful, actionable insight from information. At the conference, we will explore how companies are using these technologies today, and should be using them tomorrow, for significant bottom line impact in areas like marketing, corporate information management, customer service, and personal productivity.

I would be hesitant to accept this definition of Web 3.0, particularly given the words "in turbulent economic times." It's awfully reminiscent of how Web 2.0 started: the burst of the dot-com economy in 2001, which led to programmers convening at the first Web 2.0 conference. For better or worse, Web 2.0 was born; but it was never endorsed by academia. The creators of the internet never envisioned Web 2.0 technologies; the World Wide Web Consortium (W3C) never had Web 2.0 standards. The Semantic Web, by contrast, has had its roots in the Web from the very beginning.

Unfortunately, I fear the same is happening with Web 3.0. Much is being packaged by corporate and technology interests and slapped with the label "Web 3.0." Because of the downturn in the economy, information professionals beware.

Thursday, May 07, 2009

Swine Flu and the World Wide Web Scour

As I was flipping through the pages of the morning paper, the Public Health Agency of Canada Intelligence Network certainly made my personal headlines. The software is powerful: the Canadian system draws on two news aggregators, Al Bawaba and Factiva, to retrieve relevant articles every 15 minutes, day and night.

The Public Health Agency of Canada group's Web-scouring programs also found the earliest portent of the arrival of SARS, though it took months for Chinese authorities to confirm the presence of that virus.

In fact, more than half of the 578 outbreaks identified by the World Health Organization between 1998 and 2001 were first picked up by the Canadian system. What this really reveals is that the Web is an ecological organism, a metaphor for reality, if you will. It's amazingly disconcerting when we realize just how primitive our search mechanisms are, when vital health information slips under our radar. Just how much difference do such surveillance systems really make in combatting emerging disease? Well, let's look at it this way -- the new swine flu strain was discovered - in the United States - a week after the La Gloria story surfaced, and it was another 10 days before a Canadian lab determined the same virus was making people ill in Mexico. In fact, the Global Public Health Intelligence Network (GPHIN) first detected reports of an unusual outbreak of respiratory disease in China's Guangdong province months before SARS spread around the world. This is the power of the Web; this is the power of search when maximized to its potential.

Wednesday, April 29, 2009

Twittering the Digu Way

If you don't know by now, Twitter is a free micro-blogging service that allows its users to send and read other users' updates, known as tweets -- text-based posts of up to 140 characters in length which are displayed on the user's profile page and delivered to other users who have subscribed to them. It's being used by everyone from British Airways to Barack Obama. But we must remember that Twitter is mainly for English users - a large share of the world's population doesn't converse in English or use it as an everyday lingua franca.

While Twitter is often regarded as an information network for distributing and exchanging information, in China, users rarely surf the net for information. The Web in China is not a tool for people’s daily life, but rather a venue for entertainment and relaxation. Not surprisingly, blogging is also viewed in such a way.

Digu is one example of how microblogging works in China. Digu, a microblogging service from Shenzhen, is deliberately designed to be entertainment-centric. It's even got a Celebrities’ Digu channel where users can follow 62 Chinese celebrities. What does this mean for us out here in the West? Nothing, we just twitter along. But we must be aware that despite the global Web 2.0 phenomenon, we are still siloed geographically, in language and in culture. We might be information-rich, but we are not as pluralistic in knowledge as we may think. Information professionals beware!

Tuesday, April 21, 2009

World Digital Library Coming to a Computer Near You!

This is what the future of libraries will be like. I'm excited at the unveiling of the new World Digital Library. An Internet library intended to be accessible to surfers around the world is now online, with its formal inauguration in Paris on Tuesday. The latest in increasing international efforts to digitize cultural heritage, the World Digital Library is a combination of contributions from libraries around the world. Developed by the Library of Congress in Washington, with the help of the Alexandria Library in Egypt, the Library was launched at the Paris headquarters of the United Nations Educational, Scientific and Cultural Organization.

The Library offers an array of books, maps, manuscripts and films from around the world, in seven different languages. It ultimately aims to bridge a cultural divide, not only by offering people in poorer countries the same access to knowledge as those in richer ones, but also by making available the cultural heritage of Asian, African, Middle Eastern, and Latin American cultures.

Friday, April 10, 2009

The Waves of Cellphone Use



I recently attended a fascinating talk, which proposed the idea that Web 2.0 is a commodification of knowledge. What a thought! As information professionals, we play with information, we search information, we ultimately depend on information. But at what point do we realize that the overload and the technology might be harmful? This video from Dailymotion is hitting the webosphere, and is gathering steam. It might be fun and games for now. But do we need to sit back and think more clearly about the harmful implications of technology?

Sunday, March 29, 2009

Michael Stephens in Vancouver, BC



Michael Stephens is one of my favourite librarians. One of the most enjoyable things here is his recollection of how libraries affect a person's memories and shape a person's life. This is a very honest, intimate discussion of Stephens' love of libraries. He's coming to Vancouver for the upcoming British Columbia Library Association 2009 conference. I'm looking forward to it.