Tuesday, July 29, 2008

WHATWG?

I've written about the potential of Resource Description & Access playing a role in the Semantic Web, and the importance of librarians in this development. Not only that, but Resource Description Framework would be the crux of this new Web. Brett Bonfield, a graduate student in the LIS program at Drexel University, intern at the Lippincott Library at the University of Pennsylvania and an aspiring academic librarian, has pointed out that the WHATWG (Web Hypertext Application Technology Working Group), a growing community of people interested in evolving the Web, might have some influence on how things will play out. It focuses primarily on the development of HTML and the APIs needed for Web applications.


The WHATWG was founded by individuals of Apple, the Mozilla Foundation, and Opera Software in 2004, after a W3C workshop. Apple, Mozilla and Opera were becoming increasingly concerned about the W3C’s direction with XHTML, lack of interest in HTML and apparent disregard for the needs of real-world authors. So, in response, these organisations set out with a mission to address these concerns and the Web Hypertext Application Technology Working Group was born.

There was a time when RDF’s adoption would have been a given, when the W3C was seen as nearly infallible. Its standards had imperfections, but their openness, elegance, and ubiquity made it seem as though the Semantic Web was just around the corner. Unfortunately, that future has yet to arrive: we’re still waiting on the next iteration of basic specs like CSS; W3C bureaucracy persuaded the developers of Atom to publish their gorgeous syndication spec with IETF instead of W3C; and, perhaps most alarmingly, the perception that W3C’s HTML Working Group was dysfunctional encouraged Apple, Mozilla, and Opera to team with independent developers in establishing WHATWG to create HTML’s successor spec independently from the W3C. As more non-W3C protocols took on greater prominence, W3C itself seemed to be suffering a Microsoft-like death of a thousand cuts.

This is interesting indeed. As Bonfield reveals, on April 9, WHATWG’s founders proposed to W3C that it build its HTML successor on WHATWG’s draft specification. On May 9, W3C agreed. W3C may never again be the standard bearer it once was, but this is compelling evidence that it is again listening to developers and that developers are responding. The payoff in immediate gratification—the increased likelihood of a new and better HTML spec—is important, but just as important is the possibility of renewed faith in W3C and its flagship project, the Semantic Web. Things are moving along just fine, I think.

Fascinating. Two roads lead to the same destination, but the question remains: are we any closer to the SemWeb?

Tuesday, July 22, 2008

Web 3.0 in 600 words

I've just penned an article on Web 3.0 from a librarian's standpoint. In my article, What is Web 3.0? The Next Generation Web: Search Context for Online Information, I lay out what I believe are the essential ingredients of Web 3.0. (Note that I don't believe the SemWeb and Web 3.0 are synonymous, even though some may believe them to be so, and I explain why.) Writing it challenged me tremendously in coming to grips with what exactly constitutes Web 3.0. It forced me to think more concisely about the different elements that bring it together.

It's conceptual; therefore, it's murky. And as a result, we overlook the main elements which are already in place. One of the main points I make is that, whereas Web 2.0 is about information overload, Web 3.0 will be about regaining control. So, without further ado, please take a look at this article, and let me know your thoughts. I should not leave out the excellent help of the legendary librarian, the Google Scholar, Dean. He helped me out a great deal in fleshing out these ideas. Thanks DG.

Sunday, July 20, 2008

Web 3.0 and Web Parsing

Ever wondered how Web 3.0 and the SemWeb can read webpages in an automated, intelligent fashion? Take a look at how Website Parse Template (WPT) works. WPT is an XML-based open format which describes the HTML structure of a website's pages. The WPT format allows web crawlers to generate Semantic Web RDF for web pages.

Website Parse Template consists of three main entities:

1) Ontologies - The content creator defines the concepts and relations which are used on the website.

2) Templates - The creator provides templates for groups of web pages which are similar in content category and structure. The publisher provides the HTML elements’ XPath or TagIDs and links them with the website's Ontology concepts.

3) URLs - The creator provides URL Patterns which collect the group of web pages and link them to a "Parse Template". In the URLs section, the publisher can also separate out part of a URL as a concept and link it to the website Ontology.
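
To make the three parts concrete, here is a minimal Python sketch of what a crawler might do with such a template. It does not use the real WPT XML format: the `TEMPLATE` dictionary, the example.org URL, the sample page and the `extract_triples` function are all made up for illustration, and the output is a list of plain subject/predicate/object tuples rather than real RDF.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical template mirroring the three parts described above.
# This is NOT the actual WPT schema, only an illustration of the idea.
TEMPLATE = {
    # 1) Ontology: concepts and relations used on the site
    "ontology": {"Book", "title", "author"},
    # 3) URL pattern: which pages this template applies to
    "url_pattern": r"^http://example\.org/books/\d+$",
    "page_concept": "Book",
    # 2) Template: ontology concept -> XPath of the HTML element
    "fields": {
        "title": ".//h1[@id='title']",
        "author": ".//span[@id='author']",
    },
}

# A tiny, well-formed sample page (real-world HTML is often messier
# and would need a more forgiving parser).
PAGE = """<html><body>
<h1 id="title">Weaving the Web</h1>
<span id="author">Tim Berners-Lee</span>
</body></html>"""


def extract_triples(url, page_source, template):
    """Return (subject, predicate, object) tuples for one page."""
    # Skip pages whose URL does not match the template's pattern.
    if re.match(template["url_pattern"], url) is None:
        return []
    root = ET.fromstring(page_source)
    # The page itself is an instance of the template's concept.
    triples = [(url, "rdf:type", template["page_concept"])]
    for concept, xpath in template["fields"].items():
        node = root.find(xpath)
        if node is not None and node.text:
            triples.append((url, concept, node.text.strip()))
    return triples


if __name__ == "__main__":
    for triple in extract_triples("http://example.org/books/42", PAGE, TEMPLATE):
        print(triple)
```

For a URL that matches the pattern, the function returns one type triple plus one triple per matched field; a URL that doesn't fit the pattern yields an empty list.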

Friday, July 18, 2008

Kevin Kelly on Web 3.0




At "Web & Where 2.0+", presented by Northern California Grantmakers and The William and Flora Hewlett Foundation on Feb. 14th, 2008, Kevin Kelly talks about Web 3.0. Have a good weekend everyone. Enjoy.

Thursday, July 17, 2008

EBSCO in a 2.0 World

EBSCOhost 2.0 is here. It's got a brand new look and feel, based on extensive user testing and feedback, and provides users with a powerful, clean and intuitive interface. This is the first redesign of the EBSCOhost interface since 2002, and its functionality incorporates the latest technological advances.

1) Take a look at EBSCOhost 2.0 Flash demonstration here.

2) Its spiffy marketing web site also features new EBSCOhost 2.0 web pages, where you can learn more about its key features, here (http://www.ebscohost.com/2.0).

EBSCO has really moved into the 2.0 world: simple, clean, and Googleized. But perhaps that's the way that information services need to go. We simply must keep up. I went to Seattle SLA '08, where EBSCO gave an excellent presentation (not to mention a lunch) in which it showed the 2.0 features of the new EBSCO interface. In essence, it's customizable for users: you can have it as simple as a search box or as complex as it is currently. The retrieval aspects have not changed that much. Yet perception is everything, don't you think?

Wednesday, July 09, 2008

Why Be a Librarian?

There seems to be a real fear among some of being called 'librarians.' There's a mysterious aura around what a librarian does. In fact, some have cloaked their librarian status as 'metadata specialist' or 'information specialist' or even 'taxonomist.' Why be a librarian? That's a good question. I like some of the answers offered by Singapore Library Association's Be A Librarian:

As technology allows the storage and uploading of information at ever greater speeds and quantities, people are becoming overwhelmed by the “information overload”. The information professional is a much needed guide to aid people in their search for knowledge.

The librarian learns to seek, organize and locate information from a wide variety of sources, from print materials such as books and magazines to electronic databases. This knowledge is needed by all industries and fields, allowing librarians flexibility in choosing their working environments and in developing their areas of expertise.

The librarian keeps apace with the latest technological advances in the course of their work. They are web authors, bloggers, active in Second Life. They release podcasts, produce online videos and instant message their users. The librarian rides at the forefront of the technology wave, always looking out for new and better ways to organize and retrieve information for their users.

At the same time, librarians remember their roots, in traditional print and physical libraries, and continue to acquire and preserve books, journals and other physical media for their current users and for future generations.

Well said. I like it!

Tuesday, July 08, 2008

Expert Searching in a SemWeb World

If we are to move into a Web 3.0 SemWeb-based world, taking a closer look at initiatives such as Expert System makes sense. This company is a provider of semantic software, which discovers, classifies and interprets text information. I like the approach it's taking, by offering a free online seminar to make its pitch. In "Making Search Work for a Living," the webinar shows users how to improve searching. Here's what it is:

As an analyst or knowledge worker you are busy every day searching for information, often in onerous and time-consuming ways. The goal of course is to locate the strategic nuggets of information and insight that answer questions, contribute to reports and inform all levels of management. Yet current search technology proves to be a blunt tool for this task. What you are looking for is trapped in the overwhelming amount of information available to you in an endless parade of formats and forced user interfaces. Immediate access to strategic information is the key to support monitoring, search, analysis and automatic correlation of information.

Join this presentation and roundtable discussion with Expert System on semantic technology that solves this every day, every business problem.

This is a free webinar brought to you by Expert System.
To register send an e-mail to webinar@expertsystem.net

  • You are looking for an innovative semantic indexing, search and analysis tool to manage your strategic internal and external information.
  • You want to overcome the limits of traditional search systems to manage the contents of large quantities of text.
  • You have wondered how you can improve the effectiveness of the decision making process in your company.

DATE/TIME: July 10th 2008, 9:00 am PT, 12:00 pm ET USA; 5:00 pm UK.
Duration: 60 Minutes
Focus On: semantics as a leading technology to understand, search, retrieve, and analyze strategic contents.

The webinar will teach how to:

  • Conceptualize search and analysis on multilingual knowledge bases;
  • Investigate the documents in an interactive way through an intuitive web interface;
  • Highlight all the relations, often unexpected, that link the elements across the documents.
  • Monitor specific phenomena constantly and then easily generate and distribute ways for others to understand them.

It's worth a look-see, I think.

Sunday, July 06, 2008

End of Science? End of Theory?

Chris Anderson has done it again, this time with an article about the end of theory. How? In short: raw data. In End of Theory, he argues that, with massive data, the millennia-old scientific model of hypothesize, model, test is becoming obsolete.


Consider physics: Newtonian models were crude approximations of the truth (wrong at the atomic level, but still useful). A hundred years ago, statistically based quantum mechanics offered a better picture — but quantum mechanics is yet another model, and as such it, too, is flawed, no doubt a caricature of a more complex underlying reality. The reason physics has drifted into theoretical speculation about n dimensional grand unified models over the past few decades (the "beautiful story" phase of a discipline starved of data) is that we don't know how to run the experiments that would falsify the hypotheses — the energies are too high, the accelerators too expensive, and so on.

And according to Anderson, biology is heading in the same direction. What does this say about science and humanity? In February, the National Science Foundation announced the Cluster Exploratory, a program that funds research designed to run on a large-scale distributed computing platform developed by Google and IBM in conjunction with six pilot universities. The cluster will consist of 1,600 processors, several terabytes of memory, and hundreds of terabytes of storage, along with the software, including IBM's Tivoli and open source versions of Google File System and MapReduce.

Anderson's been right before. See Long Tail and Free. But this one's just speculation, of course. Perhaps one commenter hit the mark when he said, "Yeah, whatever. We still can't get a smart phone with all the bells and whistles to be able to use any where in the world with over 2 hours worth of use and talk time...so get back to me when you've perfected all of that." Well said. Let's wait and see some more.

Tuesday, July 01, 2008

Catalogue 2.0

It's blogs like Web 2.0 Catalog that keep me going. Catalogues have been the crux of librarianship, from the card catalogue to the OPAC. But for libraries, the catalogue has always seemed to be a separate entity. It's as if there is a dichotomy: the Social Web and the catalogue, and never the twain shall meet. What would a dream catalogue look like to me? I have 8 things I’d like to see. Notice that none of them is a stretch of the imagination. Here they are:

(1) Wikipedia – What better way to get the most updated information for a resource than the collective intelligence of the Web? Can we integrate this into the OPAC records? We should try.

(2) Blog – “Blog-noting” as I call it. To a certain extent, some catalogues already allow users to scribble comments on records. But blog-noting allows users to actually write down reflections of what they think of the resource. The catalogue should be a “conversation” among users.

(3) Amazon.ca - Wouldn’t it be nice to have an idea what a book costs out on the open market? And wouldn’t it make sense to throw in an idea of how much a used copy would cost?

(4) Worldcat - Now that you know the price, wouldn’t it be useful to have an idea of what other libraries carry the book?

(5) Google-ability – OPAC resources are often online, but “hidden” in the deep web. If opened up to search engines, they become that much more accessible.

(6) Social bookmarking – If the record is opened to the Web, then it naturally makes sense for it to be linked to Delicious, RefShare & CiteULike (or a similar bibliographic management service).

(7) Cataloguer’s paradise – Technical servicemen and women are often hidden in the pipelines of the library system, their work often unrecognized. These brave men and women should have their profiles right on the catalogue, for everyone to see, to enjoy. Makes for good outreach, too. (Photo is optional).

(8) Application Programming Interface - APIs are sets of declarations of the functions (or procedures) that an operating system, library or service provides to support requests made by computer programs. It's like the interoperable sauce which adds taste to web services. It's the crux of Web 2.0, and will be important for the Semantic Web when the Open Web finally arrives. As a result, APIs need to be explored in detail by OPACs, as a way to integrate different programs and provide open data for others to reuse.
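
As a rough illustration of what "open data for reuse" could mean for an OPAC, here's a small Python sketch of a record lookup that answers with JSON, the way a web API endpoint might. Everything in it is hypothetical: the `CATALOGUE` dictionary, the record ID and the `get_record` function are assumptions for the example, not any real catalogue system's API.

```python
import json

# Hypothetical in-memory catalogue; a real OPAC would query its own database.
CATALOGUE = {
    "b1001": {
        "title": "Weaving the Web",
        "author": "Tim Berners-Lee",
        "year": 1999,
    },
}


def get_record(record_id):
    """Return (HTTP-style status code, JSON body) for a catalogue record."""
    record = CATALOGUE.get(record_id)
    if record is None:
        return 404, json.dumps({"error": "record not found"})
    # Include the ID in the payload so the response is self-describing.
    return 200, json.dumps({"id": record_id, **record}, sort_keys=True)


if __name__ == "__main__":
    status, body = get_record("b1001")
    print(status, body)
```

Once records come back in a predictable format like this, another program (a blog widget, a social bookmarking tool, a price lookup) can reuse them without scraping the catalogue's web pages.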

Are these ideas beyond the realm of possibility? Your thoughts?