Tuesday, June 03, 2008

Semantic Web and Librarians At Talis

I've always believed that librarians should and will play a part in the rise of the Semantic Web and Web 3.0. I've gone into the theory and conceptual components, but really haven't discussed too much about the practical elements of how librarians will realize this. Meet Talis. Besides its contribution to the blogosphere, Talis has recently dipped into publishing with its inaugural issue of Nodalities: The Magazine of the Semantic Web. It's a wonderful read - take a look.

How did Talis come about? It has been in the works for quite a while, and its history is worth noting. In 1969 a number of libraries founded a small co-operative project, based in Birmingham, to provide services that would help the libraries become more efficient. The project was known as the Birmingham Libraries Cooperative Mechanisation Project, or BLCMP. At the time, the concept of automation was so new that the term mechanisation was often used in its place.

BLCMP built a co-operative catalogue of bibliographic data at the start of its work, a database that now contains many millions of records. BLCMP moved into using microfiche and later, in the mid-seventies, IBM mainframes with dedicated terminals at libraries. It was also one of the first library automation vendors to provide a GUI on top of Microsoft Windows to give end-users a better interface. The Integrated Library System was first called Talis. Talis became the name of the company during re-structuring and the ILS became known as Alto. In 1995 Talis was the first library systems vendor to produce a web-enabled public access catalogue. Much of Talis' work now focusses on the transition of information to the web, specifically the Semantic Web, and Talis has led much of the debate about how Web 2.0 attitudes affect traditional libraries.

How does this include librarians? This ambitious Birmingham-based software company began life in the 1970s as a university spin-off. For many years it was a co-operative owned by its customers (a network of libraries), but in 1996 it was restructured as a commercial entity. It has a well-established pedigree of supplying large-scale information management systems to public and academic libraries in the UK: in fact, more than 60% of UK public libraries now use the company's software, which benefits some 9m library users. In 2002, the company embarked on Talis 2.0, a change programme to take advantage of "the next wave of technology" (Web 2.0 and the semantic web). In the year ending March 2004, turnover was £7.5m with profits of £226,000. Who says librarians can't make a buck, right?

Saturday, May 31, 2008

Introducing WebAppeal

There are some good Web 2.0 applications and websites. Then there is WebAppeal. The web service is based on the principle of 'Software as a Service' (SaaS), which is rapidly gaining popularity. The rise of innovative online applications makes traditional and expensive software unnecessary. Examples of successful web applications are video service YouTube and free music service Last.fm. To bring some structure and insight into these ever-growing technologies, http://www.appappeal.com/ informs consumers as comprehensively as possible about all the possibilities SaaS web applications have to offer.

Although we're in the age of Web 2.0, one of the main challenges remains information overload. Too much information does not necessarily mean knowledge. That's why I find AppAppeal to be a convincing website which provides insightful reviews of applications and indexes them according to utility. On this website, all applications are organized in categories such as "Blogging", "Personal Finance" and "Wiki Hosting". The website is still being developed. Soon, tools will be added to create an interactive community around web-based applications.

There are already Web 2.0 review sites such as Mashable, All Things Web 2.0, or Bob Stumpel's Everything 2.0. But WebAppeal goes one step further. It analyzes the advantages and disadvantages of particular applications and provides demo videos. I really like this website. It's a good complement to a project that Rex Turgano and I are collaborating on: Library Development Camp, which not only reviews Web 2.0 applications, but offers trial accounts for users to try out different applications. Together, the two pack a great punch. Stay tuned. More to come. . .

Thursday, May 29, 2008

Day 4 of TEI/XML Bootcamp

Day 4 has come and gone. What did I learn? XML is not easy. Programming is an even tougher business, not for the faint of heart or mind. The main challenge I had, and the one that made my head spin, was learning the complexities behind XHTML and XSLT. A powerful tool for the construction of the Semantic Web is XHTML. Most people are acquainted with the "meta" tags which can be used to embed metadata about the document as a whole. Yet there are more powerful, granular techniques available too. Although largely unused by web authors, XHTML and XSLT offer numerous facilities for introducing semantic hints into markup to allow machines to infer more about the web page content than just the text. These tools include the "class" attribute, used most often with CSS stylesheets. A strict application of these can allow data to be extracted by a machine from a document intended for human consumption.
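
To make the "class" attribute idea concrete, here is a minimal sketch in Python rather than XSLT. It assumes a well-formed XHTML fragment; the class names "book", "title" and "author" and the function name extract_by_class are invented for illustration and are not part of any standard or of the bootcamp material.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment: the class names "book", "title" and "author"
# are invented for this example, not taken from any standard.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p class="book">
      <span class="title">Weaving the Web</span> by
      <span class="author">Tim Berners-Lee</span>
    </p>
  </body>
</html>"""


def extract_by_class(xhtml_text, wanted):
    """Return {class_name: [text, ...]} for elements whose class is in wanted."""
    root = ET.fromstring(xhtml_text)
    found = {name: [] for name in wanted}
    for element in root.iter():
        # An element may carry several space-separated class names.
        for name in element.get("class", "").split():
            if name in found:
                found[name].append("".join(element.itertext()).strip())
    return found


record = extract_by_class(PAGE, ["title", "author"])
print(record)
```

The same markup still renders as an ordinary sentence for a human reader, which is the point: the page is written once and read by both people and programs.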

Although there have been several proposals for embedding RDF inside HTML pages, the technique of using XSLT transformations has a much broader appeal. Not everyone is keen to learn RDF, which presents a barrier to the creation of semantically rich web pages. Using XSLT provides a way for web developers to add semantic information with minimal extra effort. Dan Connolly of the W3C has conducted quite a number of experiments in this area, including HyperRDF, which extracts RDF statements from suitably marked-up XHTML pages.

What can librarians do? Resource Description and Access (RDA) is just around the corner. And there is much buzz (good and bad) that it's going to change the way librarians and catalogers think about information science and librarianship. I encourage information professionals to be aware of the changes to come. Although most are not going to be involved directly with the Semantic Web, they can keep abreast of developments, particularly exciting developments in information organization and classification. Workshops and presentations about RDA are out in droves. Pay attention. Stay tuned. There could be relevance in these new developments that spills into the SemWeb.

Tuesday, May 27, 2008

The Digital Humanities

I am on Day 2 of the Digital Humanities Summer Institute. Prior to this workshop, I had no inkling of what digital humanities was. Not anymore. The Digital Humanities, also known as Humanities Computing, is a field of study, research, teaching, and invention concerned with the intersection of computing and the disciplines of the humanities. It is methodological by nature and interdisciplinary in scope. It involves investigation, analysis, synthesis and presentation of knowledge using computational media. The institute provides an ideal environment to discuss, to learn about, and to advance skills in new computing technologies influencing the work of those in the Arts, Humanities and Library communities.

I'm currently taking Text Encoding Fundamentals and their Application at the University of Victoria from May 26–30, 2008. It is taught by Julia Flanders and Syd Bauman, experts in the Text Encoding Initiative (TEI), a consortium that collectively develops and maintains an XML-based standard for the representation of texts in digital form, specifying encoding methods for machine-readable texts. And it has been a blast. This is the seventh year of its existence, and already it has gained the attention of academics and librarians across the world.

The DHSI takes place across a week of intensive coursework, seminar participation, and lectures. It brings together faculty, staff, and graduate student theorists, experimentalists, technologists, and administrators from different areas of the Arts, Humanities, Library and Archives communities and beyond to share ideas and methods, and to develop expertise in applying advanced technologies to activities that impact teaching, research, dissemination and preservation. What have I learned so far? Lots. But most of all, just how large a part XML plays in the Semantic Web. But more on that in the next posting . . . stay tuned.

Friday, May 23, 2008

One Million Dollar Semantics Challenge and API

The SemanticHacker $1Million Innovators’ Challenge and new open API for Semantic Discovery has recently been launched by TextWise, LLC. The Challenge enables developers to showcase the power of TextWise’s patented Semantic Signature® technology and accelerate the development of breakthrough applications.

The Challenge provides incentives to encourage creation of software prototypes and/or business plans that demonstrate commercial viability in specific industries. Are you up to the Challenge? Go to Semantichacker.com to experience the technology first-hand in our demo and learn more about how to enter the $1 million challenge.

But what are Semantic Signatures®? They identify concepts and assign them weights; in other words, they're the ‘DNA’ of documents, which in essence makes them highly effective at describing what the documents are ‘about.’ Semantic Signatures® enable Web publishers and application developers to automatically embed consistent, semantically meaningful tags within their content for use in classification, organization, navigation and search.
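
The general idea of "concepts with weights" can be illustrated with a toy sketch. To be clear, this is not TextWise's patented method or its API: the term-to-concept table and the function name toy_signature below are assumptions made up for this example, and a real system would use far richer language analysis.

```python
from collections import Counter

# Hypothetical term-to-concept table, invented for this example.
CONCEPTS = {
    "catalogue": "libraries",
    "librarian": "libraries",
    "metadata": "information organization",
    "taxonomy": "information organization",
    "rdf": "semantic web",
    "ontology": "semantic web",
}


def toy_signature(text):
    """Map a text to {concept: weight}, with the weights summing to 1."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    counts = Counter(CONCEPTS[w] for w in words if w in CONCEPTS)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


signature = toy_signature("The librarian mapped catalogue metadata to RDF.")
print(signature)
```

Two documents could then be compared by how much their weighted concepts overlap, rather than by whether they happen to share exact words.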

In many ways, that's what librarians can offer in terms of information structuring and organization. Interestingly, TextWise's technology will have a spot at the Semantic Technology Conference in San Jose on May 21, 2008. I won't be able to attend. But if you are, could you give a write-up? I would be forever in your debt.

Thursday, May 22, 2008

Dublin Core is Dead, Long Live MODS

Jeff Beall wrote an article called Dublin Core: An Obituary. In it Beall asserts that the Dublin Core Metadata Initiative is a failed experiment. Instead, MODS is the way to go. And this was back in 2004! What is MODS? The Library of Congress' Network Development and MARC Standards Office, with interested experts, is developing a schema for a bibliographic element set that may be used for a variety of purposes, and particularly for library applications. As an XML schema it is intended to be able to carry selected data from existing MARC 21 records as well as to enable the creation of original resource description records.

It includes a subset of MARC fields and uses language-based tags rather than numeric ones, in some cases regrouping elements from the MARC 21 bibliographic format. This schema is currently in draft status and is being referred to as the "Metadata Object Description Schema (MODS)". MODS is expressed using the XML schema language of the World Wide Web Consortium. The standard is maintained by the Network Development and MARC Standards Office of the Library of Congress with input from users.
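
As a rough illustration of "language-based tags rather than numeric ones," the sketch below turns a few MARC-style fields into MODS elements. The record values are placeholders, the helper names q and marc_to_mods are invented here, and only three common correspondences are covered (245$a to titleInfo/title, 100$a to name/namePart, 260$b to originInfo/publisher); it is not a substitute for the Library of Congress mapping.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Simplified placeholder record keyed by (MARC tag, subfield code).
marc_fields = {
    ("245", "a"): "An Example Title",
    ("100", "a"): "Doe, Jane",
    ("260", "b"): "Example Press",
}


def q(name):
    """Qualify an element name with the MODS namespace."""
    return f"{{{MODS_NS}}}{name}"


def marc_to_mods(fields):
    """Build a small MODS record from a dict of MARC-style fields."""
    mods = ET.Element(q("mods"))
    if ("245", "a") in fields:
        title_info = ET.SubElement(mods, q("titleInfo"))
        ET.SubElement(title_info, q("title")).text = fields[("245", "a")]
    if ("100", "a") in fields:
        name = ET.SubElement(mods, q("name"), {"type": "personal"})
        ET.SubElement(name, q("namePart")).text = fields[("100", "a")]
    if ("260", "b") in fields:
        origin = ET.SubElement(mods, q("originInfo"))
        ET.SubElement(origin, q("publisher")).text = fields[("260", "b")]
    return mods


mods_xml = ET.tostring(marc_to_mods(marc_fields), encoding="unicode")
print(mods_xml)
```

The output reads as titleInfo, name and originInfo instead of 245, 100 and 260, which is what makes MODS friendlier to people who do not know MARC by heart.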

Here's what MODS can do that the Dublin Core can't:
1. The element set is richer than Dublin Core
2. The element set is more compatible with library data than ONIX
3. The schema is more end user oriented than the full MARCXML schema
4. The element set is simpler than the full MARC format

In my article at the Semantic Report, I argue that the DCMI is potentially relevant to the SemWeb because implementations of Dublin Core not only use XML, but are also based on the Resource Description Framework (RDF) standard. The Dublin Core is an all-encompassing project maintained by an international, cross-disciplinary group of professionals from librarianship, computer science, text encoding, the museum community, and other related fields of scholarship and practice. As part of its Metadata Element Set, the Dublin Core implements metadata tags such as title, creator, subject, access rights, and bibliographic citation, using the Resource Description Framework and RDF Schema.
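
For readers who have not seen Dublin Core expressed in RDF, here is a minimal sketch that writes a three-element description as RDF/XML. The resource URI, the property values and the function name describe are placeholders invented for this example; only the two namespace URIs are the published RDF and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource_uri, properties):
    """Return RDF/XML describing one resource with simple Dublin Core elements."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    for name, value in properties.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(rdf, encoding="unicode")


# Placeholder URI and values, for illustration only.
rdf_xml = describe(
    "http://example.org/posts/dublin-core",
    {"title": "An Example Post", "creator": "Doe, Jane", "subject": "Metadata"},
)
print(rdf_xml)
```

Each dc element becomes one statement about the resource named in rdf:about, which is why a Dublin Core record can be read as a small set of RDF triples.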

So will the Dublin Core’s role in knowledge management activity representation be significant in the emergence of the SemWeb? So far, MODS hasn't done the job, even though it has claimed that it can. Is this problem similar to the situation during the ancient Chinese period of the Hundred Schools of Thought? Who will win in the end? Or which ones? Perhaps the opportunities and possibilities are greater when we stop narrowly looking for one path to absolute knowledge. So we march on . . .

Tuesday, May 20, 2008

Post-modern business in the Free World - Open Access & Librarians

I came across this interesting article from the Vancouver Sun, Post-modern business model: It's free. Videogame company Nexon has been giving away its online games for free, and making its revenue from selling digital items that gamers use for their characters. Garden says his business is as much about psychology as it is about game design. It’s no good to sell a bunch of cool designer threads to a character who is isolated in a game, because no one will see how good he looks.

Free games can have a dozen different revenue models, from Nexon’s microtransactions to advertising, product placement within a game, power and level upgrades, or downloadable songs. However, on the question of videogames (or any other digital product) being offered to consumers for free, many of Nexon's principles are based on Chris Anderson's "free" concept.

“No one says you can’t make money from free." What does this mean for libraries, especially since many of their mandates and goals are not about making money? The possibilities are there. A great number of libraries are already dipping into open access initiatives, particularly at a time when database vendors and publishers are charging arms, legs, and first-borns. With Web 2.0 technologies forming an important foundation for digital and virtual outreach opportunities, and the SemWeb on the horizon, I encourage librarians and information professionals to put on their thinking caps and think together in a collaborative environment to break down the silos of information gathering, and move towards information sharing.

Sunday, May 18, 2008

Librarian 2.0

Sometimes you just read an article, and go, I get it. A lightbulb shines brightly above you. Then you quickly turn it off to save energy, and run to the computer to blog about it. Professionalizing knowledge sharing and communications is worthy of praise.

There are a lot of articles that deal with the Library 2.0 mantra. But John Cullen goes beyond that, and proposes the idea that Library 2.0 should extend to the librarian. It should be Librarian 2.0. And what does that mean?

The key is developing a communicative orientation: one that turns the old, tired stereotype of library work as quiet, reflective and procedural into one that is primarily focused on listening, engaging and developing an understanding of the unique position of every individual.

In other words, as important as technology is to the library, we must also be alert to the changing nature of information and the profession. No longer are librarians doing the same duties repetitively and mindlessly. Web 2.0 technologies are merely the surface manifestation of L2. The opportunity is there for us to use this paradigm shift to teach other professions how to actively engage with their service consumers. All aboard!

Friday, May 16, 2008

Search Monkey and the SemWeb

We're getting closer. Yahoo is incubating a project code-named "Search Monkey," a set of open-source tools that allow users and publishers to annotate and enhance search results associated with specific web sites. Using SearchMonkey, developers and site owners can use structured data to make Yahoo! Search results more useful and visually appealing, and drive more relevant traffic to their sites.

The new enhancements differ from Yahoo's "Shortcuts" that sometimes appear at the top of search result pages. Shortcuts are served by Yahoo whenever the search engine is confident that the shortcut links are more relevant than the other web search results on the page. Often, shortcuts highlight content from Yahoo's own network of sites.

The new enhancements can be applied to any web site. Publishers can add additional information that will be displayed with the web search result. For example, retailers can include product information, restaurants can include links to menus and reviews, local merchants can display operating hours, address, and phone information, and so on—far more information than a title, URL, and description that make up current generation search results.

Here's the exciting thing. As Search Engine Land reports:
Anyone can create an app for a web site. Yahoo is collecting the most useful apps into a gallery that you as a searcher can enable for your own Yahoo search results. For example, if you like the app that was created for LinkedIn, which shows a mini-profile of a person, you can include that app so that the mini-profiles display whenever you search on a person's name.

It's true. The SearchMonkey developer tool helps you find and construct data services that you can use to build apps. Once you've built your app, you can use it yourself and share it with others. Take a look at this :)


Wednesday, May 14, 2008

From Dublin Core to the Semantic Web

I've just published a piece in the Semantic Report titled, The Semantics of the Dublin Core – Metadata for Knowledge Management. It's an experimental piece about the potential for applying principles from the Dublin Core Metadata Initiative to the SemWeb. In a previous article about half a year ago, Dean and I had proposed that the library catalogue could be used as a blueprint for the Semantic Web. That article was theoretical and conceptual: the arguments fleshed out the ideas, but not the practical applications. In this latest article, I wanted to outline in greater detail how exactly developments in library and information science are playing out, not only in the SemWeb, but for knowledge management in general.

Can the DCMI provide the infrastructure for the SemWeb? It could. Or it could not. Some have gone as far as saying that the Dublin Core is dead. But I'm not going to add more to that discourse. What I wanted to do was find apparently disparate entities: B2B, the Dublin Core, and the SemWeb, and tie them together using principles of knowledge organization in the form of the DCMI. Blasphemous? Perhaps.

My point in the article isn't to create something out of nothing. The purpose is to extend the idea that knowledge management for librarians and information science is nothing new. In 2002, two years before Tim O'Reilly's coining of the term, "Web 2.0," librarian Katherine Adams had already argued that librarians will be an essential piece of the SemWeb equation. In her seminal piece, The Semantic Web: Differentiating between Taxonomies and Ontologies, Adams argues that ontologies and taxonomies are synonymous - computer scientists refer to hierarchies of structured vocabularies as "ontology" while librarians call them "taxonomy." What the Dublin Core offers is an opportunity to bridge together different topics and extend across disciplines to navigate the complexities of the SemWeb. Fodder for discussion. But good fodder nonetheless, I hope.

Monday, May 05, 2008

Library Development Camp

I'm excited to announce the formation of Library Development Camp. Our initiative is to help fellow librarians and information professionals in Canada to explore and learn about the latest web tools and technologies from colleagues who actually use them. This web community is open to anyone working in the library or information management field in Canada.

How does this work? Most of the magic happens "offline" as we try to meet up in person to discuss these tools as well as give demos and training, hold discussions and debates, and share ideas and tips on how to effectively use these tools in a workplace or even on a personal level. It's all about sharing. We hope to spawn other LibraryDevCamp groups across Canada. If you would like to start one up in your city, let us know and we'll set up a section on our web site.

Any library/information professionals who already use any of these web tools/services are welcome to join and be a LibraryDevCamp.ca contributor or moderator. So far, we have an all-star cast of experts, such as Dean Giustini, Eugene Barsky, and Rex Turgano. We hope to have you join us, too. In the spirit of Web 2.0, our virtual meeting place is hosted by Movable Type, a weblog publishing system developed by the company Six Apart. Please stay tuned as we expect our community to grow, not only in members but also in exciting ventures.

Thursday, May 01, 2008

Economics 2.0

Although I enjoyed Economics 100 (Micro and Macroeconomics) and learned a great deal, I have to admit it wasn't the most exciting of courses at the time. The textbook we used was Gregory Mankiw's Principles of Economics. (I still have copies of the textbooks.) He has written two popular college-level textbooks: one in intermediate macroeconomics and the more famous Principles of Economics, which is popular among high-school Advanced Placement Economics teachers. More than one million copies of the books have been sold in seventeen languages.

Mankiw was also an important person in American politics, as he was appointed by President George W. Bush as Chairman of the Council of Economic Advisers in 2003. He has since resumed teaching at Harvard, taking over the introductory economics course Social Analysis 10 (which he affectionately refers to as "Ec. 10"). However, Mankiw also believes in using Web 2.0.

This is Mankiw's purpose for the blog:
I am a professor of economics at Harvard University, where I teach introductory economics (ec 10) among other courses. I use this blog to keep in touch with my current and former students. Teachers and students at other schools, as well as others interested in economic issues, are welcome to use this resource.

What's exciting about Mankiw's blog is the fact that it dips into the Web 2.0 blogosphere. The blog is much more than just a website. It's an intellectual and virtual space for him to keep in touch with colleagues and students, and to market his profession and work to the non-expert. It's fantastic outreach. Librarians everywhere should take notice.

Friday, April 25, 2008

Library 2.0

Michael Casey and Laura Savastinuk's article in the Library Journal not only changed the way libraries are perceived, but also how librarians run them. In a way, Library 2.0 principles are nothing new. Interlibrary loan is very much a "long tail" concept. In fact, would it be possible to view Library 2.0 as change management in its most extreme form? Nonetheless, it was a brilliant read when the book was published. Here's what I got out of the book about Library 2.0 concepts.

(1) Plan, Implement, and Forget - Changes must be constant and purposeful. Services need to be continually evaluated.

(2) Mission Statement - A library without a clear mission is like a boat without a captain. The mission drives the organization, serving as a guide when selecting services for users and letting you set a clear course for Library 2.0.

(3) Community Analysis - Know your users. Talk to them, have a feel for who you're serving, and who they are.

(4) Surveys & Feedback - Get both users and staff feedback. It's important to know what works and what doesn't.

(5) Team up with competitors - Don't think of the library as being in a "box." Look at what users are doing elsewhere that they could be doing through the library. Bookstores, cafes, and the Internet need not be rivals: create a win-win relationship with local businesses that benefits everyone.

(6) Real input from staff - Having feedback means implementing ideas, and not just for show. Eventually, staff will realize the hoax, and morale will suffer.

(7) Evaluating services - Sacred cows do not necessarily need to be eliminated; however, nothing should be protected from review.

(8) Three Branches of Change model - This allows all staff - from frontline workers to the director - to understand the changes made. The three teams are the investigative, planning, and review teams.

(9) Long tail - Web 2.0 concepts should be incorporated into the Library 2.0 model as much as possible. For example, the Netflix model does something few services can do: get materials into the hands of people who do not come into libraries. Think virtually as well as physically.

(10) Constant change & user participation - These two concepts form the crux of Library 2.0.

(11) Web 2.0 technologies - They give users access to a wide variety of applications that are neither installed nor approved by IT. The flexibility is there for libraries to experiment like never before. It is important to have conversations where none existed before. Online applications help fill this gap.

(12) Flattened organizational structure - Directors should not make all the decisions. Instead, front line staff input should be included. Committees that include both managers and lower level staff help 'flatten' hierarchical structure, creating a more vertical structure that leads to more realistic decision-making.

Tuesday, April 22, 2008

7 Opportunities for the Semantic Web

Dan Zambonini’s 7 (f)laws of the Semantic Web is a terrific read, and offers a refreshing perspective on the challenges of realizing the SemWeb. Too often we hear a dichotomy of arguments, but Zambonini calmly lays out what he believes are hurdles for the SemWeb. Instead of regurgitating his points, I’m going to complement them with my own comments:

(1) Not all SemWeb data are created equal - There are a lot of RDF files on the web, in various formats. But that doesn’t equate to the SemWeb. This is a bit of a strawman, though. In fact, it emphasizes the point that the components of the SemWeb are here. The challenge is finding the mechanism or application that can glue everything together.

(2) A Technology is only as good as developers think it is - Search analysis reveals that people are actually more interested in AJAX than RDF Schema, despite the fact that RDF has a longer history. Zambonini believes that this is because the SemWeb is so incredibly exclusive in an ivory-towerish way. I agree. However, what is to say that the SemWeb won’t be able to accommodate a broader audience in the future? We’ll just need to wait and see.

(3) Complex systems must be built from successively simpler systems - I agree with this point. Google is successful in the search engine wars because it learnt how to build up slowly, and created a simple system that got more complex as it needed to. People love Web 2.0 applications because they’re easy to use and understand. But whereas Web 2.0 was about searching, the SemWeb should be about finding. Nobody said C++ and Java were easy, but complexity pays off in the long run.

(4) A new solution should stop an obvious pain - The SemWeb needs to prove what problems it can solve, and prove its purpose. Right now, Web 2.0 and 1.0 do a good job, so why would we need any more? Fair enough. But information is still in silos. Until we open up the data web, we’re still in many ways living in the dark.

(5) People aren’t perfect - Creating metadata and classifications is difficult. People are sloppy. Will adding SemWeb rules add to the mess that is the Web? I seriously can’t answer this one. We can only predict. But perhaps it’s too cynical to prematurely write off people’s metadata creating skills. HTML wasn’t easy, but we managed.

(6) You don’t need an ontology of everything. But it would help - Zambonini argues for a top-down ontology which would be a one-size-fits-all solution for the entire Web rather than building from a bottom-up approach based on folksonomies of the social web. I would argue that for this to work, we need to look at it from different angles. Perhaps we can meet half way?

(7) Philanthropy isn’t commercially viable - Why would any sane organization buy into the SemWeb and expose their data? We need that killer application in order for this to work. Agreed. eBay did wonders. Let’s hope there’s a follow-up on the way.

Saturday, April 19, 2008

Four Ways to Library 2.0

Library 2.0 has stirred controversy since the day Michael Casey and Laura Savastinuk’s Library 2.0: Service for the next-generation library hit online newsstands. A loosely defined model for a modernized form of library service that reflects a transition within the library world in the way that services are delivered to users, the concept of Library 2.0 borrows from that of Business 2.0 and Web 2.0 and follows some of the same underlying philosophies. Its relevance to the profession is still being debated in the library community. (Haven’t we always had to serve our users in the first place? What’s new about that?)

Michael Stephens and Maria Collins’ Web 2.0, Library 2.0, and the Hyperlinked Library is a fascinating read for those interested in learning more about these concepts. Certainly, at the core of Library 2.0 are blogs, RSS, podcasting, wikis, IM, and social networking sites. But it’s much more than that, and Stephens and Collins boil it down nicely to four main themes of Library 2.0:

(1) Conversations – The library shares plans and procedures for feedback and then responds. Transparency is real and personal.

(2) Community and Participation – Users are involved in planning library services, evaluating those services, and suggesting improvements.

(3) Experience – Satisfying to the user, Library 2.0 is about learning, discovery, and entertainment. Bans on technology and the stereotypical “shushing” are replaced by a collaborative and flexible space for new initiatives and creativity.

(4) Sharing – Providing ways for users to share as much or as little of themselves as they like, users are encouraged to participate via online communities and connect virtually with the library.