Thursday, May 22, 2008

Dublin Core is Dead, Long Live MODS

Jeff Beall wrote an article called Dublin Core: An Obituary. In it Beall asserts that the Dublin Core Metadata Initiative is a failed experiment. Instead, MODS is the way to go. And this was back in 2004! What is MODS? The Library of Congress' Network Development and MARC Standards Office, with interested experts, is developing a schema for a bibliographic element set that may be used for a variety of purposes, and particularly for library applications. As an XML schema it is intended to be able to carry selected data from existing MARC 21 records as well as to enable the creation of original resource description records.

It includes a subset of MARC fields and uses language-based tags rather than numeric ones, in some cases regrouping elements from the MARC 21 bibliographic format. This schema is currently in draft status and is being referred to as the "Metadata Object Description Schema (MODS)". MODS is expressed using the XML schema language of the World Wide Web Consortium. The standard is maintained by the Network Development and MARC Standards Office of the Library of Congress with input from users.
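
To make the "language-based tags rather than numeric ones" idea a little more concrete, here is a minimal Python sketch. It covers only a handful of MARC 21 fields and the MODS elements they are usually mapped to. The function name `marc_to_mods_names` and the sample record are hypothetical, made up for illustration; this is not the Library of Congress's own conversion.

```python
# Illustrative subset only: a few MARC 21 field tags and the MODS
# top-level elements they are commonly mapped to.
MARC_TO_MODS = {
    "100": "name",        # main entry, personal name
    "245": "titleInfo",   # title statement
    "260": "originInfo",  # publication, distribution
    "650": "subject",     # topical subject heading
}


def marc_to_mods_names(record):
    """Relabel a {MARC tag: value} dict with MODS element names.

    Tags outside the small mapping above are skipped, echoing the
    idea that MODS carries a subset of MARC fields.
    """
    return {
        MARC_TO_MODS[tag]: value
        for tag, value in record.items()
        if tag in MARC_TO_MODS
    }


# Hypothetical sample record, not taken from a real catalogue.
sample = {
    "100": "Doe, Jane",
    "245": "An Example Title",
    "999": "local field with no mapping here",
}
print(marc_to_mods_names(sample))
```

A real MODS record nests further elements inside these (a title inside titleInfo, for instance), so treat this as a relabelling exercise rather than a converter.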

Here's what MODS can do that the Dublin Core can't:
1. The element set is richer than Dublin Core
2. The element set is more compatible with library data than ONIX
3. The schema is more end user oriented than the full MARCXML schema
4. The element set is simpler than the full MARC format

In my article at the Semantic Report, I argue that the DCMI is potentially relevant to the SemWeb because implementations of Dublin Core use not only XML, but are based on the Resource Description Framework (RDF) standard. The Dublin Core is an all-encompassing project maintained by an international, cross-disciplinary group of professionals from librarianship, computer science, text encoding, the museum community, and other related fields of scholarship and practice. As part of its Metadata Element Set, the Dublin Core implements metadata tags such as title, creator, subject, access rights, and bibliographic citation, using the resource description framework and RDF Schema.
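
For readers who want to see what Dublin Core elements inside RDF look like, here is a rough Python sketch using only the standard library. The two namespace URIs are the published ones for RDF and the Dublin Core element set; the function name `dublin_core_rdf`, the example.org address, and the field values are my own placeholders, and a real implementation would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use the familiar prefixes when serializing.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_rdf(about, fields):
    """Build a small RDF/XML description of one resource.

    `about` is the resource's URI; `fields` maps Dublin Core element
    names (title, creator, subject, ...) to literal values.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for name, value in fields.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder resource and values, for illustration only.
xml_text = dublin_core_rdf(
    "http://example.org/posts/1",
    {"title": "Example Post", "creator": "A. Librarian", "subject": "Metadata"},
)
print(xml_text)
```

The point is simply that each Dublin Core tag becomes a statement about a resource identified by a URI, which is what makes it usable in an RDF setting.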

So will the Dublin Core’s role in representing knowledge management activity be significant in the emergence of the SemWeb? So far, MODS hasn't done the job, even though it has been claimed that it can. Is this situation similar to the Hundred Schools of Thought period in ancient China? Who will win in the end? Or which ones? Perhaps there is more to gain from exploring many opportunities and possibilities than from narrowly looking for a single path to absolute knowledge. So we march on . . .

Tuesday, May 20, 2008

Post-modern business in the Free World - Open Access & Librarians

I came across this interesting article from the Vancouver Sun, Post-modern business model: It's free. Videogame company Nexon has been giving away its online games for free, and making its revenue from selling digital items that gamers use for their characters. Garden says his business is as much about psychology as it is about game design. It’s no good to sell a bunch of cool designer threads to a character who is isolated in a game, because no one will see how good he looks.
Free games can have a dozen different revenue models, from Nexon’s microtransactions to advertising, product placement within a game, power and level upgrades, or downloadable songs. On the question of offering videogames (or any other digital product) to consumers for free, many of Nexon's principles are based on Chris Anderson's "free" concept.

“No one says you can’t make money from free.” What does this mean for libraries? Especially since the mandates and goals of most libraries are not to make money? The possibilities are there. A great number of libraries are already dipping into open access initiatives, particularly at a time when database vendors and publishers are charging arms, legs, and first-borns. With Web 2.0 technologies forming an important foundation for digital and virtual outreach opportunities, and the SemWeb on the horizon, I encourage librarians and information professionals to put on their thinking caps and think together in a collaborative environment to break down the silos of information gathering, and move towards information sharing.

Sunday, May 18, 2008

Librarian 2.0

Sometimes you just read an article, and go, I get it. A lightbulb shines brightly above you. Then you quickly turn it off to save energy, and run to the computer to blog about it. Professionalizing knowledge sharing and communications is worthy of praise.

There’re a lot of articles that deal with the Library 2.0 mantra. But John Cullen goes beyond that, and proposes the idea that Library 2.0 should extend to the librarian. It should be Librarian 2.0. And what does that mean?

The key is developing communicative orientation: one that turns the old, tiring stereotype of library work being quiet, reflective and procedural, to one that is primarily focused on listening, engaging and developing understanding of the unique position of every individual.

In other words, just as much as technology is important to the library, we must also be alert to the changing nature of information and the profession. No longer are librarians doing the same duties repetitively and mindlessly. Web 2.0 technologies are merely the surface manifestation of L2. The opportunity is there for us to use this paradigm shift to teach other professions how to actively engage with their service consumers. All aboard!

Friday, May 16, 2008

Search Monkey and the SemWeb

We're getting closer. Yahoo is incubating a project code-named "Search Monkey," a set of open-source tools that allow users and publishers to annotate and enhance search results associated with specific web sites. Using SearchMonkey, developers and site owners can use structured data to make Yahoo! Search results more useful and visually appealing, and drive more relevant traffic to their sites.

The new enhancements differ from Yahoo's "Shortcuts" that sometimes appear at the top of search result pages. Shortcuts are served by Yahoo whenever the search engine is confident that the shortcut links are more relevant than the other web search results on the page. Often, shortcuts highlight content from Yahoo's own network of sites.

The new enhancements can be applied to any web site. Publishers can add additional information that will be displayed with the web search result. For example, retailers can include product information, restaurants can include links to menus and reviews, local merchants can display operating hours, address, and phone information, and so on—far more information than a title, URL, and description that make up current generation search results.

Here's the exciting thing. As Search Engine Land reports:
Anyone can create an app for a web site. Yahoo is collecting the most useful apps into a gallery that you as a searcher can enable for your own Yahoo search results. For example, if you like the app that was created for LinkedIn, which shows a mini-profile of a person, you can include that app so that the mini-profiles display whenever you search on a person's name.

It's true. The SearchMonkey developer tool helps you find and construct data services that you can use to build apps. Once you've built your app, you can use it yourself and share it with others. Take a look at this :)


Wednesday, May 14, 2008

From Dublin Core to the Semantic Web

I've just published a piece in the Semantic Report titled, The Semantics of the Dublin Core – Metadata for Knowledge Management. It's an experimental piece about the potential for applying principles from the Dublin Core Metadata Initiative to the SemWeb. In a previous article about half a year ago, Dean and I had proposed that the library catalogue could be used as a blueprint for the Semantic Web. Perhaps theoretical and conceptual, the arguments fleshed out the ideas, but not the practical applications. In this latest article, I wanted to outline in greater detail how exactly developments in library and information science are playing out, not only in the SemWeb, but for knowledge management in general.

Can the DCMI provide the infrastructure for the SemWeb? It could. Or it could not. Some have gone as far as saying that the Dublin Core is dead. But I'm not going to add more to that discourse. What I wanted to do was find apparently disparate entities: B2B, the Dublin Core, and the SemWeb, and tie them together using principles of knowledge organization in the form of the DCMI. Blasphemous? Perhaps.

My point in the article isn't to create something out of nothing. The purpose is to extend the idea that knowledge management for librarians and information science is nothing new. In 2002, two years before Tim O'Reilly's coining of the term, "Web 2.0," librarian Katherine Adams had already argued that librarians would be an essential piece of the SemWeb equation. In her seminal piece, The Semantic Web: Differentiating between Taxonomies and Ontologies, Adams argues that ontologies and taxonomies are synonymous - computer scientists refer to hierarchies of structured vocabularies as "ontology" while librarians call them "taxonomy." What the Dublin Core offers is an opportunity to bridge together different topics and extend across disciplines to navigate the complexities of the SemWeb. Fodder for discussion. But good fodder nonetheless, I hope.

Monday, May 05, 2008

Library Development Camp

I'm excited to announce the formation of Library Development Camp. Our initiative is to help fellow librarians and information professionals in Canada to explore and learn about the latest web tools and technologies from colleagues who actually use them. This web community is open to anyone working in the library or information management field in Canada.

How does this work? Most of the magic happens "offline" as we try to meet up in person to discuss these tools as well as give demos, training, hold discussions and debates, and share ideas and tips on how to effectively use these tools in a workplace or even on a personal level. It's all about sharing. We hope to spawn other LibraryDevCamp groups across Canada. If you would like to start one up in your city, let us know and we'll set up a section on our web site.

Any library/information professionals who already use any of these web tools/services are welcome to join and be a LibraryDevCamp.ca contributor or moderator. So far, we have an all-star cast of experts, such as Dean Giustini, Eugene Barsky, and Rex Turgano. We hope to have you join us, too. In the spirit of Web 2.0, our virtual meeting place is hosted on Movable Type, a weblog publishing system developed by the company Six Apart. Please stay tuned as we expect our community to grow, not only in members but also in exciting ventures.

Thursday, May 01, 2008

Economics 2.0

Although I enjoyed Economics 100 (Micro and Macroeconomics) and learned a great deal, I have to admit it wasn't the most exciting course at the time. The textbook we used was Gregory Mankiw's Principles of Economics. (I still have copies of the textbooks.) He has written two popular college-level textbooks: one in intermediate macroeconomics and the more famous Principles of Economics, which is popular among high-school Advanced Placement Economics teachers. More than one million copies of the books have been sold in seventeen languages.

Mankiw was also an important person in American politics, as he was appointed by President George W. Bush as Chairman of the Council of Economic Advisers in 2003. He has since resumed teaching at Harvard, taking over the introductory economics course Social Analysis 10 (which he affectionately refers to as "Ec. 10"). Mankiw also believes in using Web 2.0.

This is Mankiw's purpose for the blog:
I am a professor of economics at Harvard University, where I teach introductory economics (ec 10) among other courses. I use this blog to keep in touch with my current and former students. Teachers and students at other schools, as well as others interested in economic issues, are welcome to use this resource.

What's exciting about Mankiw's blog is the fact that it dips into the Web 2.0 blogosphere. The blog is much more than just a website. It's an intellectual and virtual space for him to keep in touch with colleagues and students, and to market his profession and work to the non-expert. It's fantastic outreach. Librarians everywhere should take notice.

Friday, April 25, 2008

Library 2.0

Michael Casey and Laura Savastinuk's article in the Library Journal not only changed the way libraries are perceived, but also how librarians run them. In a way, Library 2.0 principles are nothing new. Interlibrary loan is very much a "long tail" concept. In fact, would it be possible to view Library 2.0 as change management in its most extreme form? Nonetheless, it was a brilliant read when the book was published. Here's what I got out of the book about Library 2.0 concepts.

(1) Plan, Implement, and Forget - Changes must be constant and purposeful. Services need to be continually evaluated.

(2) Mission Statement - A library without a clear mission is like a boat without a captain. A mission statement drives the organization, serving as a guide when selecting services for users and letting you set a clear course for Library 2.0.

(3) Community Analysis - Know your users. Talk to them, have a feel for who you're serving, and who they are.

(4) Surveys & Feedback - Get both users and staff feedback. It's important to know what works and what doesn't.

(5) Team up with competitors - Don't think of the library as being in a "box." Look at what users are doing elsewhere that they could be doing through the library. Bookstores, cafes, and the Internet shouldn't be put in a box either. Create a win-win relationship with local businesses that benefits everyone.

(6) Real input from staff - Asking for feedback means actually implementing ideas, not collecting them just for show. Otherwise, staff will eventually realize the hoax, and morale will suffer.

(7) Evaluating services - Sacred cows do not necessarily need to be eliminated; however, nothing should be protected from review.

(8) Three Branches of Change model - This allows all staff - from frontline workers to the director - to understand the changes made. The three teams are: investigative, planning, and review team.

(9) Long tail - Web 2.0 concepts should be incorporated into the Library 2.0 model as much as possible. For example, the Netflix model does something few services can do: get materials into the hands of people who do not come into libraries. Think virtually as well as physically.

(10) Constant change & user participation - These two concepts form the crux of Library 2.0.

(11) Web 2.0 technologies - They give users access to a wide variety of applications that are neither installed nor approved by IT. The flexibility is there for libraries to experiment unlike ever before. It is important to have conversation where none exists before. Online applications help fill this gap.

(12) Flattened organizational structure - Directors should not make all the decisions. Instead, front-line staff input should be included. Committees that include both managers and lower-level staff help 'flatten' the hierarchy, creating a flatter structure that leads to more realistic decision-making.

Tuesday, April 22, 2008

7 Opportunities for the Semantic Web

Dan Zambonini’s 7 f(laws) of the Semantic Web is a terrific read, and perhaps offers a refreshing perspective on the challenges of realizing the SemWeb. Too often we hear a dichotomy of arguments, but Zambonini calmly lays out what he believes are hurdles for the SemWeb. Instead of regurgitating his points, I’m going to complement them with my own comments:

(1) Not all SemWeb data are created equal - There are a lot of RDF files on the web, in various formats. But that doesn’t equate to the SemWeb. This is a bit of a strawman. In fact, it emphasizes the point that the components of the SemWeb are here. The challenge is finding the mechanism or application that can glue everything together.

(2) A Technology is only as good as developers think it is - Search analysis reveals that people are actually more interested in AJAX than RDF Schema, despite the fact that RDF has a longer history. Zambonini believes that this is because the SemWeb is so incredibly exclusive in an ivory-towerish way. I agree. However, what is to say that the SemWeb won’t be able to accommodate a broader audience in the future? We’ll just need to wait and see.

(3) Complex systems must be built from successively simpler systems - I agree with this point. Google is successful in the search engine wars because it learnt how to build up slowly, and created a simple system that got more complex as it needed to. People love Web 2.0 tools because they’re easy to use and understand. But whereas Web 2.0 was about searching, the SemWeb should be about finding. Nobody said C++ and Java were easy, but complexity pays off in the long run.

(4) A new solution should stop an obvious pain - The SemWeb needs to prove what problems it can solve, and prove its purpose. Right now, Web 2.0 and 1.0 do a good job, so why would we need any more? Fair enough. But information is still in silos. Until we open up the data web, we’re still in many ways living in the dark.

(5) People aren’t perfect - Creating metadata and classifications is difficult. People are sloppy. Will adding SemWeb rules add to the mess that is the Web? I seriously can’t answer this one. We can only predict. But perhaps it’s too cynical to prematurely write off people’s metadata creating skills. HTML wasn’t easy, but we managed.

(6) You don’t need an ontology of everything. But it would help - Zambonini argues for a top-down ontology which would be a one-size-fits-all solution for the entire Web rather than building from a bottom-up approach based on folksonomies of the social web. I would argue that for this to work, we need to look at it from different angles. Perhaps we can meet half way?

(7) Philanthropy isn’t commercially viable - Why would any sane organization buy into the SemWeb and expose their data? We need that killer application in order for this to work. Agree. Ebay did wonders. Let’s hope there’s a follow-up on the way.

Saturday, April 19, 2008

Four Ways to Library 2.0

Library 2.0 has stirred controversy since the day Michael Casey and Laura Savastinuk’s Library 2.0: Service for the next-generation library hit online newsstands. A loosely defined model for a modernized form of library service that reflects a transition within the library world in the way that services are delivered to users, the concept of Library 2.0 borrows from that of Business 2.0 and Web 2.0 and follows some of the same underlying philosophies. Its relevance to the profession is still being debated in the library community. (Haven’t we always had to serve our users in the first place? What’s new about that?)

Michael Stephens and Maria Collins’ Web 2.0, Library 2.0, and the Hyperlinked Library is a fascinating read for those interested in learning more about these concepts. Certainly, at the core of Library 2.0 are blogs, RSS, podcasting, wikis, IM, and social networking sites. But it’s much more than that, and Stephens and Collins boil it down nicely to four main themes of Library 2.0:

(1) Conversations – The library shares plans and procedures for feedback and then responds. Transparency is real and personal.

(2) Community and Participation – Users are involved in planning library services, evaluating those services, and suggesting improvements.

(3) Experience – Satisfying to the user, Library 2.0 is about learning, discovery, and entertainment. Bans on technology and the stereotypical “shushing” are replaced by a collaborative and flexible space for new initiatives and creativity.

(4) Sharing – Users are given ways to share as much or as little of themselves as they like, and are encouraged to participate via online communities and connect virtually with the library.

Thursday, April 17, 2008

The Year Is 2009...

We're not that far. . . In 2002, Paul Ford wrote an amazing piece predicting what the world would look like in 2009. Well, we're almost there. Ford thought about a "Semantic Web scenario," one which had a short feature from a business magazine published in 2009. While Amazon and Ebay both worked as virtual marketplaces (they outsourced as much inventory as possible) by bringing together buyers and sellers while taking a cut of every transaction, Google focused on the emerging Semantic Web.

This is how Ford explains the SemWeb, in one of the most concise explanations I've seen to date.

So what's the Semantic Web? At its heart, it's just a way to describe things in a way that a computer can “understand.” Of course, what's going on is not understanding, but logic, like you learn in high school:

If A is a friend of B, then B is a friend of A.

Jim has a friend named Paul.

Therefore, Paul has a friend named Jim.


Of course, it's much more than just A's and B's. But the idea that Google will eventually integrate the SemWeb into its applications is exciting. And for an article that was written back in 2002 with such clarity, it's a highly engaging read.
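
For the curious, Ford's friendship rule can be sketched in a few lines of Python. This is only my own toy illustration of the logic, not how Google or any SemWeb tool does it; the function name `symmetric_closure` and the tuple layout are assumptions made for the example.

```python
def symmetric_closure(facts):
    """Apply the rule "if A is a friend of B, then B is a friend of A".

    `facts` is a set of (subject, relation, object) tuples. The result
    contains the original facts plus every inferred reverse friendship.
    """
    inferred = set(facts)
    for subject, relation, obj in facts:
        if relation == "friend":
            inferred.add((obj, relation, subject))
    return inferred


# The one stated fact: Jim has a friend named Paul.
facts = {("Jim", "friend", "Paul")}
inferred = symmetric_closure(facts)

# The computer now "knows" that Paul has a friend named Jim.
print(("Paul", "friend", "Jim") in inferred)  # True
```

No understanding is involved: the program just applies one rule to one stated fact, which is exactly Ford's point.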

Saturday, April 12, 2008

Google and Web 3.0?

Maybe Google gets it after all. Google has made its foray into the SemWeb with its new Social Graph API. What's that? And why should you care? The Social Graph API makes information about the public connections between people on the Web, expressed by XFN and FOAF markup and other publicly declared connections, easily available and useful for developers. The public web is made up of linked pages that represent both documents and people. Google Search helps make this information more accessible and useful.

In other words, if you take away the documents, you're left with the connections between people. Information about the public connections between people is really useful. A user might want to see who else you're connected to, and as a developer of social applications, you can provide better features for your users if you know who their public friends are. There hasn't been a good way to access this information.

The Social Graph API looks for two types of publicly declared connections:

  1. It looks for all public URLs that belong to you and are interconnected. This could be a blog, Facebook, and a Twitter account.
  2. It looks for publicly declared connections between people. For example, your blog may link to someone else's blog while your Facebook and Twitter are linked to each other.

This index of connections enables developers to build many applications including the ability to help users connect to their public friends more easily. Google is taking the resulting data and making it available to third parties, who can build this into their applications (including their Google Open Social applications). Of course, the problem is that few people use FOAF and XFN to declare their relationships, but Google's new API could make them more visible and social applications could use them. Ultimately, Google could also index the relationships from social networks if people are comfortable with that.
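
To give a feel for what "publicly declared connections" look like in XFN, here is a small Python sketch that reads the rel attribute on ordinary links. It is not the Social Graph API itself, just an illustration of the markup it looks for: rel="me" ties together pages belonging to the same person, while values such as rel="friend" point at other people. The class and function names, the sample HTML, and the example addresses are all made up, and the set of relationship values is only a subset of what XFN defines.

```python
from html.parser import HTMLParser

# A few of the relationship values defined by XFN; the full list is longer.
XFN_PEOPLE_VALUES = {"friend", "acquaintance", "contact", "colleague", "co-worker"}


class RelLinkParser(HTMLParser):
    """Collect (href, set of rel values) for every <a> tag that has both."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        rel = attr_map.get("rel")
        if href and rel:
            self.links.append((href, set(rel.lower().split())))


def split_connections(html):
    """Return (pages declared as the author's own, pages of other people)."""
    parser = RelLinkParser()
    parser.feed(html)
    own_pages = [href for href, rels in parser.links if "me" in rels]
    other_people = [href for href, rels in parser.links if rels & XFN_PEOPLE_VALUES]
    return own_pages, other_people


# Made-up blog sidebar for illustration.
sample_html = """
<a rel="me" href="http://twitter.example/alice">My Twitter</a>
<a rel="friend met" href="http://bob.example/">Bob</a>
<a href="http://news.example/">No relationship declared</a>
"""
own_pages, other_people = split_connections(sample_html)
print(own_pages)
print(other_people)
```

The third link carries no rel value, so it contributes nothing, which is the practical problem mentioned above: the connections only exist if people bother to declare them.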

What does this mean for information professionals? Stay tuned. Having Google on board the SemWeb train (or ship) could pave the way for more bricks to be laid on the road to realizing the goal of differentiating Paris from Paris.

Wednesday, April 09, 2008

7 Things You Need to Know about the Semantic Web

Over at Read/Write Web, Alex Iskold has come up with what I consider a seminal piece in the Semantic Web literature. In Semantic Web Patterns: A Guide to Semantic Technologies, Iskold synthesizes the main concepts of the Semantic Web, asserting that it offers improved information discoverability, automation of complex searches, and innovative web browsing. Here’re the main themes:

(1) Bottom-Up vs. Top-Down – Do we focus on annotating information in pages (using RDF) so that it is machine-readable in top-down fashion? Or do we focus on leveraging information in existing web pages so that their meaning can be derived automatically (folksonomies) in a bottom-up approach? Time will tell.

(2) Annotation Technologies – RDF, Microformats, and Meta Headers. The more annotations there are in web pages, the more standards are implemented, and the more discoverable and powerful information becomes.

(3) Consumer and Enterprise – People currently don’t care much for the Semantic Web because all they look for is utility and usefulness. Until an application can be deemed a “killer application,” we continue to wait.

(4) Semantic APIs – Unlike Web 2.0 APIs, which are used to mash up existing services, Semantic APIs take unstructured information as input and find the entities and relationships in it. Think of them as mini natural language processing tools. Take a look.

(5) Search Technologies – The sobering fact is a growing realization that understanding semantics won’t be sufficient to build a better search engine. Google does a fairly good job at finding us the capital city of Canada, so why do we need to go any further?

(6) Contextual Technologies - Contextual navigation does not improve search, but rather shortcuts it. It takes more guessing out of the equation. That's where the SemWeb will overtake Google.

(7) Semantic Databases – The challenge of keeping up with the world is common to all database approaches, which are effectively information silos. That’s where semantic databases come in, as they focus on annotating web information so that it is more structured. Take a look at Freebase.

As librarians and information professionals, we gather, organize, and disseminate. The challenge will be to do this as information is exploding at an unprecedented rate in human history, all the while trying to stay afloat and explaining to our users the technology. Feels like walking on water, don’t you agree?

Tuesday, April 08, 2008

Semantic Librarianship

If I had my stocks for Web 3.0, where would I put them?

How about a neat web service called Freebase. It’s a semanticized version of Wikipedia. But with a bigger potential. Much bigger. Freebase is said to be an open shared database of the world's knowledge, and a massive, collaboratively-edited database of cross-linked data. Until recently accessible by invitation only, this application is now open to the public as a semi-trial service.

What does this have to do with librarians? As Freebase argues, “Wikipedia and Freebase both appeal to people who love to use and organize information.” Hold that thought. That’s enough to whet our information organizational appetites.

In our article, Dean and I argued that the essence of the Semantic Web is the ability to differentiate entities in a way that the current Web cannot. For example, how can we currently parse Paris from Paris? Although still in its initial stages with improvements to come, Freebase does a nice job to a certain extent. Freebase covers millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC, it contains structured information on many popular topics, like movies, music, people and locations, all reconciled and freely available via an open API.

As a result, Freebase builds on the Social Web 2.0 layer, while providing the Semantic Web infrastructure through RDF technology. For example, Paris Hilton would appear in a movie database as an actress, a music database as a singer and a model database as a model. In Freebase, there is only one topic for Paris Hilton, with all three facets of her public persona brought together. The unified topic acts as an information hub, making it easy to find and contribute information about her.
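
Here is a rough Python sketch of the "one topic as an information hub" idea. It is a toy, not Freebase's actual data model or API: the record layout, the topic identifiers, and the function name `merge_by_topic` are all invented for illustration, and it assumes the hard part (deciding which records refer to the same entity) has already been done by giving them a shared identifier.

```python
from collections import defaultdict


def merge_by_topic(records):
    """Group records from separate databases under one topic each.

    Every record is a dict with a shared `topic_id`, the `source`
    database it came from, and the `role` recorded there.
    """
    topics = defaultdict(lambda: {"roles": set(), "sources": set()})
    for record in records:
        topic = topics[record["topic_id"]]
        topic["roles"].add(record["role"])
        topic["sources"].add(record["source"])
    return dict(topics)


# Invented records: three databases mention the person, one the city.
records = [
    {"topic_id": "paris_hilton", "source": "movies", "role": "actress"},
    {"topic_id": "paris_hilton", "source": "music", "role": "singer"},
    {"topic_id": "paris_hilton", "source": "models", "role": "model"},
    {"topic_id": "paris_france", "source": "places", "role": "city"},
]

topics = merge_by_topic(records)
print(sorted(topics["paris_hilton"]["roles"]))
print(sorted(topics))
```

The three facets of the person end up under one topic, and the city stays a separate topic, which is the Paris-from-Paris distinction in miniature.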

While information in Freebase appears to be structured much like a conventional database, it’s actually built on a system that allows any user to contribute to the schemas—or frameworks—that hold the data - RDF, as I had mentioned. This wiki-like approach to structuring information lets many people organize the database without formal, centralized planning. And it lets subject experts who don’t have database expertise find one another, and then build and maintain the data in their domain of interest. As librarians, we have a place in all of this. It's out there. Waiting for us.

Wednesday, April 02, 2008

Moving Out & Moving On

Everyone needs a change every now and again. On May 1st, 2008, I will be moving to the Irving K. Barber Learning Centre as Program Services Librarian. Having worked with some very talented and supportive colleagues, I feel supremely fortunate because without them, I would not be where I am at this point in my career.

Over the past few years, I have enjoyed working in a variety of jobs, from public libraries, to hospital libraries, to research centres, to academic libraries. (I also dabbled in publishing, archival, as well as teaching ventures). The integration of these experiences has been wonderful as it has helped build skills most essential in my upcoming endeavours.

What will this new position entail? To a certain extent, everything that I'm not doing now as an academic librarian. The Irving K. Barber Learning Centre itself is not a "traditional" library. It's a new building, a space for collaborative learning and ideas. A learning commons. A new way of learning. It also represents a new direction for librarianship. If there is one thing that typifies this position, it would be digital outreach. Web 2.0, Semantic Web, and Web 3.0? Stay tuned.

The possibilities are exciting.

I'd like to thank everyone who helped me along the way, particularly Dean Giustini, Eugene Barsky, Eleanor Yuen, Tricia Yu, May Yan, Henry Yu, Hayne Wai, Chris Lee, Rob Ho, Peter James & friends at HSSD, Rex Turgano, Rob Stibravy, Susie Stephenson, Matthew Queree, and Angelina Dawes, among the many. And of course, Hoyu. Thank you to all.