Tuesday, February 04, 2020

Text Analysis: A Hermeneutical Exercise



I'll be teaching a short intro workshop on text analysis using Voyant, an open-source, web-based application.   Geoffrey Rockwell (Professor of Philosophy and Humanities Computing at the University of Alberta, Canada) and Stéfan Sinclair (Associate Professor of Digital Humanities at McGill University) developed the application to support scholarly reading and interpretation of texts or corpora, particularly by scholars in the digital humanities.   I've been reading their book, Hermeneutica: Computer-Assisted Interpretation in the Humanities, to brush up before I teach the session, Using Voyant and the NLTK for Text Analysis.

 This video is part of the #dariahTeach platform (http://teach.dariah.eu), an open-source, community-driven platform for teaching and training materials for the digital arts and humanities.  As part of the course Introduction to Digital Humanities and the series Digital Humanities in Practice, this video discusses text visualization in Digital Humanities, emphasising that visualisation is not the end product but an intellectual process of thinking and interpreting text.

In their book Hermeneutica, Rockwell and Sinclair suggest:
"In the slippage between our literary notion of a text and the computer's literal processing lie the disappointment and the possibility of text analysis.  Computers cannot understand a text for us.  They can, however, do things that may surprise us."  

Wednesday, October 30, 2019

Using Palladio and Gephi as Data Visualization Tools

Much has been published about data visualization tools.  Miriam Posner has written in this area, and I often use her work as a reference.   Some have even commented on the differences between Gephi and Palladio.

Over the last year, I've been using Palladio to examine datasets from the Chinese headtax project. Palladio makes it easy to create bivariate network graphs that illustrate relationships between two dimensions. By default, Palladio creates a force-directed layout, which is different from Gephi.   Palladio, however, is limited to this one layout. The platform has no way of doing computational or algorithmic analysis of your graphs; you will need a more powerful program like Gephi to do that work.  The most powerful methods for creating networks come from programming languages such as R, Python, and JavaScript. These languages allow you to control various algorithmic and aesthetic aspects of network visualizations.  Any dimension of the data can be used as the source and target of a graph.
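
To make the "source and target" idea concrete, here is a minimal sketch in Python of how two columns of a table can be turned into a weighted edge list. The sample rows and the column names (`county`, `port`) are made up for illustration and are not the register's actual headers; `build_edges` and `weighted_degree` are hypothetical helper names, not part of Palladio or Gephi.

```python
from collections import Counter

# Hypothetical sample rows; the column names are illustrative only.
records = [
    {"county": "Taishan", "port": "Victoria"},
    {"county": "Taishan", "port": "Vancouver"},
    {"county": "Xinhui", "port": "Victoria"},
    {"county": "Taishan", "port": "Victoria"},
]


def build_edges(rows, source, target):
    """Count how often each (source, target) pair appears in the rows."""
    return Counter(
        (row[source], row[target])
        for row in rows
        if row.get(source) and row.get(target)  # skip rows missing either value
    )


def weighted_degree(edges):
    """Sum the edge weights attached to every node."""
    totals = Counter()
    for (src, dst), weight in edges.items():
        totals[src] += weight
        totals[dst] += weight
    return totals


edges = build_edges(records, "county", "port")
node_degree = weighted_degree(edges)

for (src, dst), weight in sorted(edges.items()):
    print(f"{src} -> {dst}: {weight}")
```

Swapping `"county"` and `"port"` for any other pair of columns gives a different bivariate graph from the same rows, which is the flexibility described above. The resulting pairs could then be exported for a tool such as Gephi.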

Regardless, I still find that knowing a bit of each of the data visualization tools is helpful for any researcher, in any phase of their research process and lifecycle.   The following video tutorials are what help me stay informed, not only about how to use the tools but also about weighing the strengths and weaknesses of a particular approach to playing around with the data.  I'd be interested in hearing how you approach your data.  How do you learn the tools of your trade and then decide which would be the best for your own analyses? 





Thursday, October 10, 2019

Was Shakespeare Really Shakespeare? "Shakespeare has now fully entered the era of Big Data."

Is Shakespeare really Shakespeare?  This is a question I pose whenever I'm asked what digital humanities is.  In Shakespeare Beyond Doubt: Evidence, Argument, Controversy, two chapters are devoted to the application of stylometry to Shakespeare's works and go into much detail: "Authorship and the evidence of stylometrics" by MacDonald Jackson and "What does textual evidence reveal about the author?" by James Mardock and Eric Rasmussen.   An interesting aspect of these studies is that computer models using different algorithms come to similar conclusions as scholars from the "analog" era.

In 2016, The New Oxford Shakespeare made ripples in the literary world when it credited Christopher Marlowe as a co-author of Shakespeare’s “Henry VI,” Parts 1, 2, and 3.  Now, I, along with many others in literary studies, have long been told that there's an inevitable Marlowe-Shakespeare connection, but it is only more recently that scholars using distant reading techniques have applied computer-aided analysis of linguistic patterns across databases to further this argument. As Gary Taylor proposes, "Shakespeare has now fully entered the era of Big Data."   Daniel Pollack-Pelzner points out that writing a play in the sixteenth century was a bit like writing a screenplay today, with many hands revising a company’s product.   The difference is that scholars from the New Oxford Shakespeare work from the hypothesis, held since the Victorian era, that individual hands can be told apart, and they believe that algorithms can truly tease them out. 
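
For readers curious what "analysis of linguistic patterns" can look like in practice, here is a toy sketch in Python. It is not the New Oxford Shakespeare team's method: it simply compares relative frequencies of a few common function words using Manhattan distance, which is one of the simplest stylometric ideas. The word list is deliberately tiny, and `profile`, `manhattan_distance`, and `closest_author` are hypothetical names of my own.

```python
import re
from collections import Counter

# A tiny, illustrative function-word list; real studies use far more words.
FUNCTION_WORDS = ["the", "and", "of", "to", "in", "that", "with", "but"]


def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[word] / total for word in FUNCTION_WORDS]


def manhattan_distance(a, b):
    """Sum of absolute differences between two frequency profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def closest_author(disputed_text, candidates):
    """Return the candidate whose profile is nearest to the disputed text.

    `candidates` maps an author label to a sample of that author's writing.
    """
    target = profile(disputed_text)
    return min(
        candidates,
        key=lambda name: manhattan_distance(target, profile(candidates[name])),
    )
```

You would call it as `closest_author(disputed_scene, {"author_a": sample_a, "author_b": sample_b})`, where the samples are long stretches of text securely attributed to each writer. With samples as short as a few sentences, the result is not meaningful.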

I'm fascinated to continue exploring this facet of literary studies, and I'm just at the beginning of my own journey.  I'm currently using R (which is also used in stylometry) to study data on the early Chinese migrants coming to Canada, looking for patterns of migration and kinship networks.   Certainly, dipping into both literary and historical analysis is very much in the spirit of DH. 


Thursday, June 20, 2019

Mining Register of Headtax Records using R and Palladio

In 2009, I began working with researchers and librarians at UBC Library and SFU Library on a project that sought to collect and digitize materials from Chinese Canadian organizations across Canada.   That project ended in 2012 when its federal government funding came to an end.   Recently, Sarah Zhang and I began examining the records of the 97,123 migrants who arrived in Canada between 1886 and 1949. These records were painstakingly transformed into a Microsoft Excel spreadsheet but, apart from a few research papers, have been largely untouched by researchers.

Between 1885 and 1923, the Canadian government imposed a head tax on Chinese immigrants entering Canada in order to restrict immigration. While a print register was created to keep track of the influx of migrants, these detailed recordings have actually provided researchers and historians with years of demographic information about the immigrants and have become a rich source of data for researchers. Thanks to two scholars, Peter Ward and Henry Yu, and their teams at the History Department of the University of British Columbia, the Register of Chinese Immigrants to Canada (1886-1949) has been transformed to a digital spreadsheet, openly accessible from UBC Open Collection, and a searchable database accessible from Library and Archives Canada.

The main challenge of this headtax project from its inception is that, for such an impressively large-scale dataset, the records are for the most part inconsistent: they reflect the idiosyncratic dialects of the immigrants, which result in variations of place names and titles. These inconsistencies in place names, unfortunately, create difficulties for anyone who wishes to analyze the immigrants’ origins. In other words, while there is a treasure trove of data, it may be unusable for most unless the data can be manipulated to fill in the gaps; not much sense could be made of the data even though it was readily available.

https://osf.io/9zr6f/


To address these inconsistencies, in 2008 Eleanor Yuen from the UBC Asian Library initiated a project to normalize the various transliterations of the immigrants’ origins, laying the groundwork for more in-depth research by future researchers. The immigrants’ origins are represented at two hierarchical levels: county and villages/towns; there are eight counties and numerous villages in the registry. Of the eight counties, the names of villages/towns in three counties have been mapped: Sun Woy (now known as Xinhui), Zhongshan, and Taishan. Although it covers just a portion of the records, this normalized data offers a true glimpse of what the full register makes available for research.
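
As a rough illustration of what this kind of normalization involves, here is a minimal sketch in Python that maps variant spellings onto one canonical county name. Only the Sun Woy to Xinhui pairing comes from the project description above; the other variant spellings in the table are illustrative placeholders, and `CANONICAL_COUNTY`, `normalize_key`, and `normalize_place` are hypothetical names rather than anything from the UBC project.

```python
# Lookup table from a cleaned-up variant spelling to a canonical name.
# Only "sun woy" -> "Xinhui" is taken from the post; the rest are
# illustrative placeholders for the kinds of variants a register might hold.
CANONICAL_COUNTY = {
    "sun woy": "Xinhui",
    "sunwoy": "Xinhui",
    "xinhui": "Xinhui",
    "hoy sun": "Taishan",
    "taishan": "Taishan",
    "chungshan": "Zhongshan",
    "zhongshan": "Zhongshan",
}


def normalize_key(raw):
    """Lowercase the value and collapse runs of whitespace."""
    return " ".join(raw.lower().split())


def normalize_place(raw):
    """Return the canonical county name, or None if the spelling is unmapped."""
    return CANONICAL_COUNTY.get(normalize_key(raw))


for spelling in ["Sun Woy", "  SUN   woy ", "Taishan", "Somewhere Else"]:
    print(repr(spelling), "->", normalize_place(spelling))
```

Returning `None` for unmapped spellings, instead of guessing, keeps the unresolved records visible so that someone with knowledge of the dialects can review them by hand.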

Since the completion of the digitization work, scholarship has drawn on the digital records from the project, manifesting differing methods and research findings. W. Peter Ward’s publication in 2013 focused on changes in the wellbeing of Chinese headtax immigrants, particularly analyzing the immigrants’ stature, a statistical indicator for wellbeing. He contrasted mean height by age of different age cohorts (one decade apart), and found a rising trend in stature over time: “a slow but significant increase in stature within the immigrant population from the middle of the 19th century to the early years of the Sino-Japanese War."  This increase in height, Ward speculated, can be attributed to the migration process itself.
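
The cohort comparison can be sketched in a few lines of Python. This is a simplified illustration, not Ward's actual procedure: it groups people by decade of birth only (ignoring age at measurement), and the rows are made-up numbers in centimetres, not values from the register. `mean_height_by_decade` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only: (birth_year, height_cm).
rows = [
    (1861, 160.0),
    (1868, 162.0),
    (1872, 163.0),
    (1879, 165.0),
]


def mean_height_by_decade(rows):
    """Group heights by decade of birth and average each group."""
    groups = defaultdict(list)
    for birth_year, height_cm in rows:
        decade = birth_year // 10 * 10  # e.g. 1868 -> 1860
        groups[decade].append(height_cm)
    return {decade: mean(heights) for decade, heights in sorted(groups.items())}


cohort_means = mean_height_by_decade(rows)
for decade, average in cohort_means.items():
    print(f"{decade}s: {average:.1f} cm")
```

A real analysis would also need to restrict the comparison to people of similar ages, since a cohort measured in adolescence will look shorter than one measured in adulthood.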

In terms of methodology, Sarah and I felt that the previous studies discussed above haven’t yet demonstrated the potential of the great variety of computational tools now available, such as R, a statistical computing language, and Palladio, a network analysis tool developed by the Humanities + Design Lab at Stanford University.   We decided to continue the research by building some datasets and sharing our discoveries on the Open Science Framework, with the intention that our study can demonstrate the untapped potential of the head tax data while also providing testimony to new ways in which librarians help shape digital scholarship and create promising new research questions for researchers.   Stay tuned for more!  In the meantime, please download the data and try it out!

Friday, May 31, 2019

Supporting Diversity, Equity, and Inclusion in Our Canadian Libraries - Reflections From The Last Decade


I recently presented at the Saskatchewan Libraries Association (SLA) 2019 with my colleagues Maha Kumaran and Jian Wang.   It was a self-reflective exercise, to distill a decade's worth of professional work as an academic librarian.   Perhaps Miu Chung Yan, a social work scholar, puts it best when he asserts that a profession such as social work has its roots deeply embedded in colonialist origins, with a history steeped in British methodologies. Librarianship invites a similar comparison, as it is deeply influenced by British and Anglo-American thinkers and practitioners.  As far back as 1946, Sidney Ditzion had already proposed that since America drew much of its cultural influence from the European continent, it is not surprising that librarianship should be one of those influences. To understand the bridge between librarianship and cultural diversity, one also needs to understand that the phenomenon is intrinsically tied to society as much as to the profession. ALA leaders constituted an elite corps of White Anglo-Saxon Protestants (WASPs) – mostly male, middle-class professionals who were immersed in the disciplinary and literary canons of the dominant culture and shared a common ideology. However, when a profession lacks diversity, it, unfortunately, loses relevance for many of its users.  Libraries are a microcosm of society, and if libraries are not a reflection of our society, then there is a real cause for concern. 
As a librarian earning my stripes in a profession steeped in tradition and unwritten rules, I find it overwhelming at times. But I have survived, and although still on my journey as a visible minority librarian, I have found some strategies that have worked for me in coping and performing at a high level as a professional librarian. Not only is being connected to fellow colleagues critical, but one must also stay committed to one's own professional and personal development at all times. Keeping abreast of technical knowledge and other developments in the field of librarianship is important, but equally vital are soft skills such as interpersonal relations, confidence, and a positive mindset. As the oft-quoted screenwriter Melissa Rosenberg puts it, “It doesn't matter if you're the smartest person in the room: If you're not someone who people want to be around, you won't get far.”
I've written and published some of these strategies in Aboriginal and Visible Minority Librarians: Oral Histories From Canada, and shared some of these thoughts and reflections at SLA 2019.   I've been a part of ViMLoC for a number of years now, and I'm encouraged and proud to see how far it's come, but also how much further it needs to go to truly make an impact in Canadian libraries (and beyond).  Is it enough?   What more do we need to help us do more?   I encourage us all as librarians to think more broadly about our place in not only the profession but also in society: how do we help shape a future that is so highly influenced by the past?   How do we instill change, even though we are powerless in our own ways?   I challenge each and every one of you to start making a positive contribution by changing our perceptions of the status quo.

Thursday, April 11, 2019

The Emotional Fatigue of Unseen Labour in Librarianship

Recently, there was an article that sent ripples across the academic community at UBC.  Although it's not a revelation that racialized faculty dedicate a huge amount of time, energy and passion to helping students of colour who struggle, often much more than their white colleagues do, the work is often invisible, or unseen, labour.   In my years as a professional librarian, I have consulted with a fair share of students of colour who either want to enter the profession or are already studying in a graduate program.   Although thankfully, there is no need for me to offer boxes of kleenexes in these meetings, my long conversations with students often do veer into serious confessions of identity, self-doubt, and then, experiences of discrimination.

As a Canadian-born Chinese (CBC), I have personally experienced and seen some of the barriers that visible minority librarians face entering the library profession. Like most ethnic minority librarians, I have faced challenges of misperceptions and biases that are attached to librarians of colour, and like most, I strive to be as professional as possible in dealing with and learning from cultural barriers in the workplace. Oftentimes, I have heard from mentors and colleagues that librarians such as myself need to be more outgoing and sociable or to “break out of the shell” and engage them more. Research studies have supported these conceptions of certain Asian groups as a “model minority” with labels such as “conservative” and “lacking in interpersonal skills."  

In fact, one case study found that supervisors can evaluate performance differently for different ethnic groups because of preconceived biases.  This is especially problematic as librarianship is a social profession.  Without opportunities for social expression, the career of an individual is at a severe disadvantage.   But that's just the way things are, and librarians such as myself do our best to listen and empathize with visible minorities breaking into the profession. It's emotional labour, and it takes a toll on one's psyche to hear stories of racial or gender discrimination.  It demands time and creates emotional fatigue. I often come out of it tearing up on the inside, but remaining calm on the outside.  It's important work, unpaid and unrecognized, but work I am proud to do on my own time if it helps another individual and advances my profession in the future.  

Thanks to some great mentors and relationships with colleagues, I have for the most part had positive and rewarding experiences as a librarian, but it has not been without its rocky moments. Perfecting the craft of reference work, collection development techniques, and best practices for information literacy instruction classes is challenging enough as it is, given the vast amounts of time and dedication required, but in addition to that, visible minority librarians must also learn the nuances of fitting firmly into a particular organizational culture while still feeling comfortable in one’s own skin. I've written about this in the past and will be sharing my thoughts and research at the Saskatchewan Libraries Association in May 2019. One of the initiatives I'll be talking about, and one I'm proudest of, is the Visible Minority Librarians Network of Canada (ViMLoC), which I've been a part of for many years and which offers advice and guidance to visible minority librarians in the areas of education, training, and mentorship.   The panel will also examine the Census of Canadian Academic Librarians of 2016 and 2018, share its view of the two censuses, and discuss how much we have progressed with diversity as a profession in light of the recent controversy at the ALA Midwinter in Seattle.  I look forward to reporting back.  Stay tuned.


Tuesday, April 09, 2019

The Big Academic Publishers Going Into Data Analytics Business

The latest SPARC Landscape Analysis is a fascinating read.  It's surprising to learn that not only are the so-called big-three academic publishers - Elsevier, Pearson and Cengage - doing extremely well financially, they are also keeping ahead of the curve by radically transforming themselves into data analytics companies built atop their content, continuously looking at approaches to monetize that content.   Students often ask me a related question about the citation manager Mendeley (owned by Elsevier): why is it free, and why does it offer 2 gigabytes of free storage?

None of these companies shows any inclination to abandon its traditional content business, and for sound reasons.   These publishers continue to offer data and data analytics services to their customers, not content with just growing their traditional core business.   Why should we as academics care?  Well, the move by publishers into the core research and teaching missions of colleges and universities, with tools aimed at evaluating productivity and performance, means that the academic community could lose control over vast areas of its core activities.   While Elsevier is the example, it could be followed by any of the other big publishers.  Here's the type of influence that publishers could have:

(1) Research Prediction - Publishers could identify, through the analysis of research and publication patterns and the quality and reach of their collaboration networks, which researchers are likely to grow into future leaders in their respective fields and offer them editorial board positions on their journals ahead of other publishers.

(2) Disciplines - They could also identify which segments of various disciplines are likely to evolve into the next growth area for research by looking (for example) at project participation patterns, size, and quality of teams, and funding bodies’ decisions, targeting these segments with new, dedicated journals ahead of other publishers.

(3) Funding - They could isolate in advance new trends in interdisciplinary studies, allowing them to establish publication forums where none exist today and even to drive funding decisions which lead to accelerated growth for those types of research.

As we can see, we are heading into uncharted territory, at least in the digital and data age.   While Elsevier and these other publishers have been duly noted for their questionable practices and growing influence in academic publishing (for better or worse, mostly for the worse), publishers need to face more scrutiny, as do the types of data products they offer disguised as better services.  The question is: will we listen?

Friday, December 07, 2018

Digital Library Perspectives: Volume 34 Issue 3 (Special Theme on Digital Humanities)

I'm excited to see our issue of Digital Library Perspectives published. The theme “digital humanities” (DH) – its history, major projects and practitioners, and, especially, its many definitions – has been the subject of frenzied scholarship and publications for more than 20 years.   This issue is unusual in that few LIS journals have devoted an entire issue to the theme of DH.

This issue is a collection of papers by librarians, academic researchers, and scholars working in areas of DH, including non-Western contexts whose voices are so often left out of mainstream discussions. The papers collected in this issue present a vision of the Library as a central partner in DH scholarship, thereby positioning the Library not just as a place to consume knowledge but as a place where new knowledge is actively co-created by researchers and librarians alike.   My colleague Megan Meredith-Lobay and I hope you enjoy these pieces!

From humanities computing to the digital humanities: a literature review by Allan Cho, Megan Meredith-Lobay (pp. 154 - 161)

Kindles, card catalogs, and the future of libraries: a collaborative digital humanities project by Anna L. Neatrour, Elizabeth Callaway, Rebekah Cummings (pp. 162 - 187)
Keywords: Future of libraries, Digital humanities, Topic modeling, Close reading, Distant reading, Interdisciplinary collaboration
Type: Research paper

Back to basics: Supporting digital humanities and community collaboration using the core strength of the academic library by Shannon Lucky, Craig Harkema (pp. 188 - 199)
Keywords: Collaboration, Community, Academic libraries, Cultural heritage, Digital humanities, Digitization
Type: Research paper

Respecting the language: digitizing Native American language materials by Mary Wise, Sarah R. Kostelecky (pp. 200 - 214)
Keywords: Digitization, Collaboration, Digital humanities, Digital collection, Native American language, Zuni Pueblo
Type: Case study

Finding a place for genealogy and family history in the digital humanities by Casey Daniel Hoeve (pp. 215 - 226)
Keywords: Libraries, Intersectionality, Cultural analysis humanities, Genetic ancestry, Historical societies, Humanities computing
Type: Conceptual Paper

Digital Korean studies: recent advances and new frontiers by Javier Cha (pp. 227 - 244)

Saturday, December 01, 2018

Excellent Opportunity -- JSTOR Digital Humanities Fellow

An excellent opportunity for those with interest in the digital humanities. ITHAKA is looking for a digital humanities practitioner and educator to drive adoption and use of JSTOR’s suite of tools, APIs and content aimed at digital scholars. The Digital Humanities Fellow will create teaching materials and teach workshops and webinars related to digitization and metadata production, text and data mining, and linked open data. Funded in part by a grant from The Andrew W. Mellon Foundation, this two-year-term position starting June 1, 2019 is ideal for the recent Digital Humanities Masters or PhD graduate seeking to apply their skills towards expanding the impact of digital scholarship. For the right candidate, this position can be held while still a student. The Digital Humanities Fellow will be a member of the innovative and collaborative JSTOR Labs team.

Responsibilities

The Digital Humanities Fellow will play a key role on the Plant Humanities Initiative, a partnership between JSTOR and Dumbarton Oaks, a research institute, museum and historic garden affiliated with Harvard University and located in Washington D.C. The Plant Humanities Initiative will pilot a new model for integrating digital humanities with scholarly programming to support the development of a new and emerging field. During the course of this project, the JSTOR Labs team will develop a new digital tool supporting plant humanities research and fellows at Dumbarton Oaks will employ this tool toward the creation of new scholarship. One of the aims of the digital tool will be connecting, contextualizing, and disseminating digitized primary sources. It will be the Digital Humanities Fellow’s responsibility to teach these fellows digital humanities skills and to support their use of the new tool. Upon completion of the tool, the Digital Humanities Fellow will assist in disseminating and gathering feedback on the digital tool through means such as presenting at appropriate conferences and contributing to a written report.

The Digital Humanities Fellow will also drive adoption and use of JSTOR’s other digital humanities tools, APIs, services and content. These services include Data For Research, a text- and data-mining service which JSTOR is currently exploring expanding in partnership with other non-profit collections-holders. JSTOR Labs has a suite of APIs to support digital humanists, including those related to Text Analyzer, its award-winning document analysis and search tool, and Understanding Great Works, a new tool for studying primary literary and historical texts. To encourage adoption and use of these tools and services, the Digital Humanities Fellow will speak at conferences, give webinars, and create instructional materials like assignments and sample datasets. He or she will inform the development of these services by being the voice of the user to the developers. Last, the Digital Humanities Fellow will co-author with other Labs members articles for scholarly and popular journals about their work.

Experience and Skills

  • A Masters or PhD in a scientific discipline (computer science / engineering / mathematics) with deep experience in digital humanities, or a Masters or PhD in humanities with a proven expertise in digital technologies.
  • Experience with and ability to teach digital humanities methods and technologies, including:
            - Natural language processing, including topic modeling (ideally using Mallet);
            - Text and Data Mining;
            - Linked Open Data, including Wikidata and knowledge graphs
            - Data visualization
  • Commonly used text-processing and analytics languages (for example, Python and R) 
  • Content markup including XML, ePub, PDF & TEI preferred 
  • Stellar communication, collaboration and organizational skills, and the ability to learn new techniques and technologies on the job. 
  • Experience working with archival/primary source materials for research and/or teaching preferred. 
  • Experience with web application development preferred. 
  • Committed to our organizational values of belonging, evidence, speed, teamwork, and trust.
For more details and to apply, follow this link: https://recruiting.ultipro.com/ITH1000ITHAK/JobBoard/5fe90ad4-9e26-490b-9c45-6c9669d4dcd0/OpportunityDetail?opportunityId=672ff2c1-90c7-4dd6-b4f6-ab317c31640d

Friday, October 19, 2018

Digital Humanities Librarian Search at the University of Illinois at Urbana-Champaign

Digital Humanities Librarian
Assistant or Associate Professor, University Library
University of Illinois at Urbana-Champaign


Position Available:
Position available immediately. This is a 100%, twelve-month, tenure-system appointment.

The University of Illinois is an Equal Opportunity, Affirmative Action employer. Minorities, women, veterans and individuals with disabilities are encouraged to apply. For more information, visit http://go.illinois.edu/EEO. To learn more about the University’s commitment to diversity, please visit http://www.inclusiveillinois.illinois.edu.

Position Summary:
The University Library at the University of Illinois at Urbana-Champaign is seeking a creative, innovative, collaborative, and intellectually curious individual to lead digital humanities services in the Library. The successful candidate will contribute to engaged, inclusive digital scholarship across campus. We encourage applicants who are committed to the principle that a diverse community enhances our institution and who will help the University of Illinois at Urbana-Champaign achieve new levels of excellence by fostering and sustaining our diverse and inclusive academic environment.

Reporting to the Head of Scholarly Communication and Publishing, the Digital Humanities Librarian leads outreach and services for research and instruction in the humanities and arts that employs digital technologies and data. Working with colleagues in the library’s Scholarly Commons, Research Data Service, and Scholarly Communication and Publishing units, Library Information Technology, Media Commons, special collections curators and archivists, and subject liaisons, the Digital Humanities Librarian is part of a team of functional and subject experts that works with researchers on digital scholarship broadly.

The Digital Humanities Librarian typically serves as the initial point of referral for humanities and affiliated researchers as they begin digital research and teaching projects, referring to and collaborating with these other experts as necessary. Building on a history of increasing library support for digital humanities on campus, the librarian plays a key role in furthering digital humanities library services, collaborating with others to fulfil needs identified in the library’s recent Digital Humanities Needs Assessment Report and providing ongoing assessment for new and existing activities. This role includes working to further the development of the library’s Scholarly Commons as a hub for digital humanities collaboration and discussion. The librarian also collaborates with other areas of campus that provide complementary services for digital research (including the NCSA Culture and Society Initiative, an emerging design center network, and others) to provide referrals and maximize impact.

Duties and Responsibilities:

  • Provide both reference services and in-depth research consultations for faculty and students on digital humanities-related research, in collaboration with colleagues;
  • Design and deliver instruction to classes, research groups, and other audiences to further information literacy and digital literacy outcomes;
  • Acquire and manage humanities research data (e.g., text and media corpora from commercial publishers) in consultation with others;
  • Work with librarians, library IT, and researchers to evaluate digital scholarship tools and select and implement the most appropriate tools to meet specific needs;
  • Serve as the primary point of consultation for researchers with questions about text and data mining tools and approaches;
  • Work with colleagues on digital humanities-related publishing activities and scholarly communications outreach related to digital humanities research;
  • Collaborate with colleagues in the Scholarly Commons and other units on outreach activities related to digital scholarship, and on fostering the Scholarly Commons as a collaborative space for digital humanities work;
  • Assess evolving campus needs and represent the Library in campus initiatives and activities involving digital humanities related research and data science, including serving as a liaison to the Illinois Program for Research in the Humanities and other relevant initiatives;
  • Facilitate professional development opportunities for library faculty and staff, including subject specialists, in digital humanities areas relevant to their interests and responsibilities;
  • Contribute to the national and international reputation of the University Library through professional research, service, and collaboration with national colleagues, organizations, and consortia.

Qualifications:

Required:
  • ALA-accredited MS-LIS or equivalent, or a PhD in the humanities or humanistic social sciences (completed by start date) with two years of relevant experience with digital projects;
  • Understanding of traditional and emerging digital approaches to research and publication in humanities work, and of how new digital approaches can reshape research and teaching in the humanities;
  • Experience with research software or programming languages used within some area of digital humanities, such as text and data mining, network analysis, or multimodal publishing;
  • Experience teaching in workshop or classroom settings;
  • Excellent oral and written communications skills;
  • Demonstrated ability to work both independently and collaboratively with a diverse community, and manage multiple tasks effectively in a team environment;
  • Evidence of the ability to do research, publication, and service consonant with University standards for tenure and promotion.

Preferred:
Experience teaching technology and digital literacy concepts;
Experience authoring or collaborating on digital humanities projects in research or instruction;
Coursework or experience using a programming language, especially one commonly used in digital humanities projects such as Python or R;
Demonstrated understanding of metadata standards and data curation practices relevant to digital humanities work;
Familiarity with copyright and licensing issues related to digital projects.

Monday, October 08, 2018

Digital Humanities: Implications for Librarians, Libraries, and Librarianship in Journal of College & Undergraduate Libraries #DH

Although it's been out for almost a year now, I'm excited to have read the DH-themed issue in College & Undergraduate Libraries.  Volume 24's Digital Humanities: Implications for Librarians, Libraries, and Librarianship is a special issue that reflects some of the current challenges that occupy librarians who are engaging the academic community in the digital humanities.  Some of the authors are familiar names and that's not surprising as many are at the top of their fields in DH.

In College & Undergraduate Libraries, this special issue has thirty articles on various topics, organized around six main themes: theoretical and critical issues, transforming traditional collections, models of collaboration, planning and project management, the ACRL Framework for Information Literacy for Higher Education, and embedded librarian instruction.   I enjoyed all of these pieces; together they reflect the work that is emerging at the intersection of academic libraries and digital scholarship in the humanities and social sciences.   It's good timing, as Megan Meredith-Lobay's and my DH-themed issue in Digital Library Perspectives is coming out soon, too.  Stay tuned for more on that.  In the meantime, take a look at the following articles!

Sunday, August 12, 2018

Open Access, APC's, and the Pop-Up Restaurant

Recently, a Canadian economics professor ventured into the dark world of predatory publishing and was punished by his university; he is now banned and suspended without pay. What's an academic to do when all he's doing is exposing the "deceptive scholarship" of fellow academics (who happen to be from his own university) seeking to advance their careers with articles published for a fee in suspicious and fraudulent journals?

Predatory journals claim to be refereed but, in reality, they publish articles in exchange for fees paid by authors.  Even Jeffrey Beall, the librarian who first coined the term and highlighted the negative aspects of predatory journals, had his now eponymous Beall's List taken down due to pressure from his own university.

To be clear, scholarly publishing is a high-stakes business, with tenure and promotion, skyrocketing journal subscriptions, and shrinking college budgets solidly interwoven. The solution (and problem) proposed by librarians and academics (and by publishing companies, albeit begrudgingly) is to "open" up research outputs by distributing them online free of cost or other barriers, now popularly known as open access, in order to counter the inequitable and unsustainable practice of charging institutions ever-increasing subscription prices for scholarship that should be for the public good.   Open access is important, but we must face the problems of its outgrowth.

The unintended consequence is that, apart from the route of self-archiving articles in institutional repositories, open access journals often require article processing charges paid by authors or research sponsors: the "Gold OA path".  This has a prevalent byproduct: the predatory publisher that takes advantage of the desperate researcher whose job prospects depend on how many articles get accepted and published in journals.  As more journals have jumped on board to go open access, even wealthy publishers have gotten into the game by offering open access.  With researchers pinning their hopes on this broken scholarly ecosystem, the situation reminds me of the "pop-up" dinner phenomenon, which emerged at the turn of the century when dining out became so exorbitantly expensive that consumers wanted to find an alternative, more sustainable ecology of gourmet.

Pop-up restaurants have been popular since the 2000s, and such diners typically make use of social media to communicate with their audiences.   No doubt the hospitality business is lucrative, as pop-up restaurants have become an effective way for young professionals to gain exposure, seek investors, and experiment with new culinary concepts.  The format has become so successful that Restaurant Day takes place worldwide, with even traditional restaurants participating.  The problem is that there is high turnover, with the most successful pop-up operations burning brightly, then quietly and quickly disappearing to make room for something new.   Does this sound familiar?

This is not to say that the pop-up is a failure, nor is it a perfect analogy for scholarly publishing, but there is a similarity in how much disruption is happening in a traditional business, with a mad scramble by entrepreneurs (and both traditional and predatory publishers are certainly that) to take advantage of those hungry to eat and to get published.   Though some academics ignore OA altogether, since tenure requires publication in the "global brand" of an exclusive journal or press, and go where the money is, those who choose to publish in OA tend to do so out of altruism or because they work in academic disciplines that aren't tied to a prestigious publication.    To be sure, the big luxury brands will not disappear anytime soon, and there will always be those who prefer the Michelin-star experience over the home-cooked potluck.

So let's return to the ethical dilemma that the economist Derek Pyne faced in his research: reveal the complicity of institutions and be punished, or stay silent and conform to the dubious cycle.  Of course, he chose the former, and though we don't know the full context of the case, what we do know now is that one researcher's findings have had enough impact to be designated as dangerous.


Sunday, May 13, 2018

The Learning Gardens at the Chinese University of Hong Kong

During my sabbatical, Patrick Lo, Dickson Chiu, and I were fortunate to interview Louise Jones, the University Librarian of the Chinese University of Hong Kong (CUHK), for an upcoming book project.  The Chinese University of Hong Kong Library is one of the leading research libraries in East Asia, and has significant Special Collections ranging from Shang dynasty oracle bones to modern Chinese literary archives.  With ongoing digitization initiatives, the Library makes available over 5 million digital images/objects, making content openly accessible to the local and global research community.

While CUHK Library comprises seven libraries, I had the most fun with the tour of the Learning Garden, which is an inspirational space that has won awards.  The Learning Garden is the combined effort of librarians, architects, and the university community.   As far back as 2014, when the Learning Garden first opened, it introduced 3D printing and 3D scanning services with the idea of bringing design concepts to life.  To inspire students to explore new interests in design and help them realize their creations, the library provides two desktop-level 3D printers, along with the Structure Sensor and Next Engine 3D scanners.

Clearly, the trend in academic libraries is a shifting relationship between space and collections: physical collections across academia are increasingly being moved to storage or lesser-used facilities, freeing up space for collaborative learning and study.   With specific spaces open 24 hours a day during term time, the Learning Garden provides flexible seating and facilities to support teaching and learning activities.  In addition to face-to-face teaching, collaborative learning through discussion among peers to generate ideas is increasingly important; moreover, given the demands of group projects for coursework, a 24-hour library space is necessary for students, particularly those who live on campus.   On average, the Learning Garden has up to 300 patrons just after the library closes at 10:00 pm.



How does the Learning Garden differ from the Research Commons in the University Library, though?  Both the Learning Garden and the Research Commons are open spaces for students, undergraduate and graduate alike.  While the same number of group study rooms are also available for booking at the Research Commons on the first floor of the University Library complex, the services arranged by the Research Commons librarian are specifically focused on graduate and postgraduate students as well as researchers; such services include research consultations, research café events, thesis writing skills, authoring workshops, and citation management.  However, the services for postgraduates are not restricted to the Research Commons area; for example, the Research Café (presentations by Ph.D. candidates) is held in a small open space on the ground floor of the library. Activities conducted in open spaces aim to cultivate a learning ambiance and scholarly exchange and dialogue between students and scholars.

What's interesting about the Learning Garden is its emphasis on short-term flexibility, which is usually achieved with such things as movable furniture and temporary wall partitions.  The Learning Garden is an open-plan design, uniquely with no temporary wall partitions at all; thus, its events are conducted in an entirely open setting. As the name "garden" conveys, patrons can find their own favourite spot for individual study, chatting, relaxing in the refreshment zone, having group discussions, or joining a talk in the Open Forum. Except for the bubble group study room, all of these activities co-exist in the Learning Garden in a harmonious, collaborative manner. The Learning Path is designed as two 50-metre-long desks whose S-shaped curves create natural bays where groups or individuals can study. For flexibility, the architect selected lightweight yet robust furniture that students can freely move, grouping tables together to fit their purpose.  Smart whiteboards are movable as well.

The library is an important selling point for the university, and the Learning Garden has fast become the major attraction for University guests, students, and visiting scholars.  As a building, it has won several awards.  The large whiteboards are full of students’ comments, drawings, traces of idea exchange, and even poems; the students have indeed made the place their home, and made it welcoming in their own way.  Here's the past University Librarian Colin Storey's recap of how it all started in 2012.


Friday, March 16, 2018

Data Analysis Using Gephi, a Digital Humanities Case Study

Chinese Canadian Stories – Uncommon Histories from a Common Past was a collaborative project that I was a part of earlier in my career as a librarian, and one I'm revisiting in the context of digital humanities.   Interestingly, when we began, we had no idea of the term DH.  I was more involved in the community engagement aspect of the project (which is also an important ingredient in DH projects).  Between 2006 and 2008, a team of student researchers at UBC working with Prof. Peter Ward and Prof. Henry Yu spent two years painstakingly recording the data for every one of the over 97,000 Chinese in the Chinese Head Tax Register.  For each of the Chinese who entered Canada, the data included name, age, height, and village and county of origin; through a digital database, the project gave us a powerful research tool for understanding who these migrants were, where they came from, and where they were going in Canada.  The irony is that the practice of restricting immigration actually left researchers a rich data collection on those early Canadian migrants.

While the project collaborated with the local Chinese Canadian community to preserve their culture and history through outreach and by actively collecting materials for its web portal, an unintended yet innovative result was that digital tools and techniques normally used in the sciences enabled us to examine the records of the migrants. Through the project, the researchers published a few peer-reviewed research papers documenting their use of Gephi, a network analysis and visualization tool, to plot locations in Saskatchewan based on longitude and latitude, and of a Gephi filter called Ego Network, which allows us to select any node in the network and filter the network to see only its connections.
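
For readers who want to see what an ego-network filter does without opening Gephi, here is a minimal Python sketch of the idea: keep the chosen node, everything within a given number of hops, and the links among them. The function names and the sample edges are my own invention for illustration; they are not taken from the Head Tax data or from Gephi's implementation.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (node_a, node_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def ego_network(edges, ego, depth=1):
    """Return the edges among the ego and every node within `depth` hops of it."""
    adjacency = build_adjacency(edges)
    kept = {ego}
    frontier = {ego}
    for _ in range(depth):
        # Step one hop outward from the current frontier.
        frontier = {n for node in frontier for n in adjacency[node]} - kept
        kept |= frontier
    return [(a, b) for a, b in edges if a in kept and b in kept]


# Invented sample: surnames linked to destinations (illustration only).
edges = [
    ("Ma", "Regina"),
    ("Ma", "Moose Jaw"),
    ("Liu", "Regina"),
    ("Wong", "Saskatoon"),
]
regina_ego = ego_network(edges, "Regina")
```

With the default depth of one hop, only the links that touch Regina or run between its direct neighbours survive; raising `depth` widens the view one step at a time.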



In Gephi, a node with a high betweenness centrality score is integral in connecting elements within the network; that is, it lies on many of the shortest paths between other nodes. In the Saskatchewan network produced by Gephi, three major destinations emerge: Saskatoon, Regina, and Moose Jaw. Swift Current, while a major destination, does not have as many links to highly-connected places as Saskatoon, Regina, and Moose Jaw (which predictably are very connected to each other). Some of the major families in this network also start to emerge: there was a strong Ma clan association that had, through chain migration, spread across Saskatchewan.  Combined with oral histories and analog research, this method of DH inquiry is a supremely powerful way to enhance discovery and to visually tell the story of our findings.
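
To make the betweenness idea concrete, here is a small Python sketch that counts, for every node, how many shortest paths between other pairs of nodes pass through it. It uses Brandes' algorithm on an unweighted, undirected graph and returns raw (unnormalized) scores, so the numbers may be scaled differently from what a given tool reports. The toy edges are invented for illustration and are not the project's data.

```python
from collections import defaultdict, deque


def betweenness_centrality(edges):
    """Raw betweenness scores for an unweighted, undirected graph (Brandes' algorithm)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    nodes = list(adjacency)
    score = dict.fromkeys(nodes, 0.0)

    for source in nodes:
        # Breadth-first search, counting shortest paths from the source.
        order = []
        predecessors = {n: [] for n in nodes}
        path_count = dict.fromkeys(nodes, 0)
        path_count[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    path_count[neighbour] += path_count[node]
                    predecessors[neighbour].append(node)

        # Walk back from the farthest nodes, accumulating each node's share.
        dependency = dict.fromkeys(nodes, 0.0)
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                score[node] += dependency[node]

    # Each unordered pair was counted from both ends, so halve the totals.
    return {node: total / 2 for node, total in score.items()}


# Invented toy network: a triangle of places plus one place hanging off Regina.
toy_edges = [
    ("Saskatoon", "Regina"),
    ("Regina", "Moose Jaw"),
    ("Saskatoon", "Moose Jaw"),
    ("Regina", "Swift Current"),
]
scores = betweenness_centrality(toy_edges)
```

In this toy network only one node sits on a shortest path between two others: every route to the place hanging off the triangle has to pass through its single neighbour, which is exactly the "integral in connecting elements" role described above.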


The Ma family appears to be much more important in the Regina network than in the Moose Jaw network. In order to produce these networks using Gephi, the researchers combed through all the immigrants who listed Saskatchewan as their destination, but had to deduce their surnames from the Romanized forms. Quite a few Romanizations, including those associated with Ma, correspond to multiple possible Chinese surnames.  For instance, in analyzing the Regina network, it becomes clear that the Luo and Liu who are from Yuemingcun in Sen Ning are probably the same family: either Luo or Liu, not both.   My colleagues at Asian Library, including the now retired Eleanor Yuen, have pioneered the way for future research by mapping the villages and towns recorded in the Register of Chinese Immigration to Canada from 1885 to 1949 in their original Chinese character names.
The project was furthered with the great help of the Spatial History Project at the Center for Spatial and Textual Analysis (CESTA) at Stanford University.

Using the variables of family name, village origin, and destination in Saskatchewan, Stanford researcher Stephanie Chan used Gephi to produce network patterns for four Chinese family lineages that visualize the weighted correspondence of family name and village origin in creating family chains and connections between destinations. Of course, this preliminary visualization is limited to describing the Ma family in Saskatchewan.   There's still much data to be analyzed.   The work has just begun.
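
As a rough sketch of how such a weighted correspondence could be prepared for Gephi, the Python below counts how often each pair of values (say, family name and village of origin) appears together and writes the result as an edge list with Source, Target, and Weight columns, which is the layout I understand Gephi's spreadsheet import to recognize for edge tables. The records, village labels, and function names are all invented for illustration; this is not the researchers' actual workflow.

```python
import csv
import io
from collections import Counter

# Invented records: (family name, village of origin, destination).
records = [
    ("Ma", "Village X", "Regina"),
    ("Ma", "Village X", "Moose Jaw"),
    ("Ma", "Village Y", "Regina"),
    ("Liu", "Village Z", "Regina"),
]


def weighted_edges(rows, source_index, target_index):
    """Count how often each (source, target) pair co-occurs across the rows."""
    return Counter((row[source_index], row[target_index]) for row in rows)


def to_edge_list_csv(edge_weights):
    """Serialize the weighted edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edge_weights.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Any two columns can serve as source and target; here, name and village.
name_to_village = weighted_edges(records, 0, 1)
edge_csv = to_edge_list_csv(name_to_village)
```

Swapping the column indexes (for example, village and destination) produces a different bivariate network from the same records, which is what makes this kind of register so flexible to explore.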
Historical Chinese Language Materials in British Columbia (HCLMBC) was a collaboration between UBC and SFU to digitize historical records and images related to Chinese settlement and life in British Columbia.

Sources for more reading:

Yu, Henry, and Stephanie Chan. "The Cantonese Pacific: Migration Networks and Mobility Across Space and Time." Trans-Pacific Mobilities: The Chinese and Canada (2017): 25. [Link]

Hermansen, S. and H. Yu. “The Irony of Discrimination: Mapping Historical Migration Using Chinese Head Tax Data.” In Historical GIS Research in Canada, J. Bonnell and M. Fortin (Eds.), University of Calgary Press, 2014. [Link]

Murphy, Nathan. "Review of a Digital History Tool: Gephi–Networking through History."

Calma, Angelito, and Martin Davies. 2017. Geographies of influence: A citation network analysis of higher education 1972–2014. Scientometrics 110 (3): 1579-99. [Link]

More Networks in the Humanities or Did books have DNA? [Link]
Stanford Gephi Workshop materials. [Link]

Monday, January 08, 2018

DH Projects in East Asian Studies


The ‘Digital Humanities’ is still a young and highly contested area.  Furthermore, as Tom Mullaney has argued, within Digital Humanities there is an “Asia deficit,” which is in no small part the outcome of more entrenched divides within the platforms and digital tools that form the foundation of DH itself.   This divide between East and West runs very deep, and is not primarily a question of scholarly interest or orientation.  Learning more about these projects, I was pleasantly impressed at the progress made in DH.

A couple of projects that I came across recently were featured in a presentation by Michael Hunter of Yale University.  He introduced The Life of the Buddha (LOTB) project, which addresses this challenge by presenting and analyzing for the first time monumental Tibetan murals depicting the Buddha’s life, their related literature, and their architectural and historical settings. LOTB also offers scholarly and learning communities the first tool to research and engage image, text, architecture, and history as an integrated and meaning-rich whole. The project’s impact on the humanities and the study of Buddhism is thus twofold: the largest study to date on visual and textual Buddha narratives in Tibet, and a new digital tool for synthetic teaching and research of Buddhist images and texts in context.  These murals date from the first decades of the 17th century and are among only a handful of fully preserved narrative paintings in Central Tibet. They are also among the few murals in Tibet explicitly linked to an extant collection of narrative, poetic, ritual, and technical painting literature about the Buddha. Practically nothing has been written about the Jonang murals, and no complete visual documentation has ever been attempted.

The Ten Thousand Rooms Project (廣廈千萬間項目), led by Michael Hunter, is a collaborative workspace (but not a database) for pre-modern textual studies.  Building on the Mirador Viewer developed by Stanford University, the platform allows users to upload images of manuscript, print, inscriptional, and other sources and then organize projects around their transcription, translation, and/or annotation. Both as a workspace for crowd-sourcing core textual research and as a publishing venue for scholarly contributions that are less well suited to conventional book formats, the Ten Thousand Rooms Project is one of the early DH projects at Yale that establishes an international online community committed to making the East Asian textual heritage more accessible to a wider audience. All users are free to view projects on the site, and registered users can create their own projects and also contribute to others' projects.

In all, DH in Asian Studies is coming along, certainly at a pace that suggests much is happening, whether at conferences, in digital podcasts, or in the network of scholars and practitioners coming together in a vibrant community of practice in an area of scholarship that has long been overlooked.