In 2009, I began working with researchers and librarians at UBC Library and SFU Library on a project that sought to collect and digitize materials from Chinese Canadian organizations across Canada. That project ended in 2012 when funding from the federal government ended. Recently, Sarah Zhang and I began examining the register of 97,123 migrants who arrived in Canada between 1886 and 1949. The register was painstakingly transcribed into a Microsoft Excel spreadsheet, but apart from a few research papers it has gone largely untouched by researchers.
Between 1885 and 1923, the Canadian government imposed a head tax on Chinese immigrants entering Canada in order to restrict immigration. A print register was created to keep track of the influx of migrants, and its detailed entries have since become a rich source of demographic data for researchers and historians. Thanks to two scholars, Peter Ward and Henry Yu, and their teams at the History Department of the University of British Columbia, the Register of Chinese Immigrants to Canada (1886-1949) has been transformed into a digital spreadsheet, openly accessible from UBC Open Collection, and a searchable database accessible from Library and Archives Canada.
The main challenge of this head tax project since its inception has been that, impressive as the scale of the dataset is, the records are for the most part inconsistent: they reflect the immigrants’ varied dialects, which result in variant spellings of place names and titles. These inconsistencies, unfortunately, make it difficult for anyone to analyze the immigrants’ origins. In other words, while there is a treasure trove of data readily available, little sense can be made of it until the data is cleaned up enough to fill in the gaps.
To address these inconsistencies, in 2008 Eleanor Yuen from the UBC Asian Library initiated a project to normalize the various transliterations of the immigrants’ origins, laying the groundwork for more in-depth work by future researchers. The immigrants’ origins are represented at two hierarchical levels: county and village/town; there are eight counties and numerous villages in the registry. Of the eight counties, the names of villages/towns in three have been mapped: Sun Woy (now known as Xinhui), Zhongshan, and Taishan. Although it covers only a portion of the records, this normalized data offers a glimpse of what the full register could support in research.
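To make the idea of normalization concrete, here is a minimal Python sketch of a lookup-table approach. It is not the method used in the UBC project, only an illustration. The pairing of "Sun Woy" with Xinhui comes from the paragraph above; every other variant spelling in the table is a hypothetical example rather than a value taken from the register, and the function names are my own.

```python
import re

# Illustrative lookup only: "Sun Woy" -> "Xinhui" comes from the text above;
# the other variant spellings are hypothetical examples, not register values.
COUNTY_VARIANTS = {
    "sun woy": "Xinhui",
    "sunwoy": "Xinhui",
    "sun wui": "Xinhui",
    "hoy sun": "Taishan",
    "toi shan": "Taishan",
    "heung shan": "Zhongshan",
}


def clean_key(raw):
    """Lowercase, replace non-letters with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z\s]", " ", raw.lower())
    return " ".join(text.split())


def normalize_county(raw):
    """Return the standard county name, or None when no mapping is known."""
    return COUNTY_VARIANTS.get(clean_key(raw))


rows = ["Sun Woy", " sun-woy ", "SUNWOY", "Hoy Sun", "Unknown Place"]
normalized = [normalize_county(r) for r in rows]
```

Returning None for unmapped spellings, instead of guessing, keeps unresolved entries visible so that someone who knows the region can review them by hand.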
Since the completion of the digitization work, scholars have drawn on the project’s digital records, using differing methods and reaching differing findings. W. Peter Ward’s 2013 publication focused on changes in the wellbeing of Chinese head tax immigrants, particularly analyzing the immigrants’ stature, a statistical indicator of wellbeing. He contrasted mean height by age across cohorts one decade apart, and found a rising trend in stature over time: “a slow but significant increase in stature within the immigrant population from the middle of the 19th century to the early years of the Sino-Japanese War.” This increase in height, Ward speculated, can be attributed to the migration process itself.
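For readers curious what a cohort comparison of this kind looks like in practice, here is a rough Python sketch that groups adults by birth decade and averages their heights. It is not Ward’s actual procedure, which I have only summarized above. The sample rows are made up for illustration, the field names (arrival_year, age, height_cm) are hypothetical, and the adult cutoff of 21 is my own assumption.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows for illustration; not values from the register.
# Field names (arrival_year, age, height_cm) are hypothetical.
records = [
    {"arrival_year": 1890, "age": 25, "height_cm": 160.0},
    {"arrival_year": 1892, "age": 24, "height_cm": 162.0},
    {"arrival_year": 1910, "age": 26, "height_cm": 163.0},
    {"arrival_year": 1912, "age": 23, "height_cm": 165.0},
    {"arrival_year": 1912, "age": 15, "height_cm": 150.0},  # excluded: not adult
]


def birth_decade(record):
    """Estimate birth year from arrival year and age, floored to the decade."""
    return (record["arrival_year"] - record["age"]) // 10 * 10


def mean_height_by_cohort(rows, min_age=21):
    """Mean height per birth-decade cohort, keeping only rows at or above min_age."""
    groups = defaultdict(list)
    for row in rows:
        if row["age"] >= min_age:
            groups[birth_decade(row)].append(row["height_cm"])
    return {decade: mean(heights) for decade, heights in sorted(groups.items())}


cohort_means = mean_height_by_cohort(records)
```

Restricting the comparison to adults matters because younger arrivals are still growing, which would otherwise pull a cohort’s average down for reasons unrelated to wellbeing.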
In terms of methodology, Sarah and I felt that the previous studies discussed above have not yet demonstrated the potential of the wide variety of computational tools available, such as R, a statistical computing language, and Palladio, a network analysis tool developed by the Humanities + Design Lab at Stanford University. We decided to continue the research by building datasets and sharing our discoveries on the Open Science Framework, in the hope that our study can demonstrate the untapped potential of the head tax data while also showing new ways in which librarians help shape digital scholarship and open up promising research questions for researchers. Stay tuned for more! In the meantime, please download the data and try it out!