May 6, 2014

The HathiTrust Research Center (HTRC) is pleased to announce the recipients of four Workset Creation for Scholarly Analysis (WCSA) prototyping project awards. These projects represent a range of approaches to developing new tools and techniques designed to assist researchers and scholars in 1) identifying and selecting resources from within the HathiTrust and 2) creating worksets of these resources for scholarly analysis.

Each project will receive $40,000 to develop a prototype over a nine-month period beginning in spring 2014, for a combined total of $160,000 in project funding. HTRC received 15 proposals in response to an RFP released in November, and eight finalists were invited to present projects at a shortlist meeting in February.

The following prototyping projects have been selected:

“Workset Creation through Image Analysis of Document Pages”, Texas A&M University (PI: Keith Biggers)

Biggers will work with Neal Audenaert and Natalie M. Houston to develop a software application that uses the visual characteristics of digitized printed pages to identify documents that contain three types of visually distinctive materials of interest to humanities researchers: poetry, music, and illustrations. This prototype will demonstrate the value of using visual analysis of document images in conjunction with more traditional textual analysis to enable scholars to ask more refined questions about texts and their physical manifestations.

“Semantic Analysis of Documents from the HathiTrust Corpus”, Waikato University (PI: Annika Hinze)

Hinze’s team will develop a suite of tools that analyze documents by the semantics of their content and metadata. Clustering documents by semantic similarity will open up a wealth of opportunities for scholarly research. The project is designed in close collaboration with two humanities scholars from the areas of Maori & Pacific Studies and Historical Anthropology, who not only drive this project with research questions based on their scholarly practice but also provide ongoing input and feedback during the development process.

“Distributed Metadata Correction and Annotation”, Maryland Institute for Technology in the Humanities, University of Maryland (PI: Trevor Muñoz)

Muñoz will collaborate with Peter Mallios and the Foreign Literatures in America (FLA) project team to develop a set of services and interfaces that will allow the FLA project (and other projects like it) to pull metadata records from the HathiTrust, correct and annotate these records using standardized vocabularies, gather corrections and annotations from other teams or scholars, and export enhanced metadata in formats suitable for publication as linked data.

“ElEPHãT: Early English Print in HathiTrust, a Linked Semantic Workset Prototype”, Oxford University (PI: Kevin Page)

Page will work with colleagues from the Bodleian Library to produce software that exposes the necessary metadata from individual collections for building aggregate worksets drawn from multiple sources. The prototype will build integrated worksets that combine resources from the HathiTrust and from the Early English Books Online Text Creation Partnership (EEBO-TCP) collection, which focuses on high-quality images and accurate transcriptions of items usually found in libraries’ special collections.

By awarding several prototyping projects from a variety of institutions through the WCSA project, HTRC intends to increase awareness of issues surrounding workset creation, uncover new techniques, and deliver prototypes that will enhance the value of the HathiTrust corpus. It will also foster interactions among the HTRC, developers, and researchers. “We’re excited to establish connections with new partners, and we hope the prototyping projects will lead to longer-term collaborations among participating institutions and the HTRC,” said J. Stephen Downie, HTRC Co-Director and WCSA PI.

WCSA is funded with a generous grant from the Andrew W. Mellon Foundation and directed by: WCSA PI and HTRC Co-Director J. Stephen Downie, Associate Dean for Research at the University of Illinois Graduate School of Library and Information Science; WCSA Co-PI and HTRC Co-Director Beth A. Plale, Professor, School of Informatics and Computing, Indiana University; and WCSA Co-PI Timothy W. Cole, Professor, University Library, University of Illinois at Urbana-Champaign. WCSA is administered in part by the Center for Informatics Research in Science and Scholarship at the University of Illinois. For more information please contact Megan Senseney.