
3 research outputs found

Proceedings of the 2011 Great Lakes Connections Conference : Discourse & Illumination, May 20-21, 2011, School of Information Studies, University of Wisconsin-Milwaukee

Author: Benoit III Edward A.
Publication venue: UWM Digital Commons
Publication date: 01/01/2011

The 2011 Great Lakes Connections Conference was a student-focused conference for Library and Information Science (LIS) doctoral students and candidates, intended to give them an opportunity to share and exchange ideas and research. The conference was open to all LIS doctoral students and included both works in progress and full papers, all selected through a double-blind review process.

University of Wisconsin-Milwaukee

Improving Collection Understanding for Web Archives with Storytelling: Shining Light Into Dark and Stormy Archives

Author: Jones Shawn M.
Publication venue: ODU Digital Commons
Publication date: 01/07/2021

Collections are the tools that people use to make sense of an ever-increasing number of archived web pages. As collections themselves grow, we need tools to make sense of them. Tools that work on the general web, like search engines, are not a good fit for these collections because search engines do not currently represent multiple document versions well. Web archive collections are vast, some containing hundreds of thousands of documents. Thousands of collections exist, many of which cover the same topic. Few collections include standardized metadata. Too many documents from too many collections with insufficient metadata makes collection understanding an expensive proposition. This dissertation establishes a five-process model to assist with web archive collection understanding. This model aims to produce a social media story – a visualization with which most web users are familiar. Each social media story contains surrogates which are summaries of individual documents. These surrogates, when presented together, summarize the topic of the story. After applying our storytelling model, they summarize the topic of a web archive collection. We develop and test a framework to select the best exemplars that represent a collection. We establish that algorithms produced from these primitives select exemplars that are otherwise undiscoverable using conventional search engine methods. We generate story metadata to improve the information scent of a story so users can understand it better. After an analysis showing that existing platforms perform poorly for web archives and a user study establishing the best surrogate type, we generate document metadata for the exemplars with machine learning. We then visualize the story and document metadata together and distribute it to satisfy the information needs of multiple personas who benefit from our model. Our tools serve as a reference implementation of our Dark and Stormy Archives storytelling model. 
Hypercane selects exemplars and generates story metadata. MementoEmbed generates document metadata. Raintale visualizes and distributes the story based on the story metadata and the document metadata of these exemplars. By providing understanding immediately, our stories save users the time and effort of reading thousands of documents and, most importantly, help them understand web archive collections.

Old Dominion University

Network journal submission and delivery

Publication venue: RFC Editor

Crossref