2 research outputs found

    Designing Public Visualizations of Library Data

    As in many other organizations and fields of inquiry, the data generated by libraries becomes ever more complex, and the need to communicate trends both internally and externally has also been increasing. As visualizations become increasingly embedded in library assessment and outreach, it is crucial to take into consideration the audience of the visualizations and to design visualizations that are easy to interpret. This chapter will walk readers through the process of selecting a visualization based on a particular data representation need, designing that visualization to be optimized to its specific purpose, and combining visualizations into larger narratives to engage a public audience.
    Pre-print of: Zoss, Angela M. “Designing Public Visualizations of Library Data.” In Data Visualization: A Guide to Visual Storytelling for Librarians, edited by Lauren Magnuson. Lanham, MD: Rowman & Littlefield Publishers, Inc., forthcoming.

    Data quality, transparency and reproducibility in large bibliographic datasets

    Increasingly, large bibliographic databases are hosted by dedicated teams that commit to database quality, curation, and sharing, thereby providing excellent sources of data. Some databases, such as PubMed or HathiTrust Digital Library, offer APIs and describe the steps to retrieve or process their data. Others of comparable size and importance to bibliographic scholarship, such as the ACM digital library, still forbid data mining. The additional cleaning and expansion steps required to overcome barriers to data acquisition must be reproducible and incorporated into the curation pipeline, or the use of large bibliographic databases for analysis will remain costly, time-consuming, and inconsistent. In this presentation, we will describe our efforts to create reproducible workflows to generate datasets from three large bibliographic databases: PubMed, DBLP (as a proxy for the ACM digital library), and HathiTrust. We will compare these sources of bibliographic data and address the following: initial download and setup, gap analysis, and supplemental sources for data retrieval and integration. By sharing our workflows and discussing both automated and manual steps of data enhancements, we hope to encourage researchers and data providers to think about sharing the responsibility of openness, transparency and reproducibility in re-using large bibliographic databases.