    Leveraging High Performance Computing for Managing Large and Evolving Data Collections

    The process of developing a digital collection in the context of a research project often involves a pipeline pattern during which data growth, data types, and data authenticity need to be assessed iteratively in relation to the different research steps and in the interest of archiving. Throughout a project’s lifecycle, curators organize newly generated data, clean and integrate legacy data when it exists, and decide what data will be preserved for the long term. Although these actions should be part of a well-oiled data management workflow, there are practical challenges in doing so if the collection is very large and heterogeneous, or is accessed by several researchers contemporaneously. There is a need for data management solutions that can help curators with efficient and on-demand analyses of their collection so that they remain well-informed about its evolving characteristics. In this paper, we describe our efforts towards developing a workflow to leverage open science High Performance Computing (HPC) resources for routinely and efficiently conducting data management tasks on large collections. We demonstrate that HPC resources and techniques can significantly reduce the time for accomplishing critical data management tasks, and enable dynamic archiving throughout the research process. We use a large archaeological data collection with a long and complex formation history as our test case. We share our experiences in adopting open science HPC resources for large-scale data management, which entails understanding usage of the open source HPC environment and training users. These experiences can be generalized to meet the needs of other data curators working with large collections.

    SIMDAT


    Mapping the Current Landscape of Research Library Engagement with Emerging Technologies in Research and Learning: Final Report

    The generation, dissemination, and analysis of digital information is a significant driver, and consequence, of technological change. As data and information stewards in physical and virtual space, research libraries are thoroughly entangled in the challenges presented by the Fourth Industrial Revolution: a societal shift powered not by steam or electricity, but by data, and characterized by a fusion of the physical and digital worlds. Organizing, structuring, preserving, and providing access to growing volumes of the digital data generated and required by research and industry will become a critically important function. As partners with the community of researchers and scholars, research libraries are also recognizing and adapting to the consequences of technological change in the practices of scholarship and scholarly communication. Technologies that have emerged or become ubiquitous within the last decade have accelerated information production and have catalyzed profound changes in the ways scholars, students, and the general public create and engage with information. The production of an unprecedented volume and diversity of digital artifacts, the proliferation of machine learning (ML) technologies, and the emergence of data as the “world’s most valuable resource,” among other trends, present compelling opportunities for research libraries to contribute in new and significant ways to the research and learning enterprise. Librarians are all too familiar with predictions of the research library’s demise in an era when researchers have so much information at their fingertips. A growing body of evidence provides a resounding counterpoint: that the skills, experience, and values of librarians, and the persistence of libraries as an institution, will become more important than ever as researchers contend with the data deluge and the ephemerality and fragility of much digital content.
This report identifies strategic opportunities for research libraries to adopt and engage with emerging technologies, with a roughly five-year time horizon. It considers the ways in which research library values and professional expertise inform and shape this engagement, the ways library and library worker roles will be reconceptualized, and the implication of a range of technologies on how the library fulfills its mission. The report builds on a literature review covering the last five years of published scholarship, primarily North American information science literature, and interviews with a dozen library field experts, completed in fall 2019. It begins with a discussion of four cross-cutting opportunities that permeate many or all aspects of research library services. Next, specific opportunities are identified in each of five core research library service areas: facilitating information discovery, stewarding the scholarly and cultural record, advancing digital scholarship, furthering student learning and success, and creating learning and collaboration spaces. Each section identifies key technologies shaping user behaviors and library services, and highlights exemplary initiatives. Underlying much of the discussion in this report is the idea that “digital transformation is increasingly about change management”: that adoption of or engagement with emerging technologies must be part of a broader strategy for organizational change, for “moving emerging work from the periphery to the core,” and a broader shift in conceptualizing the research library and its services. Above all, libraries are benefitting from the ways in which emerging technologies offer opportunities to center users and move from a centralized and often siloed service model to embedded, collaborative engagement with the research and learning enterprise.

    Developing a Coherent Cyberinfrastructure from Local Campus to National Facilities: Challenges and Strategies

    A fundamental goal of cyberinfrastructure (CI) is the integration of computing hardware, software, and network technology, along with data, information management, and human resources to advance scholarship and research. Such integration creates opportunities for researchers, educators, and learners to share ideas, expertise, tools, and facilities in new and powerful ways that cannot be realized if each of these components is applied independently. Bridging the gap between the reality of CI today and its potential in the immediate future is critical to building a balanced CI ecosystem that can support future scholarship and research. This report summarizes the observations and recommendations from a workshop in July 2008 sponsored by the EDUCAUSE Net@EDU Campus Cyberinfrastructure Working Group (CCI) and the Coalition for Academic Scientific Computation (CASC). The invitational workshop was hosted at the University Place Conference Center on the IUPUI campus in Indianapolis. Over 50 individuals representing a cross-section of faculty, senior campus information technology leaders, national lab directors, and other CI experts attended. The workshop focused on the challenges that must be addressed to build a coherent CI from the local to the national level, and the potential opportunities that would result. Both the organizing committee and the workshop participants hope that some of the ideas, suggestions, and recommendations in this report will take hold and be implemented in the community. The goal is to create a better, more supportive, more usable CI environment in the future to advance both scholarship and research.

    Big Data Management in Education Sector: an Overview

    Advances in technological innovation have given rise to a new trend known as Big Data. Given the soaring popularity of Big Data technology, organisations are strongly interested in using it to transform themselves and improve their businesses. Big Data is enabling organisations to outpace their competitors and save costs. Similarly, Big Data management is essential for universities that have Big Data to manage, as the use of Big Data in the higher education sector is increasing day by day. Many studies have been carried out on Big Data and analytics, but few have addressed its management. Big Data management presents a set of challenges involving Big Data modeling, storage and retrieval, analysis, and visualization for several areas in organizations. This paper introduces and contributes to the conceptual and theoretical understanding of Big Data management within higher education and outlines its relevance to higher education institutions. It describes the opportunities this growing research area brings to higher education as well as the major challenges associated with it.

    Status report on the NCRIS eResearch capability summary

    Preface: The period 2006 to 2014 has seen an approach to the national support of eResearch infrastructure by the Australian Government which is unprecedented. Not only has investment been at a significantly greater scale than previously, but the intent and approach has been highly innovative, shaped by a strategic approach to research support in which the critical element, the catchword, has been collaboration. The innovative directions shaped by this strategy, under the banner of the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS), have led to significant and creative initiatives and activity, seminal to new research and fields of discovery. Origin: This document is a Technical Report on the Status of the NCRIS eResearch Capability. It was commissioned by the Australian Government Department of Education and Training in the second half of 2014 to examine a range of questions and issues concerning the development of this infrastructure over the period 2006-2014. The infrastructure has been built and implemented over this period following investments made by the Australian Government amounting to over $430 million, under a number of funding initiatives.

    Human Computation and Convergence

    Humans are the most effective integrators and producers of information, directly and through the use of information-processing inventions. As these inventions become increasingly sophisticated, the substantive role of humans in processing information will tend toward capabilities that derive from our most complex cognitive processes, e.g., abstraction, creativity, and applied world knowledge. Through the advancement of human computation - methods that leverage the respective strengths of humans and machines in distributed information-processing systems - formerly discrete processes will combine synergistically into increasingly integrated and complex information processing systems. These new, collective systems will exhibit an unprecedented degree of predictive accuracy in modeling physical and techno-social processes, and may ultimately coalesce into a single unified predictive organism, with the capacity to address society's most wicked problems and achieve planetary homeostasis. Comment: Pre-publication draft of chapter. 24 pages, 3 figures; added references to page 1 and 3, and corrected typ

    Shifting to Data Savvy: The Future of Data Science In Libraries

    The Data Science in Libraries Project is funded by the Institute of Museum and Library Services (IMLS) and led by Matt Burton and Liz Lyon, School of Computing & Information, University of Pittsburgh; Chris Erdmann, North Carolina State University; and Bonnie Tijerina, Data & Society. The project explores the challenges associated with implementing data science within diverse library environments by examining two specific perspectives: ‘the skills gap,’ i.e. where librarians are perceived to lack the technical skills to be effective in a data-rich research environment; and ‘the management gap,’ i.e. the ability of library managers to understand and value the benefits of in-house data science skills and to provide organizational and managerial support. This report primarily presents a synthesis of the discussions, findings, and reflections from an international, two-day workshop held in May 2017 in Pittsburgh, where community members participated in a program with speakers, group discussions, and activities to drill down into the challenges of successfully implementing data science in libraries. Participants came from funding organizations, academic and public libraries, nonprofits, and commercial organizations, with most of the discussions focusing on academic libraries and library schools.