51,884 research outputs found

    All the World's a (Hyper)Graph: A Data Drama

    We introduce Hyperbard, a dataset of diverse relational data representations derived from Shakespeare's plays. Our representations range from simple graphs capturing character co-occurrence in single scenes to hypergraphs encoding complex communication settings and character contributions as hyperedges with edge-specific node weights. By making multiple intuitive representations readily available for experimentation, we facilitate rigorous representation robustness checks in graph learning, graph mining, and network analysis, highlighting the advantages and drawbacks of specific representations. Leveraging the data released in Hyperbard, we demonstrate that many solutions to popular graph mining problems are highly dependent on the representation choice, thus calling current graph curation practices into question. As an homage to our data source, and asserting that science can also be art, we present all our points in the form of a play.

    A Relevancy Algorithm for Curating Earth Science Data Around Phenomenon

    Earth science data are collected for various science needs and applications, processed using different algorithms at multiple resolutions and coverages, and then archived at different archiving centers for distribution and stewardship, which makes data discovery difficult. Curation, which typically occurs in museums, art galleries, and libraries, is traditionally defined as the process of collecting and organizing information around a common subject matter or a topic of interest. Curating data sets around topics or areas of interest addresses some of the data discovery needs in the field of Earth science, especially for unanticipated users of data. This paper describes a methodology to automate search and selection of data around specific phenomena. Different components of the methodology, including the assumptions, the process, and the relevancy ranking algorithm, are described. The paper makes two unique contributions to improving data search and discovery capabilities. First, the paper describes a novel methodology developed for automatically curating data around a topic using Earth science metadata records. Second, the methodology has been implemented as a standalone web service that is utilized to augment search and usability of data in a variety of tools.

    Emerging technologies revolutionise insect ecology and monitoring

    Insects are the most diverse group of animals on Earth, but their small size and high diversity have always made them challenging to study. Recent technological advances have the potential to revolutionise insect ecology and monitoring. We describe the state of the art of four technologies (computer vision, acoustic monitoring, radar, and molecular methods), and assess their advantages, current limitations, and future potential. We discuss how these technologies can adhere to modern standards of data curation and transparency, their implications for citizen science, and their potential for integration among different monitoring programmes and technologies. We argue that they provide unprecedented possibilities for insect ecology and monitoring, but it will be important to foster international standards via collaboration.

    A Framework for Interactive Geospatial Map Cleaning using GPS Trajectories

    A volunteered geographic information system, e.g., OpenStreetMap (OSM), collects data from volunteers to generate geospatial maps. To keep the map consistent, volunteers are expected to perform the tedious task of updating the underlying geospatial data at regular intervals. Such a map curation step takes time and considerable human effort. In this thesis, we propose a framework that improves the process of updating geospatial maps by automatically identifying road changes from user-generated GPS traces. Since GPS traces can be sparse and noisy, the proposed framework validates the map changes with the users before propagating them to a publishable version of the map. The proposed framework achieves up to four times faster map matching performance than the state-of-the-art algorithms with only 0.1-0.3% accuracy loss.

    A Conceptual Model for Scholarly Research Activity

    This paper presents a conceptual model for scholarly research activity, developed as part of the conceptual modelling work within the "Preparing DARIAH" European e-Infrastructures project. It is inspired by cultural-historical activity theory, and is expressed in terms of the CIDOC Conceptual Reference Model, extending its notion of activity so as to also account, apart from historical practice, for scholarly research planning. It is intended as a framework for structuring and analyzing the results of empirical research on scholarly practice and information requirements, encompassing the full research lifecycle of information work and involving both primary evidence and scholarly objects; also, as a framework for producing clear and pertinent information requirements, and specifications of digital infrastructures, tools and services for scholarly research. We plan to use the model to tag interview transcripts from an empirical study on scholarly information work, and thus validate its soundness and fitness for purpose.

    In Homage of Change


    KAPTUR: exploring the nature of visual arts research data and its effective management.

    KAPTUR (2011-2013), funded by JISC and led by the Visual Arts Data Service (VADS), is a highly collaborative project involving four institutional partners: the Glasgow School of Arts; Goldsmiths, University of London; University for the Creative Arts; and the University of the Arts London. The preservation and publication of research data is seen as positive and all UK Research Councils now require it as a condition of funding (RCUK 2012). As a result, a network of data repositories is emerging (DataCite 2012a), some funded by Research Councils, others by institutions themselves. However, research data management practice within the visual arts appears ad hoc. None of the specialist arts institutions within the UK has implemented research data management policies (DCC 2011a), nor established research data management systems. KAPTUR seeks to investigate the nature of visual arts research data, making recommendations for its effective management; develop a model of best practice applicable to both specialist arts institutions and arts departments in multidisciplinary institutions; and apply, test and refine the model with the four institutional partners. This paper will explore the nature of visual arts research data and how effective data management can ensure its long-term usage, curation and preservation.

    Redefining the performing arts archive

    This paper investigates representations of performance and the role of the archive. Notions of record and archive are critically investigated, raising questions about applying traditional archival definitions to the performing arts. Defining the nature of performances is at the root of all difficulties regarding their representation. Performances are live events, so for many people the idea of recording them for posterity is inappropriate. The challenges of creating and curating representations of an ephemeral art form are explored and performance-specific concepts of record and archive are posited. An open model of archives, encouraging multiple representations and allowing for creative reuse and reinterpretation to keep the spirit of the performance alive, is envisaged as the future of the performing arts archive.

    Southern African Treatment Resistance Network (SATuRN) RegaDB HIV drug resistance and clinical management database: supporting patient management, surveillance and research in southern Africa

    Substantial amounts of data have been generated from patient management and academic exercises designed to better understand the human immunodeficiency virus (HIV) epidemic and design interventions to control it. A number of specialized databases have been designed to manage huge data sets from HIV cohort, vaccine, host genomic and drug resistance studies. Besides databases from cohort studies, most of the online databases contain limited curated data and are thus sequence repositories. HIV drug resistance has been shown to have a great potential to derail the progress made thus far through antiretroviral therapy. Thus, a lot of resources have been invested in generating drug resistance data for patient management and surveillance purposes. Unfortunately, most of the data currently available relate to subtype B even though >60% of the epidemic is caused by HIV-1 subtype C. A consortium of clinicians, scientists, public health experts and policy makers working in southern Africa came together and formed a network, the Southern African Treatment and Resistance Network (SATuRN), with the aim of increasing curated HIV-1 subtype C and tuberculosis drug resistance data. This article describes the HIV-1 data curation process using the SATuRN Rega database. The data curation is a manual and time-consuming process done by clinical, laboratory and data curation specialists. Access to the highly curated data sets is through applications that are reviewed by the SATuRN executive committee. Examples of research outputs from the analysis of the curated data include trends in the level of transmitted drug resistance in South Africa, analysis of the levels of acquired resistance among patients failing therapy and factors associated with the absence of genotypic evidence of drug resistance among patients failing therapy. All these studies have been important for informing first- and second-line therapy. This database is a free, password-protected, open-source database available at www.bioafrica.net.