How to Get the Most out of Your Curation Effort
Large-scale annotation efforts typically involve several experts who may disagree with each other. We propose an approach for modeling disagreements among experts that assigns each annotation a confidence value (i.e., the posterior probability that it is correct). Our approach allows computing a certainty level for individual annotations, given annotator-specific parameters estimated from data. We developed two probabilistic models for performing this analysis, compared these models using computer simulation, and tested each model's actual performance on a large data set generated by human annotators specifically for this study. We show that even in the worst-case scenario, when all annotators disagree, our approach allows us to significantly increase the probability of choosing the correct annotation. Along with this publication we make publicly available a corpus of 10,000 sentences annotated according to several cardinal dimensions that we have introduced in earlier work. The 10,000 sentences were all 3-fold annotated by a group of eight experts, while a 1,000-sentence subset was further 5-fold annotated by five new experts. While the presented data represent a specialized curation task, our modeling approach is general; most data annotation studies could benefit from our methodology.
Institutional Challenges in the Data Decade
Throughout the year, the DCC stages regional data management roadshows to present best practice and showcase new tools and resources. This article reports on the second roadshow, organised in conjunction with the White Rose University Consortium and held on 1-3 March 2011 at the University of Sheffield.
The goal for Day 1 was to describe the emerging trends and challenges associated with research data management and their potential impact on higher education institutions, and to introduce the Digital Curation Centre (DCC) and its role in supporting research data management. This was achieved through a substantial morning presentation followed by an afternoon of illustrative case studies at both disciplinary and institutional levels, highlighting different models, approaches and working practice. Day 2 was aimed at those in senior management roles and looked at strategic and policy implementation objectives. The Day 3 workshop explored data management requirements from the perspective of the institution and the main UK funding bodies, the different roles and responsibilities involved in effective data management and provided an introduction to data management planning. The portfolio of DCC resources, tools and services was explored in greater detail.
The roadshow provided delegates with advice and guidance to support institutional Research Data Management and has helped to facilitate regional networking and the exchange of skills and experience.
Telescope Bibliographies: an Essential Component of Archival Data Management and Operations
Assessing the impact of astronomical facilities rests upon an evaluation of
the scientific discoveries which their data have enabled. Telescope
bibliographies, which link data products with the literature, provide a way to
use bibliometrics as an impact measure for the underlying data. In this paper
we argue that the creation and maintenance of telescope bibliographies should
be considered an integral part of an observatory's operations. We review the
existing tools, services, and workflows which support these curation
activities, giving an estimate of the effort and expertise required to maintain
an archive-based telescope bibliography.
Comment: 10 pages, 3 figures, to appear in SPIE Astronomical Telescopes and
Instrumentation, SPIE Conference Series 844
Supporting emerging researchers in data management and curation
While scholarly publishing remains the key means for determining researchers’ impact, international funding body requirements and government recommendations relating to research data management (RDM), sharing and preservation mean that the underlying research data are becoming increasingly valuable in their own right. This is true not only for researchers in the sciences but also for those in the humanities and creative arts. The ability to exploit their own - and others’ - data is emerging as a crucial skill for researchers across all disciplines. However, despite Generation Y researchers being ‘highly competent and ubiquitous users of information technologies generally’, there appears to be a widespread lack of understanding of, and uncertainty about, open access and self-archived resources (Jisc study, 2012). This chapter will consider the support that academic librarians might provide to Generation Y researchers in this shifting research data landscape and examine the role of the library as part of institutional infrastructure.
The changing landscape will impact research libraries most keenly over the next few years as they work to develop infrastructure and support systems to identify and maintain access to a diverse array of research data outputs. However, the data that are being produced through research are no different to those being produced by artists, politicians and the general public. In this respect, all libraries - whether they be academic, national, or local - will need to be gearing up to ensure they are able to accept and provide access to an ever increasing range of complex digital objects.
New ADS Functionality for the Curator
In this paper we provide an update concerning the operations of the NASA
Astrophysics Data System (ADS), its services and user interface, and the
content currently indexed in its database. As the primary information system
used by researchers in Astronomy, the ADS aims to provide a comprehensive index
of all scholarly resources appearing in the literature. With the current effort
in our community to support data and software citations, we discuss what steps
the ADS is taking to provide the needed infrastructure in collaboration with
publishers and data providers. A new API provides access to the ADS search
interface, metrics, and libraries allowing users to programmatically automate
discovery and curation tasks. The new ADS interface supports a greater
integration of content and services with a variety of partners, including ORCID
claiming, indexing of SIMBAD objects, and article graphics from a variety of
publishers. Finally, we highlight how librarians can facilitate the ingest of
gray literature that they curate into our system.
Comment: Submitted to the Proceedings of Library and Information Services in
Astronomy VIII, Strasbourg, France
Video game preservation in the UK: a survey of records management practices
Video games are a cultural phenomenon; a medium like no other that has become one of the largest entertainment sectors in the world. While the UK boasts an enviable games development heritage, it risks losing a major part of its cultural output through an inability to preserve the games that are created by the country’s independent games developers. The issues go deeper than bit rot and other problems that affect all digital media; loss of context, copyright and legal issues, and the throwaway culture of the ‘next’ game all hinder the ability of fans and academics to preserve video games and make them accessible in the future.
This study looked at the current attitudes towards preservation in the UK’s independent (‘indie’) video games industry by examining current record-keeping practices and analysing the views of games developers. The results show that there is an interest in preserving games, and possibly a desire to do so, but issues of piracy and cost prevent the industry from undertaking preservation work internally, and from allowing others to assume such responsibility. The recommendation made by this paper is not simply for preservation professionals and enthusiasts to collaborate with the industry, but to do so by advocating the commercial benefits that preservation may offer to the industry.