Analysis and Synthesis of Metadata Goals for Scientific Data
The proliferation of discipline-specific metadata schemes contributes to artificial barriers that can impede interdisciplinary and transdisciplinary research. The authors considered this problem by examining the domains, objectives, and architectures of nine metadata schemes used to document scientific data in the physical, life, and social sciences. They used a mixed-methods content analysis and Greenberg's (2005) metadata objectives, principles, domains, and architectural layout (MODAL) framework, and derived 22 metadata-related goals from textual content describing each metadata scheme. Relationships are identified between the domains (e.g., scientific discipline and type of data) and the categories of scheme objectives. For each strong correlation (>0.6), a Fisher's exact test for nonparametric data was used to determine significance (p < .05).
Significant relationships were found between the domains and objectives of the schemes. Schemes describing observational data are more likely to have "scheme harmonization" (compatibility and interoperability with related schemes) as an objective; schemes with the objective "abstraction" (a conceptual model exists separate from the technical implementation) also have the objective "sufficiency" (the scheme defines a minimal amount of information to meet the needs of the community); and schemes with the objective "data publication" do not have the objective "element refinement." The analysis indicates that many metadata-driven goals expressed by communities are independent of scientific discipline or the type of data, although they are constrained by historical community practices and workflows as well as the technological environment at the time of scheme creation. The analysis reveals 11 fundamental metadata goals for metadata documenting scientific data in support of sharing research data across disciplines and domains. The authors report these results and highlight the need for more metadata-related research, particularly in the context of recent funding agency policy changes.
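The Fisher's exact test used in this study can be illustrated with a minimal, stdlib-only Python sketch. The function name and the example 2x2 table below are hypothetical illustrations, not data from the study:

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 contingency table
    [[a, b], [c, d]], e.g. counting schemes that do / do not share a
    pair of objectives (hypothetical example, not the study's data)."""
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def hyper_p(x):
        # Hypergeometric probability P(X = x) with the table's margins fixed
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = hyper_p(a)
    lo = max(0, row1 - (n - col1))  # smallest feasible top-left cell
    hi = min(row1, col1)            # largest feasible top-left cell
    # Two-sided p-value: sum probabilities of every table whose
    # probability is as small as or smaller than the observed table's
    return sum(hyper_p(x) for x in range(lo, hi + 1)
               if hyper_p(x) <= p_obs + 1e-12)
```

For instance, `fisher_exact_2x2(3, 1, 1, 3)` returns roughly 0.486, far above the study's .05 threshold, while a more lopsided table yields a smaller p-value. In practice `scipy.stats.fisher_exact` offers the same test.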
Open Science principles for accelerating trait-based science across the Tree of Life.
Synthesizing trait observations and knowledge across the Tree of Life remains a grand challenge for biodiversity science. Species traits are widely used in ecological and evolutionary science, and new data and methods have proliferated rapidly. Yet accessing and integrating disparate data sources remains a considerable challenge, slowing progress toward a global synthesis to integrate trait data across organisms. Trait science needs a vision for achieving global integration across all organisms. Here, we outline how the adoption of key Open Science principles-open data, open source and open methods-is transforming trait science, increasing transparency, democratizing access and accelerating global synthesis. To enhance widespread adoption of these principles, we introduce the Open Traits Network (OTN), a global, decentralized community welcoming all researchers and institutions pursuing the collaborative goal of standardizing and integrating trait data across organisms. We demonstrate how adherence to Open Science principles is key to the OTN community and outline five activities that can accelerate the synthesis of trait data across the Tree of Life, thereby facilitating rapid advances to address scientific inquiries and environmental issues. Lessons learned along the path to a global synthesis of trait data will provide a framework for addressing similarly complex data science and informatics challenges.
Skills and Knowledge for Data-Intensive Environmental Research.
The scale and magnitude of complex and pressing environmental issues lend urgency to the need for integrative and reproducible analysis and synthesis, facilitated by data-intensive research approaches. However, the recent pace of technological change has been such that appropriate skills to accomplish data-intensive research are lacking among environmental scientists, who more than ever need greater access to training and mentorship in computational skills. Here, we provide a roadmap for raising data competencies of current and next-generation environmental researchers by describing the concepts and skills needed for effectively engaging with the heterogeneous, distributed, and rapidly growing volumes of available data. We articulate five key skills: (1) data management and processing, (2) analysis, (3) software skills for science, (4) visualization, and (5) communication methods for collaboration and dissemination. We provide an overview of the current suite of training initiatives available to environmental scientists and models for closing the skill-transfer gap.
Preserving today for tomorrow: A case study of an archive of Interactive Music Installations
This work presents the problems addressed and the first results obtained by a project aimed at the preservation of Interactive Music Installations (IMI). Preservation requires keeping not only all the components necessary to (re)produce a performance, but also the knowledge about those components, so that the original process can be repeated at any time. This work proposes a multilevel approach to the preservation of IMI. As case studies, Pinocchio Square (installed at EXPO 2002) and Il Caos delle Sfere are considered.
On-the-fly Data Assessment for High Throughput X-ray Diffraction Measurement
Investment in brighter sources and larger, faster detectors has accelerated the speed of data acquisition at national user facilities. The accelerated data acquisition offers many opportunities for the discovery of new materials, but it also presents a daunting challenge. The rate of data acquisition far exceeds the current speed of data quality assessment, resulting in less than optimal data and data coverage, which in extreme cases forces recollection of data. Herein, we show how this challenge can be addressed through the development of an approach that makes routine data assessment automatic and instantaneous. By extracting and visualizing customized attributes in real time, data quality and coverage, as well as other scientifically relevant information contained in large datasets, are highlighted. Deployment of such an approach not only improves the quality of data but also helps optimize the usage of expensive characterization resources by prioritizing the measurements of highest scientific impact. We anticipate that our approach will become a starting point for a sophisticated decision tree that optimizes data quality and maximizes scientific content in real time through automation. With these efforts to integrate more automation into data collection and analysis, we can truly take advantage of the accelerating speed of data acquisition.