Towards a metadata standard for field spectroscopy
This thesis identifies the core components for a field spectroscopy metadata standard to facilitate discoverability, interoperability, reliability, quality assurance and extended life cycles for datasets being exchanged in a variety of data sharing platforms. The research is divided into five parts: 1) an overview of the importance of field spectroscopy, metadata paradigms and standards, metadata quality and geospatial data archiving systems; 2) definition of a core metadataset critical for all field spectroscopy applications; 3) definition of an extended metadataset for specific applications; 4) methods and metrics for assessing metadata quality and completeness in spectral data archives; 5) recommendations for implementing a field spectroscopy metadata standard in data warehouses and "big data" environments. Part 1 of the thesis is a review of the importance of field spectroscopy in remote sensing; metadata paradigms and standards; field spectroscopy metadata practices; metadata quality; and geospatial data archiving systems. The unique metadata requirements for field spectroscopy are discussed. Conventional definitions and metrics for measuring metadata quality are presented. Geospatial data archiving systems for data warehousing and intelligent information exchange are explained. Part 2 of the thesis presents a core metadataset for all field spectroscopy applications, derived from the results of an international expert panel survey. The survey respondents helped to identify a metadataset critical to all field spectroscopy campaigns and for specific applications. These results form the foundation of a field spectroscopy metadata standard that is practical, flexible enough to suit the purpose for which the data is being collected, and has sufficient legacy potential for long-term sharing and interoperability with other datasets. Part 3 presents an extended metadataset for specific application areas within field spectroscopy.
The key metadata is presented for three applications: tree crown, soil, and underwater coral reflectance measurements. The performance of existing metadata standards in complying with the field spectroscopy metadataset was measured. Results show they consistently fail to accommodate the needs of field spectroscopy scientists in general and of the three application areas in particular. Part 4 presents criteria for measuring the quality and completeness of field spectroscopy metadata in a spectral archive. Existing methods for measuring quality and completeness of metadata were scrutinized against the special requirements of field spectroscopy datasets. Novel field spectroscopy metadata quality parameters were defined. Two spectral libraries were examined as case studies of operationalized metadata. The case studies revealed that publicly available datasets are underperforming on the quality and completeness measures. Part 5 presents recommendations for adoption and implementation of a field spectroscopy standard, both within the field spectroscopy community and within the wider scope of IT infrastructure for storing and sharing field spectroscopy metadata within data warehouses and big data environments. The recommendations are divided into two main sections: community adoption of the standard, and integration of standardized metadatasets into data warehouses and big data platforms. This thesis has identified the core components of a metadata standard for field spectroscopy. The metadata standard serves overall to increase the discoverability, reliability, quality, and life cycle of field spectroscopy metadatasets for wide-scale data exchange
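The metadata-completeness idea described in this abstract can be illustrated with a minimal sketch. The field names and the scoring rule below are hypothetical (the thesis defines its own core metadataset and quality parameters); this simply shows how a record could be scored against a required core set.

```python
# Illustrative sketch (not the thesis's actual metric): score a spectral
# record's metadata completeness against a hypothetical core metadataset.

# Hypothetical core fields a field-spectroscopy record might require.
CORE_FIELDS = {
    "instrument_model", "acquisition_datetime", "latitude", "longitude",
    "illumination_source", "reference_panel", "foreoptic_fov_degrees",
}

def completeness(record: dict) -> float:
    """Fraction of core fields that are present and non-empty."""
    filled = sum(
        1 for f in CORE_FIELDS
        if record.get(f) not in (None, "", [])
    )
    return filled / len(CORE_FIELDS)

record = {
    "instrument_model": "ASD FieldSpec 4",
    "acquisition_datetime": "2021-06-14T10:32:00Z",
    "latitude": -33.87,
    "longitude": 151.21,
    "illumination_source": "solar",
    "reference_panel": "",          # present but empty -> not counted
}
print(round(completeness(record), 2))  # 5 of 7 core fields are filled
```

A real archive audit would distinguish mandatory from application-specific fields and validate values, not just presence, but the same present-and-valid counting underlies most completeness measures.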
Theory and Practice of Data Citation
Citations are the cornerstone of knowledge propagation and the primary means
of assessing the quality of research, as well as directing investments in
science. Science is increasingly becoming "data-intensive", where large volumes
of data are collected and analyzed to discover complex patterns through
simulations and experiments, and most scientific reference works have been
replaced by online curated datasets. Yet, given a dataset, there is no
quantitative, consistent and established way of knowing how it has been used
over time, who contributed to its curation, what results have been yielded or
what value it has.
The development of a theory and practice of data citation is fundamental for
considering data as first-class research objects with the same relevance and
centrality of traditional scientific products. Many works in recent years have
discussed data citation from different viewpoints: illustrating why data
citation is needed, defining the principles and outlining recommendations for
data citation systems, and providing computational methods for addressing
specific issues of data citation.
The current panorama is many-faceted and an overall view that brings together
diverse aspects of this topic is still missing. Therefore, this paper aims to
describe the lay of the land for data citation, both from the theoretical (the
why and what) and the practical (the how) angle.Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association
for Information Science and Technology (JASIST), 201
A posteriori metadata from automated provenance tracking: Integration of AiiDA and TCOD
In order to make results of computational scientific research findable,
accessible, interoperable and re-usable, it is necessary to decorate them with
standardised metadata. However, there are a number of technical and practical
challenges that make this process difficult to achieve in practice. Here the
implementation of a protocol is presented to tag crystal structures with their
computed properties, without the need of human intervention to curate the data.
This protocol leverages the capabilities of AiiDA, an open-source platform to
manage and automate scientific computational workflows, and TCOD, an
open-access database storing computed materials properties using a well-defined
and exhaustive ontology. Based on these, the complete procedure to deposit
computed data in the TCOD database is automated. All relevant metadata are
extracted from the full provenance information that AiiDA tracks and stores
automatically while managing the calculations. Such a protocol also enables
reproducibility of scientific data in the field of computational materials
science. As a proof of concept, the AiiDA-TCOD interface is used to deposit 170
theoretical structures together with their computed properties and their full
provenance graphs, consisting of over 4600 AiiDA nodes
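The a-posteriori extraction described here can be sketched generically: walk a provenance graph from a calculation node to its inputs and outputs, and collect the fields a deposition record needs. The toy graph, node types, and field names below are illustrative only; they are not the actual AiiDA or TCOD data models.

```python
# Hypothetical sketch of a-posteriori metadata extraction: walk a toy
# provenance graph (calculation -> inputs/outputs) and assemble the tags a
# deposition record would need, with no manual curation step.

provenance = {
    "calc_42": {
        "type": "calculation",
        "code": "quantum-espresso-6.5",
        "inputs": ["structure_7", "params_3"],
        "outputs": ["structure_8", "energy_1"],
    },
    "structure_7": {"type": "structure", "formula": "Si2"},
    "params_3": {"type": "parameters", "cutoff_ry": 60},
    "structure_8": {"type": "structure", "formula": "Si2"},
    "energy_1": {"type": "property", "total_energy_ev": -310.2},
}

def deposition_metadata(graph: dict, calc_id: str) -> dict:
    """Assemble a flat metadata record for one calculation node."""
    calc = graph[calc_id]
    meta = {"software": calc["code"]}
    for node_id in calc["inputs"] + calc["outputs"]:
        node = graph[node_id]
        if node["type"] == "structure":
            meta.setdefault("formula", node["formula"])
        elif node["type"] == "parameters":
            meta["cutoff_ry"] = node["cutoff_ry"]
        elif node["type"] == "property":
            meta["total_energy_ev"] = node["total_energy_ev"]
    return meta

print(deposition_metadata(provenance, "calc_42"))
```

The point of the protocol is that every field in the record is recoverable from the automatically tracked graph, so deposition can run unattended over thousands of calculations.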
IVOA Recommendation: Resource Metadata for the Virtual Observatory Version 1.12
An essential capability of the Virtual Observatory is a means for describing
what data and computational facilities are available where, and once
identified, how to use them. The data themselves have associated metadata
(e.g., FITS keywords), and similarly we require metadata about data collections
and data services so that VO users can easily find information of interest.
Furthermore, such metadata are needed in order to manage distributed queries
efficiently; if a user is interested in finding x-ray images there is no point
in querying the HST archive, for example. In this document we suggest an
architecture for resource and service metadata and describe the relationship of
this architecture to emerging Web Services standards. We also define an initial
set of metadata concepts
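The query-pruning use of resource metadata mentioned in this abstract (no point querying the HST archive for x-ray images) can be sketched as follows. The record fields below are hypothetical and much simpler than the IVOA schema; they only illustrate the selection step.

```python
# Illustrative sketch: resource metadata used to prune a distributed query.
# Only services whose declared coverage matches the request are contacted.

resources = [
    {"title": "HST Archive", "waveband": ["optical", "uv"], "type": "image"},
    {"title": "Chandra Archive", "waveband": ["x-ray"], "type": "image"},
    {"title": "NVSS Catalog", "waveband": ["radio"], "type": "catalog"},
]

def candidates(resources, waveband, rtype):
    """Return titles of services worth querying for this request."""
    return [
        r["title"] for r in resources
        if waveband in r["waveband"] and r["type"] == rtype
    ]

print(candidates(resources, "x-ray", "image"))  # only Chandra qualifies
```

In a registry-based system the same filtering happens against harvested metadata records rather than an in-memory list, but the efficiency argument is identical.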
Role of brain imaging in disorders of brain-gut interaction: a Rome Working Team Report.
Imaging of the living human brain is a powerful tool to probe the interactions between brain, gut and microbiome in health and in disorders of brain-gut interactions, in particular IBS. While altered signals from the viscera contribute to clinical symptoms, the brain integrates these interoceptive signals with emotional, cognitive and memory-related inputs in a non-linear fashion to produce symptoms. Tremendous progress has occurred in the development of new imaging techniques that look at structural, functional and metabolic properties of brain regions and networks. Standardisation in image acquisition and advances in computational approaches have made it possible to study large data sets of imaging studies, identify network properties and integrate them with non-imaging data. These approaches are beginning to generate brain signatures in IBS that share some features with those obtained in other often overlapping chronic pain disorders such as urological pelvic pain syndromes and vulvodynia, suggesting shared mechanisms. Despite this progress, the identification of preclinical vulnerability factors and outcome predictors has been slow. To overcome current obstacles, the creation of consortia and the generation of standardised multisite repositories for brain imaging and metadata from multisite studies are required
White paper – On the use of LiDAR data at AmeriFlux sites
Our aim is to inform the AmeriFlux community on existing and upcoming LiDAR technologies (atmospheric Doppler
or Raman LiDAR often deployed at flux sites are not considered here), how LiDAR is currently used at flux sites, and how
we believe it could, in the future, further contribute to the AmeriFlux vision. Heterogeneity in vegetation and ground
properties at various spatial scales is omnipresent at flux sites, and 3D mapping of canopy, understory, and ground
surface can help move the science forward
Software tools for conducting bibliometric analysis in science: An up-to-date review
Bibliometrics has become an essential tool for assessing and analyzing the output of scientists, cooperation between
universities, the effect of state-owned science funding on national research and development performance and educational
efficiency, among other applications. Therefore, professionals and scientists need a range of theoretical and practical
tools to measure experimental data. This review aims to provide an up-to-date review of the various tools available
for conducting bibliometric and scientometric analyses, including the sources of data acquisition, performance analysis
and visualization tools. The included tools were divided into three categories: general bibliometric and performance
analysis, science mapping analysis, and libraries; a description of all of them is provided. A comparative analysis of the
supported database sources, pre-processing capabilities, and analysis and visualization options is also provided in order to
facilitate their comparison. Although there are numerous bibliometric databases from which to obtain data for bibliometric and
scientometric analysis, they have been developed for different purposes. The number of exportable records ranges between
500 and 50,000, and the coverage of the different science fields is unequal in each database. Concerning the analyzed
tools, Bibliometrix contains the most extensive set of techniques and is suitable for practitioners through Biblioshiny.
VOSviewer offers excellent visualization and is capable of loading and exporting information from many sources. SciMAT
is the tool with the most powerful pre-processing and export capabilities. In view of the variability of features, users need to
decide on the desired analysis output and choose the option that best fits their aims
A global soil spectral calibration library and estimation service
There is growing global interest in the potential for soil reflectance spectroscopy to fill an urgent need for more data on soil properties for improved decision-making on soil security at local to global scales. This is driven by the capability of soil spectroscopy to estimate a wide range of soil properties from a rapid, inexpensive, and highly reproducible measurement using only light. However, several obstacles are preventing wider adoption of soil spectroscopy. The biggest obstacles are the large variation in the soil analytical methods and operating procedures used in different laboratories, poor reproducibility of analyses within and amongst laboratories and a lack of soil physical archives. In addition, adoption is hindered by the expense and complexity of building soil spectral libraries and calibration models. The Global Soil Spectral Calibration Library and Estimation Service is proposed to overcome these obstacles by providing a freely available estimation service based on an open, high quality and diverse spectral calibration library and the extensive soil archives of the Kellogg Soil Survey Laboratory (KSSL) of the Natural Resources Conservation Service of the United States Department of Agriculture (USDA). The initiative is supported by the Global Soil Laboratory Network (GLOSOLAN) of the Global Soil Partnership and the Soil Spectroscopy for Global Good network, which provide additional support through dissemination of standards, capacity development and research. This service is a global public good which stands to benefit soil assessments globally, but especially developing countries where soil data and resources for conventional soil analyses are most limited
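The calibration idea behind such an estimation service can be sketched in miniature: fit a model mapping reflectance to a lab-measured soil property, then estimate that property for new spectra. Real services fit multivariate models over thousands of library spectra; the single band, the data values, and the linear model below are all made up for illustration.

```python
# Toy sketch of spectral calibration: ordinary least squares mapping
# reflectance at one band to a lab-measured soil property, then estimation
# for a new sample. Data and band choice are hypothetical.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Hypothetical calibration pairs: reflectance at 2200 nm vs organic carbon (%).
reflectance = [0.10, 0.20, 0.30, 0.40]
carbon_pct = [3.0, 2.5, 2.0, 1.5]       # darker soils -> more carbon

a, b = fit_line(reflectance, carbon_pct)
estimate = a * 0.25 + b                  # estimate for a new sample's spectrum
print(round(a, 2), round(b, 2), round(estimate, 2))
```

An open calibration library makes exactly this fitting step reproducible: anyone submitting a spectrum to the service gets estimates from models trained on the shared, quality-controlled reference data.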