111,441 research outputs found
Big data: the potential role of research data management and research data registries
Universities generate and hold increasingly vast quantities of research data – both in the form of large, well-structured datasets but more often in the form of a long tail of small, distributed datasets which collectively amount to ‘Big Data’ and offer significant potential for reuse. However, unlike big data, these collections of small data are often less well curated and are usually very difficult to find thereby reducing their potential reuse value. The Digital Curation Centre (DCC) works to support UK universities to better manage and expose their research data so that its full value may be realised. With a focus on tapping into this long tail of small data, this presentation will cover two main DCC, services: DMPonline which helps researchers to identify potentially valuable research data and to plan for its longer-term retention and reuse; and the UK pilot research data registry and discovery service (RDRDS) which will help to ensure that research data produced in UK HEIs can be found, understood, and reused.
Initially we will introduce participants to the role of data management planning to open up dialogue between researchers and library services to ensure potentially valuable research data are managed appropriately and made available for reuse where feasible. DMPs provide institutions with valuable insights into the scale of their data holdings, highlight any ethical and legal requirements that need to be met, and enable planning for dissemination and reuse. We will also introduce the DCC’s DMPonline, a tool to help researchers write DMPs, which can be customised by institutions and integrated with other systems to simplify and enhance the management and reuse of data.
In the second part of the presentation we will focus on making selected research data more visible for reuse and explore the potential value of local and national research data registries. In particular we will highlight the Jisc-funded RDRDS pilot to establish a UK national service that aggregates metadata relating to data collections held in research institutions and subject data centres. The session will conclude by exploring some of the opportunities we may collaboratively explore in facilitating the management, aggregation and reuse of research data
Seafloor characterization using airborne hyperspectral co-registration procedures independent from attitude and positioning sensors
The advance of remote-sensing technology and data-storage capabilities has progressed in the last decade to commercial multi-sensor data collection. There is a constant need to characterize, quantify and monitor the coastal areas for habitat research and coastal management. In this paper, we present work on seafloor characterization that uses hyperspectral imagery (HSI). The HSI data allows the operator to extend seafloor characterization from multibeam backscatter towards land and thus creates a seamless ocean-to-land characterization of the littoral zone
Global Innovations in Measurement and Evaluation
We researched the latest developments in theory and practice in measurement and evaluation. And we found that new thinking, techniques, and technology are influencing and improving practice. This report highlights 8 developments that we think have the greatest potential to improve evaluation and programme design, and the careful collection and use of data. In it, we seek to inform and inspire—to celebrate what is possible, and encourage wider application of these ideas
PANGAEA information system for glaciological data management
Specific parameters determined on cores from continental ice sheets or glaciers can be used to reconstruct former climate. To use this scientific resource effectively an information system is needed which guarantees consistent longtime storage of data and provides easy access for the scientific community.An information system to archive any data of paleoclimatic relevance, together with the related metadata, raw data and evaluated paleoclimatic data, is presented. The system, based on a relational database, provides standardized import and export routines, easy access with uniform retrieval functions, and tools for the visualization of the data. The network is designed as a client/server system providing access through the Internet with proprietary client software including a high functionality or read-only access on published data via the World Wide Web
Towards Exascale Scientific Metadata Management
Advances in technology and computing hardware are enabling scientists from
all areas of science to produce massive amounts of data using large-scale
simulations or observational facilities. In this era of data deluge, effective
coordination between the data production and the analysis phases hinges on the
availability of metadata that describe the scientific datasets. Existing
workflow engines have been capturing a limited form of metadata to provide
provenance information about the identity and lineage of the data. However,
much of the data produced by simulations, experiments, and analyses still need
to be annotated manually in an ad hoc manner by domain scientists. Systematic
and transparent acquisition of rich metadata becomes a crucial prerequisite to
sustain and accelerate the pace of scientific innovation. Yet, ubiquitous and
domain-agnostic metadata management infrastructure that can meet the demands of
extreme-scale science is notable by its absence.
To address this gap in scientific data management research and practice, we
present our vision for an integrated approach that (1) automatically captures
and manipulates information-rich metadata while the data is being produced or
analyzed and (2) stores metadata within each dataset to permeate
metadata-oblivious processes and to query metadata through established and
standardized data access interfaces. We motivate the need for the proposed
integrated approach using applications from plasma physics, climate modeling
and neuroscience, and then discuss research challenges and possible solutions
The space physics environment data analysis system (SPEDAS)
With the advent of the Heliophysics/Geospace System Observatory (H/GSO), a complement of multi-spacecraft missions and ground-based observatories to study the space environment, data retrieval, analysis, and visualization of space physics data can be daunting. The Space Physics Environment Data Analysis System (SPEDAS), a grass-roots software development platform (www.spedas.org), is now officially supported by NASA Heliophysics as part of its data environment infrastructure. It serves more than a dozen space missions and ground observatories and can integrate the full complement of past and upcoming space physics missions with minimal resources, following clear, simple, and well-proven guidelines. Free, modular and configurable to the needs of individual missions, it works in both command-line (ideal for experienced users) and Graphical User Interface (GUI) mode (reducing the learning curve for first-time users). Both options have “crib-sheets,” user-command sequences in ASCII format that can facilitate record-and-repeat actions, especially for complex operations and plotting. Crib-sheets enhance scientific interactions, as users can move rapidly and accurately from exchanges of technical information on data processing to efficient discussions regarding data interpretation and science. SPEDAS can readily query and ingest all International Solar Terrestrial Physics (ISTP)-compatible products from the Space Physics Data Facility (SPDF), enabling access to a vast collection of historic and current mission data. The planned incorporation of Heliophysics Application Programmer’s Interface (HAPI) standards will facilitate data ingestion from distributed datasets that adhere to these standards. Although SPEDAS is currently Interactive Data Language (IDL)-based (and interfaces to Java-based tools such as Autoplot), efforts are under-way to expand it further to work with python (first as an interface tool and potentially even receiving an under-the-hood replacement). We review the SPEDAS development history, goals, and current implementation. We explain its “modes of use” with examples geared for users and outline its technical implementation and requirements with software developers in mind. We also describe SPEDAS personnel and software management, interfaces with other organizations, resources and support structure available to the community, and future development plans.Published versio
- …