    A choice of persistent identifier schemes for the Distributed System of Scientific Collections (DiSSCo)

    Persistent identifiers (PIDs) that identify digital representations of physical specimens in natural science collections (i.e., digital specimens) unambiguously and uniquely on the Internet are one of the mechanisms for digitally transforming collections-based science. Digital specimen PIDs contribute to building and maintaining long-term community trust in the accuracy and authenticity of the scientific data to be managed and presented by the Distributed System of Scientific Collections (DiSSCo) research infrastructure, planned in Europe to begin implementation in 2024. Such PIDs are not only valid over the very long timescales common in the heritage sector but can also transcend changes in the underlying technologies used to implement them. They are part of the mechanism for widening access to natural science collections. DiSSCo technical experts previously selected the Handle System as the choice to meet core PID requirements. Using a two-step approach, this options appraisal captures, characterises and analyses alternative Handle-based PID schemes and their possible operational modes of use. In a first step, a weighting and ranking of the options was applied, followed by a structured qualitative assessment of social and technical compliance across several assessment dimensions: scalability, community trust, persistence, governance, appropriateness of the scheme and suitability for future global adoption. The results are discussed in relation to branding, community perceptions and the global context to determine a preferred PID scheme for DiSSCo that also has potential for global adoption and acceptance. DiSSCo will adopt a ‘driven-by DOI’ persistent identifier (PID) scheme customised with natural sciences community characteristics. Establishing a new Registration Agency in collaboration with the International DOI Foundation is a practical way forward to support the FAIR (findable, accessible, interoperable, reusable) data architecture of the DiSSCo research infrastructure. This approach is compatible with the policies of the European Open Science Cloud (EOSC) and is aligned with existing practices across the global community of natural science collections.
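
    As a rough illustration of how a Handle- or DOI-based PID resolves in practice, the sketch below queries the public Handle proxy REST API for a PID record. It is only an illustration of the resolution mechanism the abstract discusses; the example DOI and the expectation of a URL-typed entry are assumptions, not DiSSCo specifics.

```python
# Minimal sketch: resolve a Handle/DOI-style PID through the public Handle
# proxy REST API (https://doi.org/api/handles/<pid>). The PID used here is a
# commonly cited example DOI, not a DiSSCo digital specimen identifier.
import json
import urllib.request

def resolve_pid(pid: str) -> dict:
    """Fetch the Handle record for a PID and return it as a dict."""
    url = f"https://doi.org/api/handles/{pid}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)

if __name__ == "__main__":
    record = resolve_pid("10.1000/1")  # example DOI, placeholder only
    for value in record.get("values", []):
        # Each Handle value carries a type (e.g. "URL") and typed data.
        print(value.get("type"), "->", value.get("data", {}).get("value"))
```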

    History and development of ABCDEFG: a data standard for geosciences

    Museums and their collections use specially customised databases to optimally record their contents and the metadata associated with their specimens. To share, exchange, and publish data, an appropriate data standard is essential. ABCD (Access to Biological Collection Data) is a standard for biological collection units, including living and preserved specimens, together with field observation data. Its extension, EFG (Extension for Geosciences), enables sharing and publishing data related to paleontological, mineralogical, and petrological objects. The standard is very granular and allows detailed descriptions, including information about the collection event itself, the holding institution, stratigraphy, chemical analyses, and host rock. The extension was developed in 2006 and has since been used by different initiatives and applied for the publication of collection-related data in domain-specific and interdisciplinary portals.

    Enabling Digital Specimen and Extended Specimen Concepts in Current Tools and Services

    Digital specimens (Hardisty 2018, Hardisty 2020) are the cyberspace equivalent of objects in a physical, often museum-based collection. They consist of references to data and metadata related to the collection object. Through the ongoing process of digitizing legacy data, gaining knowledge from new field collections or research, and annotating and linking to related resources, a digital specimen can evolve independently from the original physical object. In particular, provenance records cannot always be assigned to the physical object when the knowledge was gained solely from the digital representation.

    A physical specimen can also be understood as a physical preparation (or a set of multiple preparations, e.g. DNA samples taken from a preserved organism) accompanied by related digital and non-digital data sources (e.g. images, descriptions in fieldbooks, research data), rather than just a single object. This concept of an extended specimen has been described by Webster (2017) and is used in the Extended Specimen Network initiative (Lendemer et al. 2019) to enhance the access and research potential of specimens.

    Digital specimens need to reflect both the potential complexity of the physical object (extended specimen) and the knowledge gained from and linked to the digital object itself. In order to provide, track and make use of digital specimens, the community of collection-holding institutions might need to think of digital specimens as standalone virtual collections that emanate from physical collections. Additionally, new versions of a digital specimen continuously derive from changes to the physical specimen as the (meta)data are updated in collection management systems to document the state and treatment of the physical objects. Consequently, the challenge is to enable the management of both: linked digital specimens on the World Wide Web and the local data of physical specimens in the databases, tools and services of collection-holding institutions.

    In this panel discussion, central questions about the requirements, obstacles and opportunities of implementing the concepts of digital specimens and extended specimens in software tools such as collection management systems are discussed. The aim is to identify the major tasks and priorities regarding the transformation of tools and services from multiple perspectives: local collection data management, international data infrastructures like the Distributed System of Scientific Collections (DiSSCo) and the Global Biodiversity Information Facility (GBIF), and data usage outside of domain-specific subject areas.
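
    To make the distinction between the physical object and its evolving digital counterpart concrete, here is a purely illustrative sketch of a digital specimen as a standalone record that references, but versions independently from, the physical specimen. All field names (pid, physical_specimen_id, annotations, version) are hypothetical and do not represent an agreed DiSSCo or Extended Specimen schema.

```python
# Illustrative only: one possible in-memory shape for a digital specimen that
# carries its own identity, links back to the physical object, and accumulates
# annotations/versions independently. Field names are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Annotation:
    author: str        # person or machine agent making the statement
    motivation: str    # e.g. "identification", "georeference correction"
    body: str          # the asserted enhancement

@dataclass
class DigitalSpecimen:
    pid: str                   # persistent identifier of the digital specimen
    physical_specimen_id: str  # catalogue number of the physical object
    media: List[str] = field(default_factory=list)        # links to images, sequences, ...
    annotations: List[Annotation] = field(default_factory=list)
    version: int = 1

    def annotate(self, annotation: Annotation) -> None:
        """Adding knowledge creates a new version of the digital specimen only."""
        self.annotations.append(annotation)
        self.version += 1
```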

    Plenary Discussion - Future of Collection Management Systems

    The DINA Symposium ("DIgital information system for NAtural history data", https://dina-project.net) ends with a plenary session involving the audience to discuss the interplay of collection management and software tools. The discussion will touch on different areas and issues, such as: (1) Collection management using modern technology: How should and could collections be managed using current technology? What is the ultimate objective of using a new collection management system? How should traditional management processes be changed? (2) Development and community: Why are there so many collection management systems? Why is it so difficult to create one system that fits everyone's requirements? How could the community of developers and collection staff be built around the DINA project in the future? (3) Features and tools: How can needs that are common to all collections be identified? What are the new tools and technologies that could facilitate collection management? How could those tools be implemented as DINA-compliant services? (4) Data: What data must be captured about collections and specimens? What criteria need to be applied in order to distinguish essential from "nice-to-have" information? How should established data standards (e.g. Darwin Core and ABCD (Access to Biological Collection Data)) be used to share data from rich and diverse data models? In addition to the plenary discussion around these questions, we will agree on a streamlined format for continuing the discussion in order to write a white paper on these questions. The results and outcome of the session will constitute the basis of that paper and will be refined subsequently.

    The Data Standard ABCD EFG - Access to Biological Collection Data Extended for Geosciences

    The data schema ABCD (Access to Biological Collection Data, version 2.06) is a standard for biological collection units, including living and preserved specimens, together with field observation data. Its extension EFG (Extension for Geosciences) is suitable for sharing and publishing data related to paleontological, mineralogical, and petrological objects. In addition to detailed object descriptions and collection events, ABCD EFG provides fine-grained data structures for information on stratigraphy, chemical analyses and host rock composition. The comprehensive EFG was developed in 2006. Since then it has been used by different initiatives, including the publication of collection-related data in domain-specific and interdisciplinary portals such as GBIF, GeoCASe, GFBio and Europeana. The TDWG Paleo Interest Group meeting 2017 will include a focus on the relationship between Darwin Core and ABCD EFG, and following that theme this presentation will give an introduction to the current state of ABCD EFG, its common use cases and an outlook towards the next version, ABCD 3.0. We expect that the ABCD EFG terms are suitable for embedding elements into the Darwin Core Paleo Context. This talk will therefore also initiate the discussion about the most important elements for further use cases from the paleobiological community.
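
    As a rough sketch of what an ABCD unit with an EFG-style geoscience extension might look like when assembled programmatically, the snippet below builds a tiny XML fragment with the Python standard library. The element names of the extension and the EFG namespace shown here are approximations for illustration only; a real instance document must follow the published ABCD 2.06 and EFG schemas exactly.

```python
# Sketch: build a minimal ABCD-like unit with a geoscience extension block.
# Element names and the EFG namespace are approximations, not a validated
# instance of the published schemas.
import xml.etree.ElementTree as ET

ABCD = "http://www.tdwg.org/schemas/abcd/2.06"   # ABCD 2.06 namespace
EFG = "http://www.synthesys.info/ABCDEFG/1.0"    # assumed EFG namespace; check the schema

ET.register_namespace("abcd", ABCD)
ET.register_namespace("efg", EFG)

unit = ET.Element(f"{{{ABCD}}}Unit")
ET.SubElement(unit, f"{{{ABCD}}}SourceInstitutionID").text = "Example Museum"
ET.SubElement(unit, f"{{{ABCD}}}UnitID").text = "PAL-0001"

# Hypothetical geoscience extension: stratigraphic attribution of the find.
extension = ET.SubElement(unit, f"{{{EFG}}}EarthScienceSpecimen")
ET.SubElement(extension, f"{{{EFG}}}ChronostratigraphicAttribution").text = "Jurassic"

print(ET.tostring(unit, encoding="unicode"))
```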

    The BioCASe Monitor Service - A tool for monitoring progress and quality of data provision through distributed data networks

    The BioCASe Monitor Service (BMS) is a web-based tool for coordinators of distributed data networks that provide information to web portals and data aggregators via the BioCASe Provider Software. Building on common standards and protocols, it has three main purposes: (1) monitoring providers' progress in data provision, (2) facilitating checks of data mappings with a focus on structure, plausibility and completeness, and (3) verifying compliance of provided data for transformation into other target schemas. Two use cases, GBIF-D and OpenUp!, are presented here in which the BMS is applied for monitoring progress in data provision and performing quality checks on the ABCD (Access to Biological Collection Data) schema mapping. However, the BMS can potentially be used with any conceptual data schema and any protocol for querying web services. Through flexible configuration options it is highly adaptable to specific requirements and needs, and can thus be easily integrated into coordination workflows and reporting duties within other distributed data network projects.
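
    A hedged sketch of the kind of check such a monitor performs: query a provider endpoint for its capabilities document and report which concepts of a target schema are mapped. The endpoint URL and the way mapped concepts are located in the response are placeholders; the real BMS configuration and the BioCASe protocol details differ.

```python
# Sketch of a single monitoring check: fetch a provider's capabilities document
# and report which concepts of a target schema (e.g. ABCD) appear to be mapped.
# The URL and the parsing assumption below are placeholders.
import urllib.request
import xml.etree.ElementTree as ET

PROVIDER_URL = "https://example.org/biocase/pywrapper.cgi?dsa=collection&request=capabilities"  # placeholder

def mapped_concepts(capabilities_xml: bytes) -> list:
    """Return the names of mapped concepts found in a capabilities document."""
    root = ET.fromstring(capabilities_xml)
    # Placeholder assumption: mapped concepts appear as <...Concept> elements;
    # a real parser must follow the BioCASe protocol schema.
    return [el.text for el in root.iter() if str(el.tag).endswith("Concept") and el.text]

if __name__ == "__main__":
    with urllib.request.urlopen(PROVIDER_URL, timeout=30) as response:
        concepts = mapped_concepts(response.read())
    print(f"{len(concepts)} concepts mapped")
```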

    Harmonizing plot data with collection data

    Although plot or monitoring data are quite often associated with objects collected in the plot and stored in specific collections, the controlled vocabularies currently available do not cover both disciplines. This limits the possibility of publishing common data sets and consequently leads to a loss of significant information when combining plot-based research with collection-object-associated data. To facilitate the exchange and publication of these important data sets, experts in natural history collection data, ecological research, and environmental science met for a one-day workshop in Berlin. The participants discussed data standards and ontologies relevant to each discipline and collected requirements for a first application schema covering terms important for both collection-object-related data and plot-based research.

    Access to Geosciences – Ways and Means to share and publish collection data

    Natural history collections are invaluable tools for various questions in biodiversity, environmental, and cultural studies. All object metadata therefore need to be findable, accessible and interoperable for the scientific community and beyond. This requires well-structured data, appropriate exchange formats, and websites or portals making all necessary information accessible. Collection managers, curators, and scientists from various institutions and countries were surveyed in order to understand the importance of open geoscientific collections for their holding institutions and their daily work. In addition, particular requirements for the publication of geoscientific collection object metadata were gathered in a two-day workshop with international experts working with paleontological, mineralogical, petrological and meteorite collections. The survey and workshop revealed that common data standards are of crucial importance but are insufficiently used by most institutions. The extent and type of information necessary for publication, as discussed during the workshop, will inform a domain-specific application schema facilitating the publication and exchange of geoscientific object metadata. There is a high demand for comprehensive data portals covering all geoscientific disciplines; the portal requirements gathered will be taken into account when improving the already running GeoCASe aggregator platform.

    I Know Something You Don't Know: The annotation saga continues

    Over the past 20 years, the biodiversity informatics community has pursued components of the digital annotation landscape with varying degrees of success. We will provide a historical overview of the theory and of the advancements made through a few key projects, and will identify some of the ongoing challenges and opportunities. The fundamental principles remain unchanged since annotations were first proposed. Someone (or something): (1) has an enhancement to make elsewhere from the source where original data or information are generated or transcribed; (2) wishes to broadcast these statements to the originator and to others who may benefit; and (3) expects persistence, discoverability, and attribution for their contributions alongside the source.

    The Filtered Push project (Morris et al. 2013) considered several use cases and pioneered the development of services based on the technology of the day. Exchanging data between parties in a universally consistent way necessitated a novel draft standard for data annotations, an extension of the World Wide Web Consortium's Web Annotation Working Group standard (Sanderson et al. 2013), that is sufficiently informative for a data curator to confidently make a decision. Figure 2 from Morris et al. (2013), reproduced here as Fig. 1, outlines the composition of an annotation data package for a taxonomic identification. The package contains the data object(s) associated with an occurrence, an expression of the motivation(s) for updating, some evidence for an assertion, and a stated expectation for how the receiving entity should take action. The Filtered Push and AnnoSys (Tschöpe et al. 2013) projects also considered implementation strategies involving collection management systems (e.g., Symbiota) and portals (e.g., the European Distributed Institute of Taxonomy, EDIT). However, there remain technological barriers for these systems to operate at scale, not the least of which is the absence of globally unique, persistent, resolvable identifiers for shared objects and concepts.

    Major aggregation infrastructures like the Global Biodiversity Information Facility (GBIF) and the Distributed System of Scientific Collections (DiSSCo) rely on data enhancement to improve the quality of their resources and have annotation services in their work plans. More recently, the Digital Extended Specimen (DES) concept (Hardisty et al. 2022) has emerged, which will rely on annotation services as key components of the proposed infrastructure. Recent work on annotation services more generally has considered various new forms of packaging and delivery such as Frictionless Data (Fowler et al. 2018), Journal Article Tag Suite XML (Agosti et al. 2022), or nanopublications (Kuhn et al. 2018). There is a risk of fragmentation of this landscape and of disenfranchisement of both biological collections and the wider research community if we fail to align the purpose, content, and structure of these packages, or if these fail to remain aligned with the FAIR principles.

    Institutional collection management systems currently represent the canonical data store that provides data to researchers and data aggregators. It is critical that information and feedback about the data they release be round-tripped back to them for consideration. However, the sheer volume of annotations that could be generated by both human and machine curation processes will overwhelm local data curators and the systems supporting them. One solution is to create a central annotation store with write and discovery services that best support the needs of all stewards of data. This will require an international consortium of parties with a governance and technical model to assure its sustainability.
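
    For orientation, the sketch below assembles an annotation package for a proposed re-identification in the style of the W3C Web Annotation model on which the Filtered Push extension builds. The target and creator identifiers are invented, and the Filtered Push-specific notions of evidence and expected action are only loosely approximated by the extra fields at the end.

```python
# Loose sketch of an annotation package for a proposed taxonomic re-identification,
# modelled on the W3C Web Annotation vocabulary. Identifiers are invented, and the
# evidence/expectation fields are approximations, not standardised terms.
import json

annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "motivation": "editing",                              # why the statement is made
    "creator": "https://orcid.org/0000-0000-0000-0000",   # placeholder agent
    "target": "https://example.org/specimen/ABC-123",     # occurrence record (invented)
    "body": {
        "type": "TextualBody",
        "value": "Proposed identification: Quercus robur L.",
    },
    # Approximation of the Filtered Push ideas of evidence and expectation:
    "evidence": "determination by expert review of attached images",
    "expectedAction": "data curator reviews and updates the identification",
}

print(json.dumps(annotation, indent=2))
```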