2 research outputs found

    A Tool for Automatic Metadata Extraction and Schema Mapping for SEM Images

    Standardized metadata and its proper storage are essential for effective management of scientific research data. The challenge lies in manually compiling such metadata, a process which can be both tedious and prone to human error. To address this problem, we introduce the Mapping Service, developed within the framework of HMC. The Mapping Service helps to streamline the process of metadata extraction and mapping according to existing community-agreed schemas. This tool has been designed as an adaptable and extensible service suitable across various research disciplines. It allows for add-ons which facilitate the extraction of metadata from otherwise proprietary and non-standard file formats and the mapping to schemas, which strengthens the metadata’s interoperability and reusability. The Mapping Service functions as a platform hosting a diverse suite of plugins. These plugins, each equipped with two primary components—a reader and a mapper—are instrumental in achieving metadata extraction and mapping for various experimental techniques. In many research environments, accessing metadata is a bottleneck due to proprietary formats that demand specific software, often leading researchers to manually transcribe unstructured and poorly-documented metadata embedded within, for example, research images. This manual approach, especially for large datasets comprising hundreds of files, is not only time-consuming but also introduces the potential for human errors. The Mapping Service elegantly addresses these challenges: the reader retrieves metadata from a set of diverse research data, such as images or metadata files, while the mapper discerningly selects key variables prescribed by the user-selected schema from the extracted metadata. These variables are then mapped to their respective schema names, resulting in a systematically formatted JSON metadata document. 
Through this poster, we showcase one use case via the mapping of Scanning Electron Microscopy (SEM)/Focused Ion Beam (FIB) tomography metadata to our published schema. This functionality is available as a plugin or “Mapping Component” on the Mapping Service and shows how a large and complex dataset containing many research images may be easily and efficiently transformed into a single metadata document using an intuitive user interface. Though the poster showcases the use case of SEM/FIB tomography, the Mapping Service has been designed to be a general-purpose tool. Additional plugins tailored to map from one arbitrary schema to another can easily be integrated, and a suite of such plugins is currently in development. Additionally, the service’s standalone web service architecture and user interface ensure ease of adoption without any local dependencies or installations required by end users. In summary, researchers across various fields seeking a streamlined approach to consistent metadata processing will find the Mapping Service to be a useful tool. With this poster, we aim to illuminate the advantages of the Mapping Service’s automated extraction and mapping capabilities as well as the necessary considerations and prerequisites for its implementation in a research environment.
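The reader/mapper split described in this abstract can be illustrated with a minimal sketch. It is not the Mapping Service's actual plugin interface: the function names, the `key = value` header format, and the field names in `SCHEMA_MAP` are all hypothetical and chosen only to show the idea of extracting raw metadata, selecting the variables a schema prescribes, and renaming them into a JSON document.

```python
import json


def read_metadata(raw_header: str) -> dict:
    """Reader (hypothetical): parse 'key = value' lines into a flat dict."""
    metadata = {}
    for line in raw_header.splitlines():
        if "=" not in line:
            continue  # skip lines that carry no key/value pair
        key, value = line.split("=", 1)
        metadata[key.strip()] = value.strip()
    return metadata


# Hypothetical mapping from extracted keys to schema field names.
# These names are illustrative, not taken from the published SEM schema.
SCHEMA_MAP = {
    "HV": "accelerationVoltage",
    "WD": "workingDistance",
    "PixelWidth": "pixelSize",
}


def map_metadata(extracted: dict, schema_map: dict) -> dict:
    """Mapper (hypothetical): keep only schema-relevant keys and rename them."""
    return {schema_map[k]: v for k, v in extracted.items() if k in schema_map}


def to_json_document(raw_header: str) -> str:
    """Run reader then mapper and serialize the result as JSON."""
    mapped = map_metadata(read_metadata(raw_header), SCHEMA_MAP)
    return json.dumps(mapped, indent=2, sort_keys=True)
```

Calling `to_json_document` on a header string drops every key that the mapping does not list, so only the variables named in the selected schema reach the output document. Values stay as strings in this sketch; a real mapper would presumably also handle units and type conversion.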

    FAIR Data Commons / Essential Services and Tools for Metadata Management Supporting Science

    A sophisticated ensemble of services and tools enables high-level research data and research metadata management in science. On a technical level, research datasets need to be registered, preserved, and made interactively accessible using repositories that meet the specific requirements of scientists in terms of flexibility and performance. These requirements are fulfilled by the Base Repo and the MetaStore of the KIT Data Manager Framework. In our data management architecture, data and metadata are represented as FAIR Digital Objects that are machine actionable. The Typed PID Maker and the FAIR Digital Object Lab provide support for the creation and management of data objects. Other tools enable editing of metadata documents, annotation of data and metadata, building collections of data objects, and creating controlled vocabularies. Information systems such as the Metadata Standards Catalog and the Data Collections Explorer help researchers select domain-specific metadata standards and schemas and identify data collections of interest. Infrastructure developers search the Catalog of Repository Systems for information on modern repository systems, and the FAIR Digital Object Cookbook for recipes for creating FAIR Digital Objects. Existing knowledge about metadata management, services, tools, and information systems has been applied to create research data management architectures for a variety of fields, including digital humanities, materials science, biology, and nanoscience. For Scanning Electron Microscopy, Transmission Electron Microscopy and Magnetic Resonance Imaging, metadata schemas were developed in close cooperation with the domain specialists and incorporated in the research data management architectures. 
This research has been supported by the research program ‘Engineering Digital Futures’ of the Helmholtz Association of German Research Centers, the Helmholtz Metadata Collaboration (HMC) Platform, the German National Research Data Infrastructure (NFDI), the German Research Foundation (DFG) and the Joint Lab “Integrated Model and Data Driven Materials Characterization (MDMC)”. Also, this project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 101007417 within the framework of the NFFA-Europe Pilot (NEP) Joint Activities.