Data Collections Explorer
For research data to be used efficiently, it must be easy to find and access. This is a requirement in all areas of science. The Data Collections Explorer, developed within NFDI4Ing for the engineering sciences, targets these needs. It is an information system that provides an overview of research data repositories, archives, databases as well as individual datasets published in the field. Two use cases are considered:
1. Scientists searching for data sets: Are there datasets available to aid in your research? Are there benchmarks available to check your results? Are these datasets available under an open access license?
2. Scientists aiming to publish data sets: Among community-specific repositories, which ones are suitable to publish the research data? Do repositories restrict the size of the datasets that can be uploaded and if so, what are the limits? Are publication fees charged and if so, how much is charged?
To facilitate answering these questions, the Data Collections Explorer provides both a free text search and filters for type of service, subject area, and access license. Where appropriate and available, information on data size limits and publishing fees is provided.
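The combination of free text search and filters described above can be sketched in a few lines of Python. This is only an illustration: the `Entry` fields, the `search` function, and the sample records are assumptions made for this example and do not reflect the actual data model of the Data Collections Explorer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """One catalogue entry; all field names are illustrative assumptions."""
    name: str
    description: str
    service_type: str                      # e.g. "repository", "archive", "dataset"
    subject_area: str
    license: str
    size_limit_gb: Optional[float] = None  # None when no limit is known
    publication_fee: Optional[str] = None  # None when no fee is known


def search(entries, text="", service_type=None, subject_area=None, license=None):
    """Return the entries that match the free text and every filter that is set."""
    needle = text.casefold()
    results = []
    for entry in entries:
        haystack = f"{entry.name} {entry.description}".casefold()
        if needle and needle not in haystack:
            continue
        if service_type and entry.service_type != service_type:
            continue
        if subject_area and entry.subject_area != subject_area:
            continue
        if license and entry.license != license:
            continue
        results.append(entry)
    return results


# Made-up sample data, not taken from the real service.
ENTRIES = [
    Entry("Example Repo", "General engineering repository", "repository",
          "mechanical engineering", "CC-BY-4.0", size_limit_gb=50.0),
    Entry("Flow Benchmarks", "Benchmark datasets for turbulent flow", "dataset",
          "fluid mechanics", "CC0-1.0"),
    Entry("Closed Archive", "Institutional archive of test rig data", "archive",
          "mechanical engineering", "restricted"),
]

hits = search(ENTRIES, text="benchmark", license="CC0-1.0")
print([entry.name for entry in hits])
```

Filters that are left unset are ignored, so the same function covers both a pure free text search and a pure filter query.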
The Data Collections Explorer complements re3data, as it includes entries that are outside re3data's scope or are not listed there.
This concept is not limited to the engineering sciences. To broaden the impact, we are currently working on expanding the Data Collections Explorer to the materials science and engineering community within NFDI-MatWerk.
This work is supported by the NFDI4Ing consortium (DFG, project number 442146713), the German Research Foundation (DFG), the research program "Engineering Digital Futures" by the Helmholtz Research Association, and the Helmholtz Metadata Collaboration (HMC) Platform.
A Lightning Introduction to the NFDI4Ing Data Collections Explorer
These are slides for a short lightning talk given at the TS4NFDI online workshop on 15 February 2024. They give a very short introduction to what the Data Collections Explorer is and the current state of the work.
A Short Introduction to the Data Collections Explorer
This is a short introduction to the Data Collections Explorer, presented as part of the NFDI4Ing SIG RDM Training & Education Meeting on 6 October 2023.
Data Collections Explorer - An Easy-to-Use Tool for Sharing and Discovering Research Data
There is a wide variety of archives, databases, and repositories currently available that provide access to research data. However, basic information about these systems is often difficult to gather, such as whether there are limits to the size of data sets that can be published or whether any publication fee applies. In addition, many research groups publish their research data sets independently of these infrastructures, which makes them difficult for scientists to find because they are not centrally registered. Research data must be easily discoverable and accessible for scientists to use it effectively. The Data Collections Explorer, developed within the national research data infrastructure for the engineering sciences NFDI4Ing, is an easy-to-use information system addressing these needs. It is a low-threshold information system that provides an overview of research data repositories, archives, and databases as well as individually published data sets. Similar systems exist in other subject areas, for example the Data Repository Finder, which focuses on the medical, life, and social sciences. Unlike the Data Collections Explorer, the Data Repository Finder only lists repositories.
Data Collections Explorer - An Information System for the Engineering Sciences
The poster provides an overview of and an introduction to the Data Collections Explorer. It is an information system for engineering-related resources, such as repositories, databases, and archives, as well as datasets published independently of these structures. In addition, the poster describes current developments and provides an outlook on future work.
The target audience is scientists who are either searching for data sets or looking to publish their own.
Seeking to share or publish their research data sets, scientists are confronted with questions such as: what is an appropriate repository for publishing the data sets? Do repositories have size restrictions with respect to the data sets that can be uploaded? What is this limit? Are publication fees charged?
Searching for data sets, scientists are often confronted with questions such as: are data sets available to aid in research? Are benchmarks available to check results? Are these data sets available under an open access license?
The Data Collections Explorer is a low-threshold web service to share and discover research data. To help get an overview and answer the above questions, it provides a free-text search as well as drop-down menus to filter its entries by the following criteria:
• Hosting Institution
• Type of Service
• Subject Area
• Open Access
A stable version is currently in use by NFDI4Ing (https://data-collections.nfdi4ing.de). The current development version features a graph-based data model. This knowledge graph can be accessed through a SPARQL interface and provides more flexibility than the currently deployed version. A preview version will be made available later this year.
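As a rough illustration of what querying such a knowledge graph through a SPARQL interface might look like, the following Python sketch builds a query and the matching GET request URL. The endpoint URL and the use of Dublin Core terms (`dct:title`, `dct:subject`, `dct:license`) are assumptions made for this example; the actual endpoint and schema of the development version are not described here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real SPARQL endpoint is not given in the text.
ENDPOINT = "https://example.org/sparql"


def escape_literal(value: str) -> str:
    """Escape characters that may not appear raw in a SPARQL string literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def build_query(subject_area: str, limit: int = 10) -> str:
    """List entries whose subject contains the given text (assumed schema)."""
    needle = escape_literal(subject_area)
    return f"""PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?entry ?title ?license WHERE {{
  ?entry dct:title ?title ;
         dct:subject ?subject .
  OPTIONAL {{ ?entry dct:license ?license }}
  FILTER(CONTAINS(LCASE(STR(?subject)), LCASE("{needle}")))
}}
LIMIT {int(limit)}"""


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as a GET request with a 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


query = build_query("fluid mechanics", limit=5)
print(query)
print(build_request_url(query))
```

The sketch only constructs the request and does not send it, so it runs without network access. Passing the query in a `query` parameter follows the SPARQL 1.1 Protocol.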
A Controlled Vocabulary for Acronyms of NFDI-MatWerk Using the Vocabulary Service EVOKS
Controlled vocabularies are comprehensive collections of domain-specific terms used to describe knowledge within a specific field. They help overcome data ambiguity and offer benefits such as referencing term definitions, promoting semantic interoperability, and facilitating the integration of ontologies. EVOKS is a general-purpose vocabulary service that enables data stewards and scientists to easily create, edit, curate, and publish controlled vocabularies using the W3C recommended SKOS data model. The service ensures access to published vocabularies through the implementation of SKOSMOS, a dedicated vocabulary browser instance. NFDI-MatWerk provides a publicly accessible SKOSMOS instance for publishing vocabularies and an EVOKS instance for editing vocabularies exclusively for its members. The poster showcases the MatWerk Acronyms Vocabulary as an example and highlights the key features and benefits of using EVOKS, including collaborative work, the application of persistent URLs to vocabulary terms, and the integration of controlled vocabularies into metadata schemas.
Using EVOKS to build controlled vocabularies
Controlled vocabularies are used to describe knowledge within a particular domain, encompassing a comprehensive collection of domain-specific terms. Using controlled vocabularies not only mitigates the challenge of data ambiguity, but also offers several advantages, including references to term definitions, particularly within metadata schemas. Additionally, they foster semantic interoperability and facilitate the seamless integration of ontologies.
EVOKS, the Editor for Vocabularies to Know Semantics, is a general-purpose vocabulary service which allows data stewards and scientists to easily create or import, edit, curate and publish controlled vocabularies using the W3C recommended SKOS data model [1]. Access to published vocabularies is ensured through the implementation of SKOSMOS [2] as a dedicated vocabulary browser instance.
We provide a publicly accessible SKOSMOS instance for NFDI-MatWerk [3], where vocabularies created within NFDI-MatWerk can be published. Additionally, members of NFDI-MatWerk have access to an exemplary EVOKS instance for editing vocabularies [4].
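To give an impression of what a single term looks like in the SKOS data model [1], the following Python sketch serialises one concept as Turtle. The scheme IRI, the helper names, and the sample term are assumptions made for this example; it does not show how EVOKS stores or exports vocabularies internally.

```python
def turtle_literal(text: str, lang: str = "en") -> str:
    """Quote a string as a language-tagged Turtle literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def concept_to_turtle(scheme_iri: str, local_name: str,
                      pref_label: str, definition: str) -> str:
    """Serialise one SKOS concept with a label, a definition, and its scheme."""
    concept_iri = f"{scheme_iri}/{local_name}"
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"<{concept_iri}> a skos:Concept ;",
        f"    skos:prefLabel {turtle_literal(pref_label)} ;",
        f"    skos:definition {turtle_literal(definition)} ;",
        f"    skos:inScheme <{scheme_iri}> .",
    ]
    return "\n".join(lines)


# Hypothetical scheme IRI and term, chosen only for illustration.
print(concept_to_turtle(
    "https://example.org/vocab/acronyms",
    "FAIR",
    "FAIR",
    "Findable, Accessible, Interoperable, Reusable",
))
```

Each concept has its own IRI, which is what allows a definition to be referenced from elsewhere, for example from a metadata schema.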
Taking the MatWerk Acronyms Vocabulary [5] as an example, the poster shows basic usage and benefits of using EVOKS. In particular:
• Creating and editing of controlled vocabularies
• Collaboratively working on vocabularies
• Advantages of applying persistent URLs to the vocabulary and vocabulary terms
The NFDI-MatWerk Acronyms Vocabulary consists of roughly 70 terms and required approximately two working days to be created.
As an application use case, the poster also demonstrates the integration of controlled vocabularies into metadata schemas. Specifically, it illustrates how the narrower concepts of a term from a controlled vocabulary appear as selectable options in the schema's drop-down menu when a metadata editor interface is configured.
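One way to picture this integration is to derive a JSON Schema property from a vocabulary term, using the labels of its narrower concepts as the allowed values. The sketch below is an assumption-laden illustration: the in-memory `VOCAB` structure, the `dropdown_property` helper, and the sample terms are hypothetical, and the idea that an editor renders an `enum` as a drop-down menu is typical of schema-driven editors rather than something stated in the text.

```python
import json

# Hypothetical vocabulary fragment: each concept has a label and a list of
# identifiers of its narrower concepts.
VOCAB = {
    "method": {"label": "Characterisation method", "narrower": ["sem", "tem", "xrd"]},
    "sem": {"label": "Scanning electron microscopy", "narrower": []},
    "tem": {"label": "Transmission electron microscopy", "narrower": []},
    "xrd": {"label": "X-ray diffraction", "narrower": []},
}


def dropdown_property(vocab: dict, concept_id: str) -> dict:
    """Build a schema property whose options are the narrower concepts' labels."""
    concept = vocab[concept_id]
    return {
        "title": concept["label"],
        "type": "string",
        "enum": [vocab[child]["label"] for child in concept["narrower"]],
    }


schema = {
    "type": "object",
    "properties": {"method": dropdown_property(VOCAB, "method")},
}
print(json.dumps(schema, indent=2))
```

Because the options are generated from the vocabulary, adding a narrower concept there would extend the menu without editing the schema by hand.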
The EVOKS interface is designed to be very intuitive and user-friendly: users can quickly familiarise themselves with the platform and navigate its features without significant time investment. This service speeds up and facilitates the process of creating controlled vocabularies, which allows for a common understanding of terms, eases interoperability, and contributes to FAIRness.
This work has been supported by NFDI-MatWerk (DFG, no. 460247524), NFDI4Ing (DFG, no. 442146713), NFFA-Europe-Pilot (EU H2020, no. 101007417), and by the Helmholtz Research Association with the research program "Engineering Digital Futures" and the Helmholtz Metadata Collaboration (HMC) platform.
References
[1] http://www.w3.org/TR/skos-reference
[2] https://skosmos.org/
[3] https://matwerk.datamanager.kit.edu/evoks/
[4] https://evoks.matwerk.datamanager.kit.edu
[5] https://purls.helmholtz-metadaten.de/evoks/MatWerkAcronyms
NFDI-MatWerk - Reference Datasets
Within NFDI-MatWerk ("National Research Data Infrastructure for Material Sciences" / "Nationale Forschungsdateninfrastruktur für Materialwissenschaften und Werkstofftechnik"), the Task Area Materials Data Infrastructure (TA-MDI) will provide tools and services to easily store, share, search, and analyze data and metadata. Such a digital materials environment will ensure data integrity, provenance, and authorship. The MatWerk consortium aims to develop specific solutions jointly with Participant Projects (PPs), which are scientific groups or institutes covering different domains, from theory and simulations to experiments. The Data Exploitation Methods group of the Steinbuch Centre for Computing at the Karlsruhe Institute of Technology, as part of TA-MDI, is developing specific solutions in close collaboration with three PPs.
PP07, together with the University of Stuttgart, aims at the image-based prediction of the material properties of stochastic microstructures using high-performance solvers and machine learning. PP13, in cooperation with the University of Saarland, focuses on tomographic methods at various scales in materials research. PP18, together with the Federal Institute for Materials Research and Testing ("Bundesanstalt für Materialforschung und -prüfung"), aspires to define the criteria for materials reference datasets and usage analytics. The requirements and goals are comparable for each PP: their research outputs, which are scientific datasets, should conform to the FAIR (Findable, Accessible, Interoperable, Reusable) principles. We aim to shape them from a data management perspective, making use of the FAIR Digital Object concept, including structured metadata and storage solutions. The result will be a blueprint that acts as a reference for future datasets. Even though the collaboration is at an early stage, the initial steps already show the added value of this approach.