
    Interoperability and FAIRness through a novel combination of Web technologies

    Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories, ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, Dataverse or EUDAT). These data have widely different levels of sensitivity and security considerations. For example, clinical observations about genetic mutations in patients are highly sensitive, while observations of species diversity are generally not. The lack of uniformity in data models from one repository to another, and in the richness and availability of metadata descriptions, makes integration and analysis of these data a manual, time-consuming task with no scalability. Here we explore a set of resource-oriented Web design patterns for data discovery, accessibility, transformation, and integration that can be implemented by any general- or special-purpose repository as a means to assist users in finding and reusing their data holdings. We show that by using off-the-shelf technologies, interoperability can be achieved at the level of an individual spreadsheet cell. We note that the behaviours of this architecture compare favourably to the desiderata defined by the FAIR Data Principles, and can therefore represent an exemplar implementation of those principles. The proposed interoperability design patterns may be used to improve discovery and integration of both new and legacy data, maximizing the utility of all scholarly outputs.
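
    A central resource-oriented pattern of the kind described above is resolving a single identifier with HTTP content negotiation, so that machines receive machine-readable metadata while browsers receive HTML. The minimal Python sketch below illustrates that pattern; the record URI is hypothetical and the formats actually served depend on the repository.

# Minimal sketch of identifier resolution with content negotiation:
# one URI, several possible representations. The URI is a placeholder,
# not a real repository record.
import requests

record_uri = "https://example.org/repository/dataset/42"  # hypothetical identifier

# Prefer machine-readable metadata, fall back to HTML.
for accept in ("text/turtle", "application/ld+json", "text/html"):
    response = requests.get(record_uri, headers={"Accept": accept}, timeout=10)
    if response.ok:
        print(f"Served as {response.headers.get('Content-Type')}:")
        print(response.text[:500])
        break
else:
    print("No representation retrieved for", record_uri)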

    The OBAA Standard for Developing Repositories of Learning Objects: the Case of Ocean Literacy in Azores

    This paper describes existing web resources of learning objects that promote ocean literacy. Several projects and sites are explored and their shortcomings revealed. The limitations identified include insufficient metadata about registered learning objects and a lack of support for intelligent applications. As a solution, we promote the seaThings project, which relies on a multi-disciplinary approach to promoting literacy in the marine environment by implementing a specific Learning Object Repository (LOR) and a federation of repositories (FED), supported by OBAA, a versatile and innovative standard that provides the necessary support for intelligent educational applications to be used in schools and other educational institutions.

    Constructing a biodiversity terminological inventory.

    The increasing growth of literature in biodiversity presents challenges to users who need to discover pertinent information in an efficient and timely manner. In response, text mining techniques offer solutions by facilitating the automated discovery of knowledge from large textual data. An important step in text mining is the recognition of concepts via their linguistic realisation, i.e., terms. However, a given concept may be referred to in text using various synonyms or term variants, making search systems likely to overlook documents mentioning lesser-known variants, which are nevertheless relevant to a query term. Domain-specific terminological resources, which include term variants, synonyms and related terms, are thus important in supporting semantic search over large textual archives. This article describes the use of text mining methods for the automatic construction of a large-scale biodiversity term inventory. The inventory consists of names of species, amongst which naming variations are prevalent. We apply a number of distributional semantic techniques to all of the titles in the Biodiversity Heritage Library to compute semantic similarity between species names and support the automated construction of the resource. With the construction of our biodiversity term inventory, we demonstrate that distributional semantic models are able to identify semantically similar names that are not yet recorded in existing taxonomies. Such methods can thus be used to update existing taxonomies semi-automatically by deriving semantically related taxonomic names from a text corpus and allowing expert curators to validate them. We also evaluate our inventory as a means to improve search by facilitating automatic query expansion. Specifically, we developed a visual search interface that suggests semantically related species names, which are available in our inventory but not always in other repositories, to incorporate into the search query. An assessment of the interface by domain experts reveals that our query expansion based on related names is useful for increasing the number of relevant documents retrieved. Its exploitation can benefit both users and developers of search engines and text mining applications.
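
    As a rough illustration of the distributional-semantics step described above, the sketch below trains a small word-embedding model on a handful of invented title strings and queries it for neighbours of a species epithet. The titles are illustrative stand-ins for the Biodiversity Heritage Library corpus; the actual inventory relies on far larger data, stronger models, and expert validation.

# Toy sketch: learn embeddings from (made-up) title text and use nearest
# neighbours of a name as query-expansion candidates.
from gensim.models import Word2Vec

titles = [
    "notes on puma concolor in the southern andes",
    "distribution of felis concolor in north america",
    "observations of the cougar puma concolor",
    "the mountain lion felis concolor of california",
]
sentences = [t.split() for t in titles]

model = Word2Vec(sentences, vector_size=32, window=3, min_count=1, epochs=200, seed=1)

# Names occurring in similar contexts end up close in the vector space,
# which is how naming variants can be surfaced for query expansion.
for name, score in model.wv.most_similar("concolor", topn=3):
    print(f"{name}\t{score:.2f}")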

    The ORKG R Package and Its Use in Data Science

    Research infrastructures and services provide access to (meta)data via user interfaces and APIs. The more advanced services also support access through (Python, R, etc.) packages that users can use in computational environments. For scientific information as a particular kind of research data, the Open Research Knowledge Graph (ORKG) is an example of an advanced service that also supports accessing data from Python scripts. Since many research communities use R as the statistical language of choice, we have developed the ORKG R package to support accessing and processing ORKG data directly from R scripts. Inspired by the Python library, the ORKG R package supports a comparable set of features through a similar programmatic interface. Having developed the ORKG R package, we demonstrate its use in various applications grounded in the life science and soil science research fields. As an additional key contribution of this work, we show how the ORKG R package can be used in combination with ORKG templates to support the pre-publication production and publication of machine-readable scientific information, during the data analysis phase of the research life cycle and directly in the scripts that produce scientific information. This new mode of machine-readable scientific information production complements the post-publication crowdsourcing-based manual and NLP-based automated approaches, with the major advantages of high accuracy and fine granularity.
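
    The package described above is an R library; as a language-neutral companion sketch, the Python snippet below retrieves a few ORKG resources over HTTP. The API base URL, endpoint path, query parameters and response shape shown here are assumptions made for illustration, not details taken from the paper or the package documentation.

# Hedged sketch of programmatic ORKG access over an assumed REST endpoint.
import requests

ORKG_API = "https://orkg.org/api"  # assumed base URL of the ORKG REST API

def search_resources(query: str, size: int = 5) -> dict:
    """Return a page of ORKG resources whose labels match the query string."""
    response = requests.get(
        f"{ORKG_API}/resources/",
        params={"q": query, "size": size},  # parameter names are assumptions
        headers={"Accept": "application/json"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    # "content" assumes a paged response; adjust to the actual response shape.
    for item in search_resources("biodiversity").get("content", []):
        print(item.get("id"), item.get("label"))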

    User-centered semantic dataset retrieval

    Finding relevant research data is an increasingly important but time-consuming task in daily research practice. Several studies report difficulties in dataset search, e.g., scholars retrieve only partially pertinent data, and important information cannot be displayed in the user interface. Overcoming these problems has motivated a number of research efforts in computer science, such as text mining and semantic search. In particular, the emergence of the Semantic Web opens a variety of novel research perspectives. Motivated by these challenges, the overall aim of this work is to analyze the current obstacles in dataset search and to propose and develop a novel semantic dataset search. The studied domain is biodiversity research, a domain that explores the diversity of life, habitats and ecosystems. This thesis has three main contributions: (1) We evaluate the current situation in dataset search in a user study, and we compare a semantic search with a classical keyword search to explore the suitability of semantic web technologies for dataset search. (2) We generate a question corpus and develop an information model to determine which scientific topics scholars in biodiversity research are interested in. Moreover, we also analyze the gap between current metadata and scholarly search interests, and we explore whether metadata and user interests match. (3) We propose and develop an improved dataset search based on three components: (A) a text mining pipeline, enriching metadata and queries with semantic categories and URIs, (B) a retrieval component with a semantic index over categories and URIs and (C) a user interface that enables a search within categories and a search including further hierarchical relations. Following user-centered design principles, we ensure user involvement in various user studies during the development process.
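
    To make the three components named above concrete, the toy sketch below maps query terms to category URIs (standing in for the text-mining pipeline), looks those URIs up in a miniature semantic index, and returns matching dataset identifiers. All terms, URIs and records here are invented for illustration and are far simpler than the thesis's actual pipeline.

# (A) enrich the query, (B) consult a semantic index, (C) return hits.
FOREST_CATEGORY = "https://example.org/terms/Forest"  # illustrative URI

# Stand-in for the text-mining pipeline: surface form -> category URI.
TERM_TO_CATEGORY = {
    "forest": FOREST_CATEGORY,
    "woodland": FOREST_CATEGORY,
}

# Stand-in for the semantic index: category URI -> dataset identifiers.
SEMANTIC_INDEX = {
    FOREST_CATEGORY: ["dataset-0042", "dataset-0108"],
}

def semantic_search(query: str) -> list[str]:
    """Expand query terms to category URIs and collect indexed datasets."""
    hits: list[str] = []
    for token in query.lower().split():
        category = TERM_TO_CATEGORY.get(token)
        if category:
            hits.extend(SEMANTIC_INDEX.get(category, []))
    return sorted(set(hits))

print(semantic_search("woodland soil samples"))  # -> ['dataset-0042', 'dataset-0108']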

    Library and Information Science Education and eScience: The Current State of ALA Accredited MLS/MLIS Programs in Preparing Librarians and Information Professionals for eScience Needs

    The purpose of this study is multifaceted: 1) to describe eScience research in a comprehensive way; 2) to help library and information specialists understand the realm of eScience research and the information needs of the community, and demonstrate the importance of LIS professionals within the eScience domain; and 3) to explore the current state of curricular content of ALA accredited MLS/MLIS programs to understand the extent to which they prepare new professionals for eScience librarianship. The literature review focuses heavily on eScientists' and other data-driven researchers' information service needs, in addition to demonstrating how and why librarians and information specialists can and should fill these service gaps and information needs within eScience research. By looking at the current curriculum of American Library Association (ALA) accredited MLS/MLIS programs, we can identify potential gaps in knowledge and where to improve in order to prepare and train new MLS/MLIS graduates to fulfill the needs of eScientists. This investigation is meant to be informative and can be used as a tool for LIS programs to assess their curricula in comparison to the needs of eScience and other data-driven and networked research. Finally, this investigation will provide the LIS profession with awareness of, and insight into, the services needed to support a thriving eScience and data-driven research community.

    Discovery and publishing of primary biodiversity data associated with multimedia resources: The Audubon Core strategies and approaches

    The Audubon Core Multimedia Resource Metadata Schema is a representation-free vocabulary for the description of biodiversity multimedia resources and collections, now in the final stages as a proposed Biodiversity Information Standards (TDWG) standard. By defining only six terms as mandatory, it seeks to lighten the burden of providing or using multimedia useful for biodiversity science. At the same time, it offers rich optional metadata terms that can help curators of multimedia collections provide authoritative media that document species occurrence, ecosystems, identification tools, ontologies, and many other kinds of biodiversity documents or data. About half of the vocabulary is re-used from other relevant controlled vocabularies that are often already in use for multimedia metadata, thereby reducing the mapping burden on existing repositories. A central design goal is to allow consuming applications to have a high likelihood of discovering suitable resources, reducing the human examination effort that might be required to decide if a resource is fit for the purpose of the application.
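
    As a rough illustration of what a lightweight multimedia description in the spirit of Audubon Core might look like, the record below mixes a few Dublin Core and Audubon Core style terms. The choice of terms and the values are illustrative assumptions only and should not be read as the standard's normative mandatory set; the TDWG documentation defines the actual terms.

# Illustrative (not normative) multimedia metadata record; term selection
# and values are examples only.
import json

media_record = {
    "dcterms:identifier": "https://example.org/media/IMG_0001",  # placeholder URI
    "dc:type": "StillImage",
    "dcterms:rights": "CC BY 4.0",
    "ac:metadataLanguage": "en",
    "ac:caption": "Adult specimen photographed in situ (illustrative example).",
    "ac:associatedSpecimenReference": "https://example.org/occurrence/42",  # placeholder
}

print(json.dumps(media_record, indent=2))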

    Producing Linked Open Dataset from Bibliographic Data with Integration of External Data Sources for Academic Libraries

    This paper focuses on the transformation of bibliographic data into linked open data (LOD) in the RDF (Resource Description Framework) triple model, with integration of external resources. Library and information centres and knowledge centres deal with various types of databases, such as bibliographic databases, full-text databases, archival databases, statistical databases, CD/DVD-ROM databases and more. Web technology is rapidly changing how such data are stored, processed, and disseminated. Semantic web technology is an advanced layer of the web platform that provides structured data on the web, which organizations and institutions can use to describe and retrieve resources and to connect users with additional information from external sources. The main objective of this paper is the transformation of library bibliographic data, based on MARC21, into RDF triples published as LOD and enriched with external LOD datasets such as OpenLibrary, VIAF, Wikidata, DBpedia and GeoNames. We propose a workflow model (Figure 1) that visualizes the detailed steps, activities and components for transforming bibliographic data into an LOD dataset. The methodology describes the various methods and steps for conducting this work. We use the open source tool OpenRefine (version 3.2), formerly known as Google Refine. OpenRefine is used for managing and organizing messy data, with features such as row and column manipulation, reconciliation, and export to formats such as XML, JSON, N-Triples and RDF. In this work, OpenRefine was used for inserting URI columns, generating links, reconciling data against external sources, and converting the source format into RDF. After conversion, the complete set of bibliographic data is available as an RDF file that constitutes the LOD dataset. This LOD dataset may further be used by organizations or institutions for their advanced bibliographic services.
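
    As a small sketch of the final transformation step in the workflow described above, the code below turns one already-extracted bibliographic record into RDF triples with rdflib and attaches a placeholder VIAF link of the kind the reconciliation step would supply. The record values and URIs are placeholders, not reconciled data, and the paper's own workflow uses OpenRefine rather than a script.

# Sketch: bibliographic fields -> RDF triples, serialized as N-Triples.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

DCTERMS = Namespace("http://purl.org/dc/terms/")
BIBO = Namespace("http://purl.org/ontology/bibo/")

record = {
    "id": "https://example.org/bib/000123",          # local record URI (placeholder)
    "title": "An Introduction to Linked Data",
    "creator_viaf": "http://viaf.org/viaf/0000000",  # placeholder VIAF URI
}

g = Graph()
work = URIRef(record["id"])
g.add((work, RDF.type, BIBO.Book))
g.add((work, DCTERMS.title, Literal(record["title"])))
g.add((work, DCTERMS.creator, URIRef(record["creator_viaf"])))

# One of the output formats mentioned above (N-Triples).
print(g.serialize(format="nt"))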