18 research outputs found

    EFFECTIVELY SEARCHING SPECIMEN AND OBSERVATION DATA WITH TOQE, THE THESAURUS OPTIMIZED QUERY EXPANDER

    Get PDF
    Today’s specimen and observation data portals lack a flexible mechanism, able to link up thesaurus-enabled data sources such as taxonomic checklist databases and expand user queries to related terms, significantly enhancing result sets. The TOQE system (Thesaurus Optimized Query Expander) is a REST-like XML web-service implemented in Python and designed for this purpose. Acting as an interface between portals and thesauri, TOQE allows the implementation of specialized portal systems with a set of thesauri supporting its specific focus. It is both easy to use for portal programmers and easy to configure for thesaurus database holders who want to expose their system as a service for query expansions. Currently, TOQE is used in four specimen and observation data portals. The documentation is available from http://search.biocase.org/toqe/

    Badgers: generating data quality deficits with Python

    Full text link
    Generating context specific data quality deficits is necessary to experimentally assess data quality of data-driven (artificial intelligence (AI) or machine learning (ML)) applications. In this paper we present badgers, an extensible open-source Python library to generate data quality deficits (outliers, imbalanced data, drift, etc.) for different modalities (tabular data, time-series, text, etc.). The documentation is accessible at https://fraunhofer-iese.github.io/badgers/ and the source code at https://github.com/Fraunhofer-IESE/badgersComment: 17 pages, 16 figure

    Enriched biodiversity data as a resource and service

    Get PDF
    Background: Recent years have seen a surge in projects that produce large volumes of structured, machine-readable biodiversity data. To make these data amenable to processing by generic, open source “data enrichment” workflows, they are increasingly being represented in a variety of standards-compliant interchange formats. Here, we report on an initiative in which software developers and taxonomists came together to address the challenges and highlight the opportunities in the enrichment of such biodiversity data by engaging in intensive, collaborative software development: The Biodiversity Data Enrichment Hackathon. Results: The hackathon brought together 37 participants (including developers and taxonomists, i.e. scientific professionals that gather, identify, name and classify species) from 10 countries: Belgium, Bulgaria, Canada, Finland, Germany, Italy, the Netherlands, New Zealand, the UK, and the US. The participants brought expertise in processing structured data, text mining, development of ontologies, digital identification keys, geographic information systems, niche modeling, natural language processing, provenance annotation, semantic integration, taxonomic name resolution, web service interfaces, workflow tools and visualisation. Most use cases and exemplar data were provided by taxonomists. One goal of the meeting was to facilitate re-use and enhancement of biodiversity knowledge by a broad range of stakeholders, such as taxonomists, systematists, ecologists, niche modelers, informaticians and ontologists. The suggested use cases resulted in nine breakout groups addressing three main themes: i) mobilising heritage biodiversity knowledge; ii) formalising and linking concepts; and iii) addressing interoperability between service platforms. 
Another goal was to further foster a community of experts in biodiversity informatics and to build human links between research projects and institutions, in response to recent calls to further such integration in this research domain. Conclusions: Beyond deriving prototype solutions for each use case, areas of inadequacy were discussed and are being pursued further. It was striking how many possible applications for biodiversity data there were and how quickly solutions could be put together when the normal constraints on collaboration were broken down for a week. Conversely, mobilising biodiversity knowledge from its silos in heritage literature and natural history collections will continue to require formalisation of the concepts (and the links between them) that define the research domain, as well as increased interoperability between the software platforms that operate on these concepts.

    A common, automated, pre-publication registration model for higher plants (International Plant Names Index, IPNI), fungi (Index Fungorum, MycoBank) and animals (ZooBank)

    No full text
    <p>A common, automated, pre-publication registration model for higher plants (International Plant Names Index, IPNI), fungi (Index Fungorum, MycoBank) and animals (ZooBank) [pdf, 650 KB]</p> <p>Use the reference from http://dx.doi.org/10.6084/m9.figshare.784947 *only*</p>

    Interoperability model between PLAZI and the CDM Platform

    No full text
    <p>Interoperability model between PLAZI and the CDM Platform.</p>

    Methods used in the development of Common Data Models for health data – A Scoping Review Protocol

    No full text
    Common Data Models (CDMs) are essential tools for data harmonization, which can lead to significant improvements in healthcare. CDMs harmonize data from disparate sources and ease collaboration across institutions, which leads to the generation of larger standardized data repositories across different entities. This Scoping Review (Sc-R) on methods used in the development of CDMs for healthcare aims to obtain a broad overview of the approaches used in developing CDMs, i.e., Common Data Elements (CDEs) or Common Data Sets (CDS), for different disease domains at an international level. To obtain an overview of the state of the art, the literature databases PubMed, Web of Science, Science Direct, and Scopus are searched for publications from the last five years, starting from 2017, with associated keywords. The included articles will be evaluated methodically, and a list of the different types of methods will be created. The methods will then be categorized into groups.

    Tracking biogeographical change from its footprints in botanical literature

    No full text
    <p>Early results from an investigation into the usefulness of botanical literature to provide historical information on the distributions of plants. Based upon the case of <em>Chenopodium vulvaria</em>, a small weed of waste places.</p>

    B-HIT - A Tool for Harvesting and Indexing Biodiversity Data.

    Get PDF
    With the rapidly growing number of data publishers, the process of harvesting and indexing information to offer advanced search and discovery becomes a critical bottleneck in globally distributed primary biodiversity data infrastructures. The Global Biodiversity Information Facility (GBIF) implemented a Harvesting and Indexing Toolkit (HIT), which largely automates data harvesting activities for hundreds of collection and observational data providers. The team of the Botanic Garden and Botanical Museum Berlin-Dahlem has extended this well-established system with a range of additional functions, including improved processing of multiple taxon identifications, the ability to represent associations between specimen and observation units, new data quality control and new reporting capabilities. The open source software B-HIT can be freely installed and used for setting up thematic networks serving the demands of particular user groups
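    The harvest-and-index cycle such a toolkit automates can be sketched in a few lines. The fetcher, endpoint and record fields below are hypothetical stand-ins, not B-HIT's actual interfaces; a real deployment would page through provider responses in BioCASe or a similar protocol:

```python
def harvest(providers, fetch):
    """Pull record batches from each provider URL and build a simple
    index keyed by scientific name."""
    index = {}
    for url in providers:
        for record in fetch(url):
            name = record.get("scientificName")
            if not name:  # basic quality control: skip unnamed records
                continue
            index.setdefault(name.strip(), []).append(record)
    return index

# Stand-in fetcher returning canned records for illustration.
def fake_fetch(url):
    return [{"scientificName": "Chenopodium vulvaria", "source": url},
            {"scientificName": None, "source": url}]

idx = harvest(["https://example.org/provider1"], fake_fetch)
```

    The index built this way is what makes advanced search and discovery fast, since queries no longer have to contact each distributed provider at search time.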

    HIV-PDI: A Protein Drug Interaction Resource for Structural Analyses of HIV Drug Resistance: 2. Examples of Use and Proof-of-Concept

    No full text
    The HIV-PDI resource was designed and implemented to address the problem of drug resistance, with a central focus on the 3D structure of the target-drug interaction. Clinical and biological data, structural and physico-chemical information and 3D interaction data concerning the targets (HIV protease) and the drugs (ARVs) were meticulously integrated and combined with tools dedicated to studying HIV mutations and their consequences for the efficacy of drugs. Here, the capabilities of the HIV-PDI resource are demonstrated for several different scenarios, ranging from retrieving information associated with patients to analyzing structural data relating cognate proteins and ligands. HIV-PDI allows such diverse data to be correlated, especially data linking antiretroviral drug (ARV) resistance to a given treatment with changes in three-dimensional interactions between a drug molecule and the mutated protease. Our work is based on the assumption that ARV resistance results from a loss of affinity between the mutated HIV protease and a drug molecule due to subtle changes in the nature of the protein-ligand interaction. Therefore, a set of patients whose resistance to first-line treatment was corrected by a second-line treatment was selected from the HIV-PDI database for detailed study, and several queries regarding these patients were processed via its graphical user interface. Considering the protease mutations found in the selected set of patients, our retrospective analysis was able to establish in most cases that the first-line treatment was not suitable, and it predicted a second-line treatment which agreed perfectly with the clinician's prescription. The present study demonstrates the capabilities of HIV-PDI. We anticipate that this decision support tool will help clinicians and researchers find suitable HIV treatments for individual patients. The HIV-PDI database is thereby a useful data collection system that allows interpretation on the basis of all available information, thus supporting decision-making.