1,199 research outputs found
Recommending Datasets for Scientific Problem Descriptions
The steadily rising number of datasets is making it increasingly difficult for researchers and practitioners to be aware of all datasets, particularly of the most relevant datasets for a given research problem. To this end, dataset search engines have been proposed. However, they are based on users' keywords and, thus, have difficulty determining precisely fitting datasets for complex research problems. In this paper, we propose a system that recommends suitable datasets based on a given research problem description. The recommendation task is designed as a domain-specific text classification task. As shown in a comprehensive offline evaluation using various state-of-the-art models, as well as 88,000 paper abstracts and 265,000 citation contexts as research problem descriptions, we obtain an F1-score of 0.75. In an additional user study, we show that users in real-world settings report a satisfaction rate of 88% across all test cases. We therefore see promising future directions for dataset recommendation.
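The abstract frames dataset recommendation as a domain-specific text classification problem. As a rough illustration of that framing only (not the authors' actual models or data), the following sketch trains a simple TF-IDF plus logistic-regression classifier that maps a research problem description to a dataset label; all descriptions and labels are invented examples.

```python
# Minimal sketch of dataset recommendation framed as text classification
# (a plain TF-IDF + linear-model baseline; the paper's actual
# state-of-the-art models and training data are not reproduced here).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: problem descriptions labelled with dataset names.
problem_descriptions = [
    "entity linking of mentions in news articles to a knowledge base",
    "question answering over structured tabular data",
    "citation recommendation for scientific manuscripts",
]
dataset_labels = ["AIDA-CoNLL", "WikiTableQuestions", "unarXive"]

classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)
classifier.fit(problem_descriptions, dataset_labels)

# Recommend a dataset for a new research problem description.
query = "linking named entities in web text to Wikipedia"
print(classifier.predict([query])[0])
```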
Which Publications' Metadata Are in Which Bibliographic Databases? A System for Exploration
The choice of databases containing publications' metadata (i.e., bibliographic databases) determines the available publication list of any author and, thus, their public appearance and evaluation. Having all publications listed in the various bibliographic databases is therefore important for researchers. However, the average number of publications a researcher publishes per year is steadily rising, making it labor-intensive and time-consuming for authors to investigate whether all their publications are listed in all bibliographic databases online. In this paper, we present RefBee, an online system that retrieves the metadata of all publications for a given author from the various bibliographic databases and indicates which publications are missing in which database. Our system is available online at http://refbee.org/ and supports Wikidata, ORCID, Google Scholar, VIAF, DBLP, Dimensions, Microsoft Academic, Semantic Scholar, and DNB/GNB. Our system not only serves as an assistance tool for more than 4.7 million researchers of any discipline and publication language, but also incentivizes the usage and population of Wikidata in the scholarly field.
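The abstract does not describe RefBee's retrieval internals, but one of the listed sources, Wikidata, exposes a public SPARQL endpoint. The sketch below shows how publications of an author could be looked up there by ORCID iD; the ORCID value is a placeholder and the query is only an illustration of such a lookup, not RefBee's code.

```python
# Illustrative only: fetch publications of an author from Wikidata, one of the
# bibliographic sources listed in the abstract. RefBee's actual retrieval and
# matching logic is not described there; this merely shows one such lookup.
import requests

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
ORCID = "0000-0002-1825-0097"  # placeholder ORCID iD

query = """
SELECT ?work ?title WHERE {
  ?author wdt:P496 "%s" .        # P496 = ORCID iD
  ?work   wdt:P50  ?author .     # P50  = author
  ?work   wdt:P1476 ?title .     # P1476 = title
}
""" % ORCID

response = requests.get(
    SPARQL_ENDPOINT,
    params={"query": query, "format": "json"},
    headers={"User-Agent": "refbee-sketch/0.1"},
    timeout=30,
)
for row in response.json()["results"]["bindings"]:
    print(row["work"]["value"], "-", row["title"]["value"])
```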
Applied tracers for the observation of subsurface stormflow at the hillslope scale
Rainfall-runoff response in temperate humid headwater catchments is mainly controlled by hydrological processes at the hillslope scale. Applied tracer experiments with fluorescent dye and salt tracers are well known tools in groundwater studies at the large scale and vadose zone studies at the plot scale, where they provide a means to characterise subsurface flow. We extend this approach to the hillslope scale to investigate saturated and unsaturated flow paths concertedly at a forested hillslope in the Austrian Alps. Dye staining experiments at the plot scale revealed that cracks and soil pipes function as preferential flow paths in the fine-textured soils of the study area, and these preferential flow structures were active in fast subsurface transport of tracers at the hillslope scale. Breakthrough curves obtained under steady flow conditions could be fitted well to a one-dimensional convection-dispersion model. Under natural rainfall a positive correlation of tracer concentrations to the transient flows was observed. The results of this study demonstrate qualitative and quantitative effects of preferential flow features on subsurface stormflow in a temperate humid headwater catchment. It turns out that, at the hillslope scale, the interactions of structures and processes are intrinsically complex, which implies that attempts to model such a hillslope satisfactorily require detailed investigations of effective structures and parameters at the scale of interest.
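The abstract reports fitting breakthrough curves to a one-dimensional convection-dispersion model. As a hedged sketch of what such a fit can look like, the following code fits the standard analytical solution for an instantaneous injection to a synthetic breakthrough curve with SciPy; the distance, times, and parameter values are invented and are not the study's data.

```python
# A minimal sketch of fitting a tracer breakthrough curve to the 1-D
# convection-dispersion equation for an instantaneous injection.
# Numbers are synthetic; the study's data and fitting code are not shown here.
import numpy as np
from scipy.optimize import curve_fit

X = 20.0  # assumed observation distance downslope of the injection [m]

def cde_pulse(t, M, v, D):
    """Concentration at distance X for an instantaneous 1-D injection:
    C(t) = M / sqrt(4*pi*D*t) * exp(-(X - v*t)**2 / (4*D*t)),
    with M the injected mass per unit cross-sectional area."""
    t = np.asarray(t, dtype=float)
    return M / np.sqrt(4 * np.pi * D * t) * np.exp(-(X - v * t) ** 2 / (4 * D * t))

# Synthetic breakthrough curve (time in hours, concentration in mg/l).
t_obs = np.array([1, 2, 3, 4, 5, 6, 8, 10, 12, 16], dtype=float)
c_obs = cde_pulse(t_obs, M=50.0, v=4.0, D=6.0)
c_obs += np.random.default_rng(0).normal(0.0, 0.02, c_obs.size)

params, _ = curve_fit(cde_pulse, t_obs, c_obs,
                      p0=[30.0, 3.0, 3.0], bounds=(0, np.inf))
print("fitted M, v, D:", params)
```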
Towards Scalable Real-time Analytics: An Architecture for Scale-out of OLxP Workloads
We present an overview of our work on the SAP HANA Scale-out Extension, a novel distributed database architecture designed to support large-scale analytics over real-time data. This platform permits high-performance OLAP with massive scale-out capabilities while concurrently allowing OLTP workloads. This dual capability enables analytics over real-time changing data and allows fine-grained, user-specified service level agreements (SLAs) on data freshness. We advocate the decoupling of core database components such as query processing, concurrency control, and persistence, a design choice made possible by advances in high-throughput, low-latency networks and storage devices. We provide full ACID guarantees and build on a logical timestamp mechanism to provide MVCC-based snapshot isolation without requiring synchronous updates of replicas. Instead, we use asynchronous update propagation, guaranteeing consistency through timestamp validation. We provide a view into the design and development of a large-scale data management platform for real-time analytics, driven by the needs of modern enterprise customers.
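Two of the ideas mentioned, timestamp-based MVCC snapshot isolation and freshness guarantees over asynchronously updated replicas, can be illustrated with a toy visibility and lag check. The sketch below is a simplification for intuition only and does not reflect SAP HANA's actual implementation.

```python
# Toy sketch: MVCC snapshot visibility via logical commit timestamps, and a
# data-freshness SLA check on an asynchronously updated replica.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RowVersion:
    value: str
    created_ts: int              # logical commit timestamp of the creating txn
    deleted_ts: Optional[int]    # logical commit timestamp of the deleting txn

def visible(version: RowVersion, snapshot_ts: int) -> bool:
    """A version is visible to a snapshot if it was committed at or before the
    snapshot timestamp and not yet deleted at that timestamp."""
    created = version.created_ts <= snapshot_ts
    not_deleted = version.deleted_ts is None or version.deleted_ts > snapshot_ts
    return created and not_deleted

def replica_can_serve(replica_applied_ts: int, primary_commit_ts: int,
                      max_lag: int) -> bool:
    """Freshness SLA: the replica may answer a query only if it has applied
    all updates up to within `max_lag` logical ticks of the primary."""
    return primary_commit_ts - replica_applied_ts <= max_lag

versions = [RowVersion("v1", created_ts=10, deleted_ts=40),
            RowVersion("v2", created_ts=40, deleted_ts=None)]
print([v.value for v in versions if visible(v, snapshot_ts=25)])             # ['v1']
print(replica_can_serve(replica_applied_ts=95, primary_commit_ts=100, max_lag=10))  # True
```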
The OpenCitations Data Model
A variety of schemas and ontologies are currently used for the machine-readable description of bibliographic entities and citations. This diversity, and the reuse of the same ontology terms with different nuances, generates inconsistencies in data. Adoption of a single data model would facilitate data integration tasks regardless of the data supplier or context application. In this paper we present the OpenCitations Data Model (OCDM), a generic data model for describing bibliographic entities and citations, developed using Semantic Web technologies. We also evaluate the effective reusability of OCDM according to ontology evaluation practices, mention existing users of OCDM, and discuss the use and impact of OCDM in the wider open science community.
Comment: ISWC 2020 conference proceedings
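As a rough illustration of the kind of Semantic Web description the abstract refers to, the sketch below expresses a citation between two bibliographic entities with rdflib, using the SPAR ontologies (FaBiO and CiTO) on which OCDM builds. The identifiers are placeholders, and the exact classes and properties prescribed by OCDM should be taken from the model's documentation.

```python
# Illustrative RDF description of two bibliographic entities and a citation
# between them, using SPAR namespaces (FaBiO, CiTO). This is an approximation,
# not the normative OCDM shapes.
from rdflib import Graph, Namespace, URIRef, Literal, RDF

FABIO = Namespace("http://purl.org/spar/fabio/")
CITO = Namespace("http://purl.org/spar/cito/")
DCTERMS = Namespace("http://purl.org/dc/terms/")

g = Graph()
g.bind("fabio", FABIO)
g.bind("cito", CITO)
g.bind("dcterms", DCTERMS)

citing = URIRef("https://example.org/br/1")   # hypothetical identifiers
cited = URIRef("https://example.org/br/2")

g.add((citing, RDF.type, FABIO.JournalArticle))
g.add((citing, DCTERMS.title, Literal("The OpenCitations Data Model")))
g.add((cited, RDF.type, FABIO.JournalArticle))
g.add((citing, CITO.cites, cited))

print(g.serialize(format="turtle"))
```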
ProofWatch: Watchlist Guidance for Large Theories in E
A watchlist (also hint list) is a mechanism that allows related proofs to guide a proof search for a new conjecture. This mechanism has been used with the Otter and Prover9 theorem provers, both for interactive formalizations and for human-assisted proving of open conjectures in small theories. In this work we explore the use of watchlists in large theories coming from first-order translations of large ITP libraries, aiming at improving hammer-style automation by smarter internal guidance of the ATP systems. In particular, we (i) design watchlist-based clause evaluation heuristics inside the E ATP system, and (ii) develop new proof-guiding algorithms that load many previous proofs inside the ATP and focus the proof search using a dynamically updated notion of proof matching. The methods are evaluated on a large set of problems coming from the Mizar library, showing significant improvement over E's standard portfolio of strategies, and also over the previous best set of strategies invented for Mizar by evolutionary methods.
Comment: 19 pages, 10 tables, submitted to ITP 2018 at FLoC
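The core idea of watchlist guidance can be illustrated with a toy clause-evaluation function: clauses generated during the search receive a better (lower) weight when they match clauses harvested from previous proofs. The sketch below uses plain string equality instead of the subsumption checks and per-proof completion ratios used in E, so it is only an approximation of the technique.

```python
# Toy illustration of watchlist-based clause guidance: generated clauses are
# ranked more favourably when they occur on a watchlist of clauses taken from
# previously found proofs. Real systems match by clause subsumption; plain
# string equality is used here only to keep the sketch short.
watchlist = {                      # clauses harvested from related proofs
    "subset(X,X)",
    "member(X,singleton(X))",
    "subset(empty_set,X)",
}

def clause_weight(clause: str, base_weight: float) -> float:
    """Prefer clauses that occur on the watchlist by scaling their weight down."""
    return base_weight * 0.1 if clause in watchlist else base_weight

generated = [("subset(X,X)", 12.0), ("member(Y,union(A,B))", 9.0)]
ranked = sorted(generated, key=lambda cw: clause_weight(cw[0], cw[1]))
print(ranked[0][0])   # the watchlist clause is selected first
```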
Conservation of core complex subunits shaped the structure and function of photosystem I in the secondary endosymbiont alga Nannochloropsis gaditana
Photosystem I (PSI) is a pigment-protein complex catalyzing the light-driven electron transport from plastocyanin to ferredoxin in oxygenic photosynthetic organisms. Several PSI subunits are highly conserved in cyanobacteria, algae and plants, whereas others are distributed differentially among the various organisms. Here we characterized the structural and functional properties of PSI purified from the heterokont alga Nannochloropsis gaditana, showing that it is organized as a supercomplex including a core complex and an outer antenna, as in plants and other eukaryotic algae. In contrast to all other known organisms, the N. gaditana PSI supercomplex contains five peripheral antenna proteins, identified by proteome analysis as type-R light-harvesting complexes (LHCr4-8). Two antenna subunits are bound in a conserved position, as in plant PSI, whereas three additional antennae are associated with the core on the other side. This peculiar antenna association correlates with the presence of PsaF/J and the absence of PsaH, G and K in the N. gaditana genome and proteome. Excitation energy transfer in the supercomplex is highly efficient, leading to a very high trapping efficiency, as observed in all other eukaryotic PSI complexes, showing that although the supramolecular organization of PSI changed during evolution, fundamental functional properties such as trapping efficiency were maintained.
Canonicalizing Knowledge Base Literals
Ontology-based knowledge bases (KBs) like DBpedia are very valuable resources, but their usefulness and usability are limited by various quality issues. One such issue is the use of string literals instead of semantically typed entities. In this paper we study the automated canonicalization of such literals, i.e., replacing a literal with an existing entity from the KB or with a new entity that is typed using classes from the KB. We propose a framework that combines both reasoning and machine learning in order to predict the relevant entities and types, and we evaluate this framework against state-of-the-art baselines for both semantic typing and entity matching.
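As a rough, hedged illustration of the canonicalization task itself (not the paper's reasoning-plus-machine-learning framework), the sketch below matches a string literal to an existing KB entity by label similarity and otherwise mints a new, typed entity; the KB contents, prefixes, and threshold are invented.

```python
# Highly simplified sketch of literal canonicalization: match a string literal
# to an existing KB entity by label similarity; if no candidate is good enough,
# mint a new entity with a (here hard-coded) fallback type. The paper combines
# reasoning and machine learning for these steps; this toy uses string
# similarity only.
from difflib import SequenceMatcher

kb_entities = {                 # hypothetical KB: label -> (entity IRI, type)
    "Berlin": ("ex:Berlin", "ex:City"),
    "Barack Obama": ("ex:Barack_Obama", "ex:Person"),
}

def canonicalize(literal: str, threshold: float = 0.8):
    best_label, best_score = None, 0.0
    for label in kb_entities:
        score = SequenceMatcher(None, literal.lower(), label.lower()).ratio()
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        return kb_entities[best_label]                        # reuse existing entity
    return (f"ex:{literal.replace(' ', '_')}", "ex:Thing")    # new typed entity

print(canonicalize("berlin"))         # ('ex:Berlin', 'ex:City')
print(canonicalize("Mount Everest"))  # ('ex:Mount_Everest', 'ex:Thing')
```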
Requirements Analysis for an Open Research Knowledge Graph
Current science communication has a number of drawbacks and bottlenecks which have been the subject of discussion lately: among others, the rising number of published articles makes it nearly impossible to get an overview of the state of the art in a certain field, and reproducibility is hampered by fixed-length, document-based publications which normally cannot cover all details of a research work. Recently, several initiatives have proposed knowledge graphs (KGs) for organising scientific information as a solution to many of the current issues. The focus of these proposals is, however, usually restricted to very specific use cases. In this paper, we aim to transcend this limited perspective by presenting a comprehensive analysis of requirements for an Open Research Knowledge Graph (ORKG) by (a) collecting daily core tasks of a scientist, (b) establishing their consequential requirements for a KG-based system, and (c) identifying overlaps and specificities, and their coverage in current solutions. As a result, we map necessary and desirable requirements for successful KG-based science communication, derive implications, and outline possible solutions.
Comment: Accepted for publication in the 24th International Conference on Theory and Practice of Digital Libraries, TPDL 2020
- …