228 research outputs found

Federated Query Processing over Heterogeneous Data Sources in a Semantic Data Lake

Author: Endris Kemele M.
Publication venue: Universitäts- und Landesbibliothek Bonn

Data provides the basis for emerging scientific and interdisciplinary data-centric applications with the potential of improving the quality of life for citizens. Big Data plays an important role in promoting both manufacturing and scientific development through industrial digitization and emerging interdisciplinary research. Open data initiatives have encouraged the publication of Big Data by exploiting the decentralized nature of the Web, allowing for the availability of heterogeneous data generated and maintained by autonomous data providers. Consequently, the growing volume of data consumed by different applications raises the need for effective data integration approaches able to process a large volume of data that is represented in different formats, schemas, and models, and which may also include sensitive data, e.g., financial transactions, medical procedures, or personal data. Data Lakes are composed of heterogeneous data sources kept in their original format, which reduces the overhead of materialized data integration. Query processing over Data Lakes requires a semantic description of the data collected from heterogeneous data sources. A Data Lake with such semantic annotations is referred to as a Semantic Data Lake. Transforming Big Data into actionable knowledge demands novel and scalable techniques for enabling not only Big Data ingestion and curation to the Semantic Data Lake, but also efficient large-scale semantic data integration, exploration, and discovery. Federated query processing techniques use source descriptions to find relevant data sources and efficient execution plans that minimize the total execution time and maximize the completeness of answers. Existing federated query processing engines employ a coarse-grained description model where the semantics encoded in data sources are ignored.
Such descriptions may lead to the erroneous selection of data sources for a query and unnecessary retrieval of data, thus affecting the performance of the query processing engine. In this thesis, we address the problem of federated query processing against heterogeneous data sources in a Semantic Data Lake. First, we tackle the challenge of knowledge representation and propose a novel source description model, RDF Molecule Templates, that describes the knowledge available in a Semantic Data Lake. RDF Molecule Templates (RDF-MTs) describe data sources in terms of an abstract description of entities belonging to the same semantic concept. Then, we propose a technique for data source selection and query decomposition, the MULDER approach, and query planning and optimization techniques, Ontario, that exploit the characteristics of heterogeneous data sources described using RDF-MTs and provide uniform access to heterogeneous data sources. We then address the challenge of enforcing privacy and access control requirements imposed by data providers. We introduce a privacy-aware federated query technique, BOUNCER, able to enforce privacy and access control regulations during query processing over data sources in a Semantic Data Lake. In particular, BOUNCER exploits RDF-MT-based source descriptions to express privacy and access control policies and to enforce them automatically during source selection, query decomposition, and planning. Furthermore, BOUNCER implements query decomposition and optimization techniques able to identify query plans over data sources that not only contain the relevant entities to answer a query, but also are regulated by policies that allow for accessing these relevant entities. Finally, we tackle the problem of interest-based update propagation and co-evolution of data sources.
We present a novel approach for interest-based RDF update propagation that consistently maintains a full or partial replication of large datasets and deals with co-evolution.

bonndoc – Der Publikationsserver der Universität Bonn

Federated Query Processing

Author: Endris Kemele M.
Graux Damien
Vidal Maria-Esther
Publication venue: Cham : Springer
Publication date: 01/01/2020

Big data plays a relevant role in promoting both manufacturing and scientific development through industrial digitization and emerging interdisciplinary research. Semantic web technologies have also experienced great progress, and scientific communities and practitioners have contributed to the problem of big data management with ontological models, controlled vocabularies, linked datasets, data models, query languages, as well as tools for transforming big data into knowledge from which decisions can be made. Despite the significant impact of big data and semantic web technologies, we are entering into a new era where domains like genomics are projected to grow very rapidly in the next decade. In this next era, integrating big data demands novel and scalable tools for enabling not only big data ingestion and curation but also efficient large-scale exploration and discovery. Federated query processing techniques provide a solution to scale up to large volumes of data distributed across multiple data sources. Federated query processing techniques resort to source descriptions to identify relevant data sources for a query, as well as to find efficient execution plans that minimize the total execution time of a query and maximize the completeness of the answers. This chapter summarizes the main characteristics of a federated query engine, reviews the current state of the field, and outlines the problems that still remain open and represent grand challenges for the area

Repositorium für Naturwissenschaften und Technik

SPARQL Query Result Explanation for Linked Data

Author: Endris Kemele M.
Gandon Fabien
Hasan Rakebul
Publication venue: HAL CCSD
Publication date: 19/10/2014

In this paper, we present an approach to explain SPARQL query results for Linked Data using why-provenance. We present a non-annotation-based algorithm to generate why-provenance and show its feasibility for Linked Data. We present an explanation-aware federated query processor prototype and show the presentation of our explanations. We present a user study to evaluate the impacts of our explanations. Our study shows that our query result explanations are helpful for end users to understand the result derivations and make trust judgments on the results.

INRIA a CCSD electronic archive server

HAL Descartes

HAL-Rennes 1

Defining the concept of ‘tick repellency’ in veterinary medicine

Author: A. JOACHIM
A. S. BOWMAN
A. SAINZ
B. CHOMEL
D. OTRANTO
Dryden
Endris
F. BEUGNET
F. JONGEJAN
G. BANETH
H. INOKUMA
Ian
J. GUILLOT
K. PFISTER
L. HALOS
M. FRANC
M. POLLMEIER
R. FARKAS
R. KAUFMAN
R. WALL
Wege
Publication venue: Cambridge University Press
Publication date: 01/01/2012

Although widely used, the term repellency needs to be employed with care when applied to ticks and other periodic or permanent ectoparasites. Repellency has classically been used to describe the effects of a substance that causes a flying arthropod to make oriented movements away from its source. However, for crawling arthropods such as ticks, the term commonly subsumes a range of effects that include arthropod irritation and consequent avoiding or leaving the host, failing to attach, to bite, or to feed. The objective of the present article is to highlight the need for clarity, to propose consensus descriptions and methods for the evaluation of various effects on ticks caused by chemical substances

Crossref

PubMed Central

Archivio istituzionale della ricerca - Università di Bari

Explore Bristol Research

Preface

Author: Chaves-Fraga
Comerio Marco
David Colpaert Pieter
Endris Kemele M.
Kaffee Lucie-Aimee
Sadeghi Mersedeh
Vidal Maria-Esther
Publication venue: Aachen, Germany : RWTH Aachen
Publication date: 01/01/2019

This volume presents the proceedings of the 1st International Workshop on Approaches for Making Data Interoperable (AMAR 2019) and the 1st International Workshop on Semantics for Transport (Sem4Tra), held in Karlsruhe, Germany, on September 9, 2019, co-located with SEMANTiCS 2019. Interoperability of data is an important factor in making transportation data accessible; therefore, we present the topics alongside each other in these proceedings.

Repositorium für Naturwissenschaften und Technik

Atmospheric and oceanic conditions associated with early and late onset for Eastern Africa short rains

Author: Artan G
Atheru Z
Endris HS
Gudoshava M
Hirons L
Segele ZT
Wainwright C
Woolnough S
Publication venue: Wiley
Publication date: 17/03/2022

Timing of the rainy season is essential for a number of climate-sensitive sectors over Eastern Africa. This is particularly true for the agricultural sector, where most activities depend on both the spatial and temporal distribution of rainfall throughout the season. Using a combination of observational and reanalysis datasets, the present study investigates the atmospheric and oceanic conditions associated with early and late onset for the Eastern Africa short rains season (October–December). Our results indicate enhanced rainfall in October and November during years with early onset and rainfall deficit in years with late onset for the same months. Early onset years are found to be associated with warmer sea surface temperatures (SSTs) in the western Indian Ocean, and an enhanced moisture flux and anomalous low-level flow into Eastern Africa from as early as the first dekad of September. The late onset years are characterized by cooler SSTs in the western Indian Ocean, anomalous westerly moisture flux and zonal flow limiting moisture supply to the region. The variability in onset date is separated into the interannual and decadal components, and the links with SSTs and low-level circulation over the Indian Ocean basin are examined separately for both timescales. Significant correlations are found between the interannual variability of the onset and the Indian Ocean dipole mode index. On decadal timescales the onset is shown to be partly driven by the variability of the SSTs over the Indian Ocean. Understanding the influence of these potentially predictable SST and moisture patterns on onset variability has huge potential to improve forecasts of the East African short rains. Improved prediction of the variability of the rainy season onset has huge implications for improving key strategic decisions and preparedness action in many sectors, including agriculture.

Online Research @ Cardiff

Spiral - Imperial College Digital Repository

Systemically and cutaneously distributed ectoparasiticides: a review of the efficacy against ticks and fleas on dogs

Author: A Buczek
AA Marchiondo
B Fankhauser
BL Blagburn
C Epe
C McMahon
C Meadows
C Navarro
C Wengenmayer
C Zhao
D Otranto
E Ferroglio
E Liénard
F Beugnet
FM Walther
GE Bast
H Williams
IG Horak
Institute of Medicine
J Lüssenhop
J Taenzler
JA Lorenz
JA Spencer
JE Casida
JJ Fourie
K Hellmann
Kurt Pfister
L Cardoso
L Halos
LJ Lawrence
M Asahi
M Elliott
M Franc
M Gassel
M Varloud
MK Faulde
MW Dryden
N Achee
N Rohdich
NL Achee
P Deplazes
P Dumont
P Fisara
P García-Reynaga
PV Shah
R Gupta
RG Endris
Rob Armstrong
S Bonneau
S Kilp
SM Ghiasuddin
T Kröber
T Schnieder
TB Coles
TJ Shafer
W Löscher
WL Nicholson
Y Ozoe
Publication venue: Springer Science and Business Media LLC

Crossref

Genomic Characterization of Cholangiocarcinoma in Primary Sclerosing Cholangitis Reveals Therapeutic Opportunities

Background and Aims: Lifetime risk of biliary tract cancer (BTC) in primary sclerosing cholangitis (PSC) may exceed 20%, and BTC is currently the leading cause of death in patients with PSC. To open new avenues for management, we aimed to delineate clinically relevant genomic and pathological features of a large panel of PSC-associated BTC (PSC-BTC). Approach and Results: We analyzed formalin-fixed, paraffin-embedded tumor tissue from 186 patients with PSC-BTC from 11 centers in eight countries with all anatomical locations included. We performed tumor DNA sequencing at 42 clinically relevant genetic loci to detect mutations, translocations, and copy number variations, along with histomorphological and immunohistochemical characterization. Regardless of the anatomical localization, PSC-BTC exhibited a uniform molecular and histological characteristic similar to extrahepatic cholangiocarcinoma. We detected a high frequency of genomic alterations typical of extrahepatic cholangiocarcinoma, such as TP53 (35.5%), KRAS (28.0%), CDKN2A (14.5%), and SMAD4 (11.3%), as well as potentially druggable mutations (e.g., HER2/ERBB2). We found a high frequency of nontypical/nonductal histomorphological subtypes (55.2%) and of the usually rare BTC precursor lesion, intraductal papillary neoplasia (18.3%). Conclusions: Genomic alterations in PSC-BTC include a significant number of putative actionable therapeutic targets. Notably, PSC-BTC shows a distinct extrahepatic morpho-molecular phenotype, independent of the anatomical location of the tumor. These findings advance our understanding of PSC-associated cholangiocarcinogenesis and provide strong incentives for clinical trials to test genome-based personalized treatment strategies in PSC-BTC.

Crossref

Helsingin yliopiston digitaalinen arkisto

NORA - Norwegian Open Research Archives