1,758 research outputs found

Learning-based SPARQL query performance modeling and prediction

Author: A Rajaraman
A Smola
C Chang
DD Lee
G James
H Hotelling
I Jolliffe
J Li
J Pėrez
Kerry Taylor
Lina Yao
NS Altman
Quan Z. Sheng
Wei Emma Zhang
X Wu
Yongrui Qin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/10/2017

One of the challenges of managing an RDF database is predicting the performance of SPARQL queries before they are executed. Performance characteristics, such as execution time and memory usage, can help data consumers identify unexpectedly long-running queries before they start and estimate the system workload for query scheduling. Extensive work addresses this performance prediction problem for traditional SQL queries, but it is not directly applicable to SPARQL queries. In this paper, we adopt machine learning techniques to predict the performance of SPARQL queries. Our work focuses on modeling the features of a SPARQL query as a vector representation. Our feature modeling method does not depend on knowledge of the underlying systems or the structure of the underlying data, but only on the nature of SPARQL queries. We then use these features to train prediction models. We propose a two-step prediction process and consider performance in both cold and warm stages. Evaluations are performed on real-world SPARQL queries whose execution times range from milliseconds to hours. The results demonstrate that the proposed approach can effectively predict SPARQL query performance and outperforms state-of-the-art approaches.
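As a rough illustration of what turning a SPARQL query into a vector using only the query text might look like, here is a minimal Python sketch. The feature set (counts of a few keywords plus the number of distinct variables) and the names `KEYWORDS` and `query_to_vector` are assumptions made for this example; they are not the features or code used in the paper.

```python
import re

# Hypothetical feature set chosen for illustration only; the paper's actual
# features are not given in the abstract.
KEYWORDS = ["SELECT", "OPTIONAL", "UNION", "FILTER", "ORDER BY", "LIMIT"]


def query_to_vector(query):
    """Map a SPARQL query string to a fixed-length list of counts.

    The vector holds one count per entry of KEYWORDS, followed by the number
    of distinct variables. Only the query text is inspected: no knowledge of
    the store or the data is needed. Keywords that happen to appear inside
    string literals or IRIs would also be counted, so this is only a sketch.
    """
    text = query.upper()
    counts = []
    for keyword in KEYWORDS:
        # Allow arbitrary whitespace inside multi-word keywords such as ORDER BY.
        pattern = r"\b" + keyword.replace(" ", r"\s+") + r"\b"
        counts.append(len(re.findall(pattern, text)))
    # Variables are case-sensitive, so count them on the original text.
    variables = set(re.findall(r"\?\w+", query))
    counts.append(len(variables))
    return counts


example = """
SELECT ?name ?mbox WHERE {
  ?p foaf:name ?name .
  OPTIONAL { ?p foaf:mbox ?mbox }
  FILTER (?name != "x")
} LIMIT 10
"""

vector = query_to_vector(example)
print(vector)  # [1, 1, 0, 1, 0, 1, 3]
```

Vectors of this kind could then be paired with measured execution times and passed to any regression model; which model and which features work best is a question the paper itself evaluates.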

Crossref

Adelaide Research & Scholarship

University of Huddersfield Repository

Huddersfield Research Portal

Ontology of core data mining entities

Author: A Bernstein
A Golbraikh
A Karalic
B Smith
C Silla
C Vens
D Demšar
D Kocev
D Qi
D Young
DJ Hand
F Serban
G Madjarov
G Tsoumakas
GH Bakir
H Mannila
HP Kriegel
I Slavkov
J Vanschoren
K Button
Larisa Soldatova
LN Soldatova
M Courtot
M Ford
M Žáková
MA Avery
MF López
O Spjuth
P Robinson
Panče Panov
Q Yang
R Caruana
R Guha
RD King
RR Brinkman
Sašo Džeroski
T Dietterich
V Podpečan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/07/2014

In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer. It provides a representational framework for describing the mining of structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications and use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend.

Crossref

Brunel University Research Archive

SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting

Author: Cai Ling
Janowicz Krzysztof
Lao Ni
Mai Gengchen
Regalia Blake
Shi Meilin
Yan Bo
Zhu Rui
Publication venue: 'Wiley'
Publication date: 25/04/2020

Learning knowledge graph (KG) embeddings is an emerging technique for a variety of downstream tasks such as summarization, link prediction, information retrieval, and question answering. However, most existing KG embedding models neglect space and, therefore, do not perform well when applied to (geo)spatial data and tasks. Most of the models that do consider space rely primarily on some notion of distance. These models suffer from higher computational complexity during training while still losing information beyond the relative distance between entities. In this work, we propose a location-aware KG embedding model called SE-KGE. It directly encodes spatial information such as point coordinates or bounding boxes of geographic entities into the KG embedding space. The resulting model is capable of handling different types of spatial reasoning. We also construct a geographic knowledge graph as well as a set of geographic query-answer pairs called DBGeo to evaluate the performance of SE-KGE in comparison to multiple baselines. Evaluation results show that SE-KGE outperforms these baselines on the DBGeo dataset for the geographic logic query answering task. This demonstrates the effectiveness of our spatially-explicit model and the importance of considering the scale of different geographic entities. Finally, we introduce a novel downstream task called spatial semantic lifting, which links an arbitrary location in the study area to entities in the KG via some relations. Evaluation on DBGeo shows that our model outperforms the baseline by a substantial margin. Comment: Accepted to Transactions in GI

arXiv.org e-Print Archive

Crossref

Explore Bristol Research

Correcting Knowledge Base Assertions

Author: Arndt Dörthe
Auer Sören
Chen Jiaoyan
De Melo Gerard
Dimou Anastasia
Lertvittayakumjorn Piyawat
Melo André
Niklaus Christina
Omran Pouya Ghiasnezhad
Trouillon Théo
Vrandečić Denny
Zhang Wen
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020

The usefulness and usability of knowledge bases (KBs) are often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB.
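To make the lexical matching ingredient concrete, the sketch below ranks candidate replacement entities by string similarity to a suspicious label, using Python's standard `difflib` module. The entity labels and the function name `lexical_candidates` are invented for illustration, and plain string similarity is only a stand-in: the abstract does not say how the framework's lexical matching is implemented.

```python
import difflib

# Made-up entity labels standing in for a knowledge base vocabulary.
KB_LABELS = ["Barack Obama", "Michelle Obama", "Barcelona"]


def lexical_candidates(label, vocabulary, limit=3, cutoff=0.6):
    """Return up to `limit` vocabulary labels similar to `label`, best first.

    Uses difflib.get_close_matches, which scores candidates with
    SequenceMatcher ratios and drops those below `cutoff`.
    """
    return difflib.get_close_matches(label, vocabulary, n=limit, cutoff=cutoff)


# A misspelled object of an assertion; the closest label is a natural
# first candidate for the correction.
candidates = lexical_candidates("Barack Obma", KB_LABELS)
best = candidates[0]
print(best)  # Barack Obama
```

In a full pipeline such candidates would presumably be filtered further, for instance by the embedding and consistency checks the abstract lists, before any assertion is changed.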

arXiv.org e-Print Archive

City Research Online

Crossref

NIVA Open Access Archive

NORA - Norwegian Open Research Archives

When Things Matter: A Data-Centric View of the Internet of Things

Author: Dustdar Schahram
Falkner Nickolas J. G.
Qin Yongrui
Sheng Quan Z.
Vasilakos Athanasios V.
Wang Hua
Publication date: 01/01/2014

With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, which is typically produced in dynamic and volatile environments and is not only extremely large in scale and volume, but also noisy and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed.

arXiv.org e-Print Archive

Victoria University Eprints Repository

Joint Video and Text Parsing for Understanding Events and Answering Queries

Author: Choe Tae Eun
Lee Mun Wai
Meng Meng
Tu Kewei
Zhu Song-Chun
Publication date: 21/02/2014

We propose a framework for parsing video and text jointly for understanding events and answering user queries. Our framework produces a parse graph that represents the compositional structures of spatial information (objects and scenes), temporal information (actions and events) and causal information (causalities between events and fluents) in the video and text. The knowledge representation of our framework is based on a spatial-temporal-causal And-Or graph (S/T/C-AOG), which jointly models possible hierarchical compositions of objects, scenes and events as well as their interactions and mutual contexts, and specifies the prior probabilistic distribution of the parse graphs. We present a probabilistic generative model for joint parsing that captures the relations between the input video/text, their corresponding parse graphs and the joint parse graph. Based on the probabilistic model, we propose a joint parsing system consisting of three modules: video parsing, text parsing and joint inference. Video parsing and text parsing produce two parse graphs from the input video and text respectively. The joint inference module produces a joint parse graph by performing matching, deduction and revision on the video and text parse graphs. The proposed framework has the following objectives: first, we aim at deep semantic parsing of video and text that goes beyond the traditional bag-of-words approaches; second, we perform parsing and reasoning across the spatial, temporal and causal dimensions based on the joint S/T/C-AOG representation; third, we show that deep joint parsing facilitates subsequent applications such as generating narrative text descriptions and answering queries in the forms of who, what, when, where and why. We empirically evaluated our system by comparison against ground truth as well as by accuracy of query answering, and obtained satisfactory results.

arXiv.org e-Print Archive

CiteSeerX

Documenting Knowledge Graph Embedding and Link Prediction using Knowledge Graphs

Author: Zhao Huaxia
Publication venue: Hannover : Gottfried Wilhelm Leibniz Universität
Publication date: 02/02/2024

In recent years, sub-symbolic learning, i.e., Knowledge Graph Embedding (KGE) incorporated with Knowledge Graphs (KGs), has gained significant attention in various downstream tasks (e.g., Link Prediction (LP)). These techniques learn a latent vector representation of a KG's semantic structure to infer missing links. Nonetheless, KGE models remain a black box, and the decision-making process behind them is not clear. Thus, the trustworthiness and reliability of the models' outcomes have been challenged. While many state-of-the-art approaches provide data-driven frameworks to address these issues, they do not always provide a complete understanding, and the interpretations are not machine-readable. That is why, in this work, we extend a hybrid interpretable framework, InterpretME, to KGE models, especially translation distance models, which include TransE, TransH, TransR, and TransD. The experimental evaluation on various benchmark KGs supports the validity of this approach, which we term Trace KGE. Trace KGE, in particular, contributes to increased interpretability and understanding of the perplexing behavior of KGE models.
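For readers unfamiliar with translation distance models, the sketch below shows the standard TransE scoring idea: a triple (head, relation, tail) is plausible when the head embedding plus the relation embedding lands close to the tail embedding. The toy two-dimensional vectors and the function name `transe_score` are made up for illustration; this is not code from the thesis or from InterpretME, and it omits training entirely.

```python
import math


def transe_score(head, relation, tail):
    """Negative L2 distance between (head + relation) and tail.

    Higher (closer to zero) means the triple is considered more plausible.
    """
    squared = sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    return -math.sqrt(squared)


# Toy embeddings with invented values, for illustration only.
berlin = [1.0, 0.0]
capital_of = [0.0, 1.0]
germany = [1.0, 1.0]
france = [4.0, 5.0]

# berlin + capital_of equals germany exactly, so the distance is 0.
plausible = transe_score(berlin, capital_of, germany)
# berlin + capital_of differs from france by (-3, -4), so the distance is 5.
implausible = transe_score(berlin, capital_of, france)

print(plausible > implausible)  # True
```

Link prediction with such a model amounts to ranking candidate tails (or heads) by this score, which is why a single number per triple gives little insight into why a link was predicted.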

Institutionelles Repositorium der Leibniz Universität Hannover