RDF Querying
Reactive Web systems, Web services, and Web-based publish/subscribe
systems communicate events as XML messages, and in
many cases require composite event detection: it is not sufficient to react
to single event messages, but events have to be considered in relation to
other events that are received over time.
Emphasizing language design and formal semantics, we describe the
rule-based query language XChangeEQ for detecting composite events.
XChangeEQ is designed to completely cover and integrate the four complementary
querying dimensions: event data, event composition, temporal
relationships, and event accumulation. Semantics are provided as
model and fixpoint theories; while this is an established approach for rule
languages, it has not previously been applied to event queries.
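
The idea of composite event detection can be illustrated with a minimal Python sketch. This is not XChangeEQ syntax or its semantics; the event fields (`type`, `key`, `time`) and the function `detect_sequences` are hypothetical names chosen for illustration. The sketch matches a `first` event followed by a `second` event carrying the same key within a time window, which touches on event data, event composition, and temporal relationships, but not event accumulation.

```python
def detect_sequences(events, first, second, window):
    """Return (a, b) pairs where an event of type `first` is followed by an
    event of type `second` with the same key, at most `window` time units later.

    Assumption: each event is a dict with 'type', 'key', and 'time' fields.
    """
    pending = {}   # key -> `first` events that are still inside the window
    matches = []
    for ev in sorted(events, key=lambda e: e["time"]):
        # Discard earlier `first` events that have fallen out of the window.
        live = [a for a in pending.get(ev["key"], [])
                if ev["time"] - a["time"] <= window]
        pending[ev["key"]] = live
        if ev["type"] == second:
            matches.extend((a, ev) for a in live)
        elif ev["type"] == first:
            live.append(ev)
    return matches


events = [
    {"type": "order", "key": "a", "time": 0},
    {"type": "payment", "key": "a", "time": 5},
    {"type": "order", "key": "b", "time": 1},
    {"type": "payment", "key": "b", "time": 20},
]
matches = detect_sequences(events, "order", "payment", window=10)
```

Here only the order and payment for key "a" form a composite event; the payment for key "b" arrives after the window has closed.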
Distribution Constraints: The Chase for Distributed Data
This paper introduces a declarative framework to specify and reason about distributions of data over computing nodes in a distributed setting. More specifically, it proposes distribution constraints, which are tuple- and equality-generating dependencies (tgds and egds) extended with node variables ranging over computing nodes. In particular, they can express co-partitioning constraints and, by using comparison atoms, constraints about range-based data distributions. The main technical contribution is the study of the implication problem for distribution constraints. While implication is undecidable in general, relevant fragments of so-called data-full constraints are exhibited for which the corresponding implication problems are complete for EXPTIME, PSPACE and NP. These results yield bounds on deciding parallel-correctness for conjunctive queries in the presence of distribution constraints.
Forward Private Searchable Symmetric Encryption with Optimized I/O Efficiency
Recently, several practical attacks raised serious concerns over the security
of searchable encryption. The attacks have brought emphasis on forward privacy,
which is the key concept behind solutions to the adaptive leakage-exploiting
attacks and will very likely become mandatory in the design of new
searchable encryption schemes. For a long time, forward privacy implied
inefficiency, and thus most existing searchable encryption schemes do not
support it. Very recently, Bost (CCS 2016) showed that forward privacy can be
obtained without inducing a large communication overhead. However, Bost's
scheme is constructed with a relatively inefficient public-key cryptographic
primitive and has poor I/O performance. Both deficiencies
significantly hinder the practical efficiency of the scheme, and prevent it
from scaling to large data settings. To address the problems, we first present
FAST, which achieves forward privacy and the same communication efficiency as
Bost's scheme, but uses only symmetric cryptographic primitives. We then
present FASTIO, which retains all good properties of FAST, and further improves
I/O efficiency. We implemented the two schemes and compared their performance
with Bost's scheme. The experimental results show that both of our schemes are
highly efficient, and FASTIO achieves much better scalability due to its
optimized I/O.
Terminology server for improved resource discovery: analysis of model and functions
This paper considers the potential to improve distributed information retrieval via a terminologies server. The restriction upon effective resource discovery caused by the use of disparate terminologies across services and collections is outlined, before considering a DDC spine-based approach involving inter-scheme mapping as a possible solution. The developing HILT model is discussed alongside other existing models and alternative approaches to solving the terminologies problem. Results from the current HILT pilot are presented to illustrate functionality, and suggestions are made for further research and development.
Balancing clusters to reduce response time variability in large scale image search
Many algorithms for approximate nearest neighbor search in high-dimensional
spaces partition the data into clusters. At query time, in order to avoid
exhaustive search, an index selects the few (or a single) clusters nearest to
the query point. Clusters are often produced by the well-known k-means
approach since it has several desirable properties. On the downside, it tends
to produce clusters having quite different cardinalities. Imbalanced clusters
negatively impact both the variance and the expectation of query response
times. This paper proposes to modify k-means centroids to produce clusters
with more comparable sizes without sacrificing the desirable properties.
Experiments with a large scale collection of image descriptors show that our
algorithm significantly reduces the variance of response times without
seriously impacting the search quality.
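
The claim that imbalanced clusters hurt both the expectation and the variance of response times can be checked with a small Python sketch. It rests on simplifying assumptions that are not taken from the abstract: each query probes exactly one cluster, the cost of a query is the number of vectors in that cluster, and a query lands in cluster i with probability n_i / N (queries follow the distribution of the indexed data). The function name `scan_cost_stats` and the cluster sizes are illustrative only, and the sketch does not implement the paper's balancing algorithm.

```python
def scan_cost_stats(cluster_sizes):
    """Return (expected, variance) of the number of vectors scanned per query.

    Assumption: a query falls into cluster i with probability n_i / N and
    then scans all n_i vectors of that cluster.
    """
    total = sum(cluster_sizes)
    # E[cost] = sum_i (n_i / N) * n_i
    expected = sum(n * n for n in cluster_sizes) / total
    # E[cost^2] = sum_i (n_i / N) * n_i^2
    second_moment = sum(n ** 3 for n in cluster_sizes) / total
    return expected, second_moment - expected ** 2


# Two partitions of the same 1000 vectors into 4 clusters.
balanced = scan_cost_stats([250, 250, 250, 250])
imbalanced = scan_cost_stats([700, 150, 100, 50])
```

Under these assumptions the balanced partition scans 250 vectors per query with zero variance, while the imbalanced one scans 525 on average with a variance of 71875, because large clusters are both more likely to be hit and more expensive to scan.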
Toward Entity-Aware Search
As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning (entity as input and entity as output), we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results we have obtained so far show clear promise for entity-aware search in its usefulness, effectiveness, efficiency, and scalability.