AMaχoS—Abstract Machine for Xcerpt
Web query languages promise convenient and efficient access
to Web data such as XML, RDF, or Topic Maps. Xcerpt is one such Web
query language with strong emphasis on novel high-level constructs for
effective and convenient query authoring, particularly tailored to versatile
access to data in different Web formats such as XML or RDF.
However, so far it lacks an efficient implementation to supplement the
convenient language features. AMaχoS is an abstract machine implementation
for Xcerpt that aims at efficiency and ease of deployment. It
strictly separates compilation and execution of queries: Queries are compiled
once to abstract machine code that consists of (1) a code segment
with instructions for evaluating each rule and (2) a hint segment that
provides the abstract machine with optimization hints derived by the
query compilation. This article summarizes the motivation and principles
behind AMaχoS and discusses how its current architecture realizes
these principles.
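The two-segment program layout described above can be pictured with a short sketch. This is an illustrative mock-up, not the actual AMaχoS data structures or instruction set; the names (Instruction, CodeSegment, HintSegment, compile_query) and opcodes are assumptions made for exposition.

```python
# Hypothetical sketch of a compiled query split into a code segment
# (one instruction sequence per rule) and a hint segment consulted at run time.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Instruction:
    opcode: str              # e.g. "MATCH_ELEM", "BIND_VAR", "CONSTRUCT" (invented)
    operands: tuple = ()

@dataclass
class CodeSegment:
    # One instruction list per query rule.
    rules: Dict[str, List[Instruction]] = field(default_factory=dict)

@dataclass
class HintSegment:
    # Optimization hints derived during compilation, e.g. selectivity estimates,
    # that guide the abstract machine without changing the compiled code.
    hints: Dict[str, dict] = field(default_factory=dict)

@dataclass
class CompiledQuery:
    code: CodeSegment
    hints: HintSegment

def compile_query(rule_bodies: Dict[str, str]) -> CompiledQuery:
    """Compile once; the resulting program can then be executed many times."""
    code = CodeSegment({name: [Instruction("MATCH_ELEM", (body,))]
                        for name, body in rule_bodies.items()})
    hints = HintSegment({name: {"estimated_selectivity": 1.0} for name in rule_bodies})
    return CompiledQuery(code, hints)
```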
Data queries over heterogeneous sources
Dissertation submitted for the Master's degree in Informatics Engineering.

Enterprises typically have their data spread over many software systems, such as
custom made applications, CRM systems like SalesForce, CMS systems, or ERP systems
like SAP. In these settings, it is often desirable to integrate information from many data
sources to accomplish some business goal in an application. Data may be stored locally
or in the cloud in a wide variety of ways, requiring explicit transformation processes
to be defined, which makes it hard for developers to integrate it. Moreover, the amount
of external data can be large, and the difference in efficiency between a smart and a naive
way of retrieving and filtering data from different locations can be substantial. Hence, it is
clear that developers would benefit greatly from language abstractions to help them build
queries over heterogeneous data sources and from an optimization process that avoids
large and unnecessary data transfers during the execution of queries.
This project was developed at OutSystems and aims at extending a real product, which
makes it even more challenging. We followed a generic approach that can be implemented
in any framework, rather than one focused only on the OutSystems product.
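The kind of language abstraction and optimization argued for above can be sketched briefly. This is not OutSystems code; the Source wrapper, the push-down of filters, and the naive local join are hypothetical stand-ins meant only to show how avoiding full-table transfers fits into a query over two sources.

```python
# Illustrative sketch: each remote source applies its filter server-side, so only
# the needed rows travel over the network before a small local join.
from typing import Callable, Dict, List

class Source:
    """Wraps a remote data source that can evaluate a filter expression itself."""
    def __init__(self, name: str, fetch: Callable[[str], List[dict]]):
        self.name = name
        self.fetch = fetch          # fetch(filter_expr) -> list of row dicts

def query(sources: Dict[str, Source], filters: Dict[str, str],
          join_key: str) -> List[dict]:
    # Push each filter down to its source instead of downloading full tables.
    partials = {name: src.fetch(filters.get(name, "")) for name, src in sources.items()}
    # Naive local hash join on the shared key (assumes exactly two sources).
    first, second = list(partials)
    index = {row[join_key]: row for row in partials[first]}
    return [{**index[row[join_key]], **row}
            for row in partials[second] if row[join_key] in index]
```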
Neural Architecture for Question Answering Using a Knowledge Graph and Web Corpus
In Web search, entity-seeking queries often trigger a special Question
Answering (QA) system. It may use a parser to interpret the question to a
structured query, execute that on a knowledge graph (KG), and return direct
entity responses. QA systems based on precise parsing tend to be brittle: minor
syntax variations may dramatically change the response. Moreover, KG coverage
is patchy. At the other extreme, a large corpus may provide broader coverage,
but in an unstructured, unreliable form. We present AQQUCN, a QA system that
gracefully combines KG and corpus evidence. AQQUCN accepts a broad spectrum of
query syntax, from well-formed questions to short 'telegraphic' keyword
sequences. In the face of inherent query ambiguities, AQQUCN aggregates signals
from KGs and large corpora to directly rank KG entities, rather than commit to
one semantic interpretation of the query. AQQUCN models the ideal
interpretation as an unobservable or latent variable. Interpretations and
candidate entity responses are scored as pairs, by combining signals from
multiple convolutional networks that operate collectively on the query, KG and
corpus. On four public query workloads, amounting to over 8,000 queries with
diverse query syntax, we see 5--16% absolute improvement in mean average
precision (MAP), compared to the entity ranking performance of recent systems.
Our system is also competitive at entity set retrieval, almost doubling F1
scores for challenging short queries.

Comment: Accepted to the Information Retrieval Journal.
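The treatment of the interpretation as a latent variable can be illustrated with a minimal sketch. The scoring function below is a stub, not the actual AQQUCN convolutional networks; the point is only that each candidate entity aggregates evidence over all interpretations rather than committing to a single parse.

```python
# Minimal sketch: rank entities by summing (exponentiated) scores over latent
# query interpretations, instead of picking one interpretation up front.
import math
from typing import Dict, List, Tuple

def score_pair(query: str, interpretation: str, entity: str) -> float:
    # Stand-in for the combined CNN signals over query, KG, and corpus.
    return float(len(set(query.split()) & set(interpretation.split())))

def rank_entities(query: str, interpretations: List[str],
                  candidates: Dict[str, List[str]]) -> List[Tuple[str, float]]:
    scores: Dict[str, float] = {}
    for interp in interpretations:
        for entity in candidates.get(interp, []):
            # Aggregate over interpretations: each one contributes soft evidence.
            scores[entity] = scores.get(entity, 0.0) + math.exp(score_pair(query, interp, entity))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```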
MICSIM : Concept, Developments and Applications of a PC-Microsimulation Model for Research and Teaching
It is the growing societal interest in the individual and in individual behaviour in our 'modern' societies that calls for microanalyses of individual situations. To enable such microanalyses on a quantitative, empirically grounded level, microsimulation models were developed and are increasingly used for economic and social policy impact analyses. Although microsimulation is known and applied (mainly by experts), an easy-to-use and powerful PC microsimulation model is hard to find. The overall aim of this study, and of MICSIM - A PC Microsimulation Model, is to describe and offer such a user-friendly and powerful general microsimulation model for (almost) any PC, to support impact microanalyses in both applied research and teaching. Above all, MICSIM is a general microdata handler for a wide range of typical microanalysis requirements. This paper presents the concept, developments and applications of MICSIM. After some brief remarks on microsimulation characteristics in general, the concept and substantive domains of MICSIM - the simulation, the adjustment and aging, and the evaluation of microdata - are described by their mode of operation in principle. The realisations and developments of MICSIM are then portrayed through the different versions of the computer program. Some MICSIM applications and experiences in research and teaching follow, with concluding remarks.

Keywords: Economic and Social Policy Analyses, Microsimulation (dynamic and static), Simulation, Adjustment and Evaluation of Microdata, PC Computer Program for Microanalyses in General
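A static microsimulation step of the kind such models perform can be sketched in a few lines. The tax rules, variable names, and weights below are invented for illustration and have nothing to do with MICSIM's actual rule sets; the sketch only shows how weighted microdata records are pushed through a baseline and a reform scenario and then aggregated.

```python
# Hedged sketch of a static microsimulation: apply two hypothetical tax rules
# to weighted microdata records and compare the weighted aggregates.
from typing import Dict, List

def tax_baseline(income: float) -> float:
    return 0.25 * income                                   # invented flat rate

def tax_reform(income: float) -> float:
    return 0.20 * income if income < 30_000 else 0.30 * income  # invented reform

def simulate(records: List[Dict[str, float]]) -> Dict[str, float]:
    """Each record carries an 'income' and a survey 'weight' grossing it up."""
    base = sum(r["weight"] * tax_baseline(r["income"]) for r in records)
    reform = sum(r["weight"] * tax_reform(r["income"]) for r in records)
    return {"baseline_revenue": base, "reform_revenue": reform,
            "difference": reform - base}

# Example with two illustrative households:
print(simulate([{"income": 20_000, "weight": 1_500},
                {"income": 60_000, "weight": 800}]))
```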
Physical Representation-based Predicate Optimization for a Visual Analytics Database
Querying the content of images, video, and other non-textual data sources
requires expensive content extraction methods. Modern extraction techniques are
based on deep convolutional neural networks (CNNs) and can classify objects
within images with astounding accuracy. Unfortunately, these methods are slow:
processing a single image can take about 10 milliseconds on modern GPU-based
hardware. As massive video libraries become ubiquitous, running a content-based
query over millions of video frames is prohibitive.
One promising approach to reduce the runtime cost of queries of visual
content is to use a hierarchical model, such as a cascade, where simple cases
are handled by an inexpensive classifier. Prior work has sought to design
cascades that optimize the computational cost of inference by, for example,
using smaller CNNs. However, we observe that there are critical factors besides
the inference time that dramatically impact the overall query time. Notably, by
treating the physical representation of the input image as part of our query
optimization---that is, by including image transforms, such as resolution
scaling or color-depth reduction, within the cascade---we can optimize data
handling costs and enable drastically more efficient classifier cascades.
In this paper, we propose Tahoma, which generates and evaluates many
potential classifier cascades that jointly optimize the CNN architecture and
input data representation. Our experiments on a subset of ImageNet show that
Tahoma's input transformations speed up cascades by up to 35 times. We also
find up to a 98x speedup over the ResNet50 classifier with no loss in accuracy,
and a 280x speedup if some accuracy is sacrificed.

Comment: Camera-ready version of the paper submitted to ICDE 2019. In Proceedings of the 35th IEEE International Conference on Data Engineering (ICDE 2019).
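The core idea of folding the physical input representation into the cascade can be pictured with a short sketch. The transforms, threshold, and model interfaces below are placeholders, not Tahoma's generated cascades: the point is only that cheap stages see reduced inputs and expensive stages are reached only when needed.

```python
# Simplified cascade sketch: a cheap stage classifies a downscaled, depth-reduced
# image; only low-confidence inputs are escalated to the expensive full model.
from typing import Callable, Tuple
import numpy as np

def downscale(img: np.ndarray, factor: int) -> np.ndarray:
    return img[::factor, ::factor]                  # crude resolution scaling

def reduce_depth(img: np.ndarray, bits: int) -> np.ndarray:
    shift = 8 - bits                                # assumes 8-bit channels
    return (img >> shift) << shift                  # crude color-depth reduction

def cascade(img: np.ndarray,
            cheap_model: Callable[[np.ndarray], Tuple[int, float]],
            full_model: Callable[[np.ndarray], Tuple[int, float]],
            threshold: float = 0.9) -> int:
    # Stage 1: small, low-depth input keeps both inference and data handling cheap.
    label, confidence = cheap_model(reduce_depth(downscale(img, 4), 4))
    if confidence >= threshold:
        return label
    # Stage 2: fall back to the expensive classifier on the original image.
    label, _ = full_model(img)
    return label
```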
GignoMDA
Database systems are often used as a persistence layer for applications. This implies that database schemas are generated out of transient programming class descriptions. The basic idea of the MDA approach generalizes this principle by providing a framework to generate applications (and database schemas) for different programming platforms. Within our GignoMDA project [3]--which is the subject of this demo proposal--we have extended classic concepts for code generation. That means our approach provides a single point of truth describing all aspects of database applications (e.g. database schema, project documentation,...) with great potential for cross-layer optimization. These cross-layer optimization hints offer a novel way to address the challenging global optimization of multi-tier database applications. The demo at VLDB comprises an in-depth explanation of our concepts and of the prototypical implementation, directly demonstrating the modeling and the automatic generation of database applications.
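The "single point of truth" generation idea can be illustrated with a small sketch. This is not the GignoMDA implementation; the ClassModel structure, type mapping, and DDL template are assumptions chosen to show how one class description could be turned into a database schema, with other layers generated from the same model in the same spirit.

```python
# Illustrative sketch: derive a SQL schema from a single class description,
# the same model that could also drive documentation or application code.
from dataclasses import dataclass
from typing import Dict

@dataclass
class ClassModel:
    name: str
    attributes: Dict[str, str]        # attribute name -> abstract type

TYPE_MAP = {"string": "VARCHAR(255)", "int": "INTEGER", "date": "DATE"}

def to_ddl(model: ClassModel) -> str:
    cols = ",\n  ".join(f"{attr} {TYPE_MAP[t]}" for attr, t in model.attributes.items())
    return f"CREATE TABLE {model.name} (\n  id INTEGER PRIMARY KEY,\n  {cols}\n);"

# Example: generate the table for a hypothetical Customer class.
print(to_ddl(ClassModel("Customer", {"name": "string", "since": "date"})))
```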