19,383 research outputs found
Infinite Probabilistic Databases
Probabilistic databases (PDBs) are used to model uncertainty in data in a quantitative way. In the standard formal framework, PDBs are finite probability spaces over relational database instances. It has been argued convincingly that this is not compatible with an open-world semantics (Ceylan et al., KR 2016) and with application scenarios that are modeled by continuous probability distributions (Dalvi et al., CACM 2009).
We recently introduced a model of PDBs as infinite probability spaces that addresses these issues (Grohe and Lindner, PODS 2019). While that work was mainly concerned with countably infinite probability spaces, our focus here is on uncountable spaces. Such an extension is necessary to model typical continuous probability distributions that appear in many applications. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics.
It turns out that so-called finite point processes are the appropriate model from probability theory for dealing with probabilistic databases. This model allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries
Spontaneous Analogy by Piggybacking on a Perceptual System
Most computational models of analogy assume they are given a delineated
source domain and often a specified target domain. These systems do not address
how analogs can be isolated from large domains and spontaneously retrieved from
long-term memory, a process we call spontaneous analogy. We present a system
that represents relational structures as feature bags. Using this
representation, our system leverages perceptual algorithms to automatically
create an ontology of relational structures and to efficiently retrieve analogs
for new relational structures from long-term memory. We provide a demonstration
of our approach that takes a set of unsegmented stories, constructs an ontology
of analogical schemas (corresponding to plot devices), and uses this ontology
to efficiently find analogs within new stories, yielding significant
time-savings over linear analog retrieval at a small accuracy cost.Comment: Proceedings of the 35th Meeting of the Cognitive Science Society,
201
Scalable Querying of Nested Data
While large-scale distributed data processing platforms have become an attractive target for query processing, these systems are problematic for applications that deal with nested collections. Programmers are forced either to perform non-trivial translations of collection programs or to employ automated flattening procedures, both of which lead to performance problems. These challenges only worsen for nested collections with skewed cardinalities, where both handcrafted rewriting and automated flattening are unable to enforce load balancing across partitions.
In this work, we propose a framework that translates a program manipulating nested collections into a set of semantically equivalent shredded queries that can be efficiently evaluated. The framework employs a combination of query compilation techniques, an efficient data representation for nested collections, and automated skew-handling. We provide an extensive experimental evaluation, demonstrating significant improvements provided by the framework in diverse scenarios for nested collection programs
The fermion bag approach to lattice field theories
We propose a new approach to the fermion sign problem in systems where there
is a coupling such that when it is infinite the fermions are paired into
bosons and there is no fermion permutation sign to worry about. We argue that
as becomes finite fermions are liberated but are naturally confined to
regions which we refer to as {\em fermion bags}. The fermion sign problem is
then confined to these bags and may be solved using the determinantal trick. In
the parameter regime where the fermion bags are small and their typical size
does not grow with the system size, construction of Monte Carlo methods that
are far more efficient than conventional algorithms should be possible. In the
region where the fermion bags grow with system size, the fermion bag approach
continues to provide an alternative approach to the problem but may lose its
main advantage in terms of efficiency. The fermion bag approach also provides
new insights and solutions to sign problems. A natural solution to the "silver
blaze problem" also emerges. Using the three dimensional massless lattice
Thirring model as an example we introduce the fermion bag approach and
demonstrate some of these features. We compute the critical exponents at the
quantum phase transition and find and .Comment: 31 pages, 9 figures, 5 table
A Semantics-Based Approach to Design of Query Languages for Partial Information
Most of work on partial information in databases asks which operations of standard languages, like relational algebra, can still be performed correctly in the presence of nulls. In this paper a different point of view is advocated. We believe that the semantics of partiality must be clearly understood and it should give us new design principles for languages for databases with partial information. There are different sources of partial information, such as missing information and conflicts that occur when different databases are merged. In this paper, we develop a common semantic framework for them which can be applied in a context more general than the flat relational model. This ordered semantics, which is based on ideas used in the semantics of programming languages, cleanly intergrates all kinds of partial information and serves as a tool to establish connections between them. Analyzing properties of semantic domains of types suitable for representing partial information, we come up with operations that are naturally associated with those types, and we organize programming syntax around these operations. We show how the languages that we obtain can be used to ask typical queries about incomplete information in relational databases, and how they can express some previously proposed languages. Finally, we discuss a few related topics such as mixing traditional constraints with partial information and extending semantics and languages to accommodate bags and recursive types
- …