Efficient computational strategies to learn the structure of probabilistic graphical models of cumulative phenomena
Structural learning of Bayesian Networks (BNs) is an NP-hard problem, which is
further complicated by many theoretical issues, such as the I-equivalence among
different structures. In this work, we focus on a specific subclass of BNs,
named Suppes-Bayes Causal Networks (SBCNs), which include specific structural
constraints based on Suppes' probabilistic causation to efficiently model
cumulative phenomena. Here we compare the performance, via extensive
simulations, of various state-of-the-art search strategies, such as local
search techniques and Genetic Algorithms, as well as of distinct regularization
methods. The assessment is performed on a large number of simulated datasets
from topologies with distinct levels of complexity, various sample sizes and
different rates of errors in the data. Among the main results, we show that the
introduction of Suppes' constraints dramatically improves the inference
accuracy, by reducing the solution space and providing a temporal ordering on
the variables. We also report on trade-offs among different search techniques
that can be efficiently employed in distinct experimental settings. This
manuscript is an extended version of the paper "Structural Learning of
Probabilistic Graphical Models of Cumulative Phenomena" presented at the 2018
International Conference on Computational Science.
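
As a rough illustration of how Suppes-style constraints can shrink the search space, the sketch below keeps only the candidate edges a -> b that satisfy probability raising, P(b | a) > P(b | not a), together with a temporal-priority condition. It rests on assumptions not stated in the abstract: the data are binary, temporal priority is approximated by comparing marginal frequencies (P(a) > P(b)), and the function name `suppes_candidate_edges` is hypothetical.

```python
from itertools import permutations


def suppes_candidate_edges(samples):
    """Return ordered pairs (a, b) that pass two Suppes-style filters.

    samples: list of equal-length 0/1 tuples, one tuple per observation.
    Assumption: temporal priority is approximated by P(a) > P(b).
    """
    n = len(samples)
    n_vars = len(samples[0])
    edges = []
    for a, b in permutations(range(n_vars), 2):
        n_a = sum(s[a] for s in samples)
        n_b = sum(s[b] for s in samples)
        if n_a == 0 or n_a == n:
            continue  # P(b | a) or P(b | not a) would be undefined
        if n_a <= n_b:
            continue  # temporal-priority proxy fails
        n_ab = sum(1 for s in samples if s[a] and s[b])
        p_b_given_a = n_ab / n_a
        p_b_given_not_a = (n_b - n_ab) / (n - n_a)
        if p_b_given_a > p_b_given_not_a:  # probability raising
            edges.append((a, b))
    return edges


# Variable 1 is only ever observed together with variable 0.
data = [(1, 1), (1, 1), (1, 0), (0, 0)]
print(suppes_candidate_edges(data))
```

A structure search (local search, a Genetic Algorithm, and so on) could then be restricted to this edge set instead of all ordered pairs, which is one way to read the "reduced solution space" mentioned above.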
Physical Data Independence, Constraints and Optimization with Universal Plans
We present an optimization method and algorithm designed for three objectives: physical data independence, semantic optimization, and generalized tableau minimization. The method relies on generalized forms of chase and backchase with constraints (dependencies). By using dictionaries (finite functions) in physical schemas we can capture with constraints useful access structures such as indexes, materialized views, source capabilities, access support relations, gmaps, etc. The search space for query plans is defined and enumerated in a novel manner: the chase phase rewrites the original query into a universal plan that integrates all the access structures and alternative pathways that are allowed by applicable constraints. Then, the backchase phase produces optimal plans by eliminating various combinations of redundancies, again according to constraints. This method is applicable (sound) to a large class of queries, physical access structures, and semantic constraints. We prove that it is in fact complete for path-conjunctive queries and views with complex objects, classes and dictionaries, going beyond previous theoretical work on processing queries using materialized views.
Constraining the Search Space in Temporal Pattern Mining
Agents in dynamic environments have to deal with complex situations including various temporal interrelations of actions and events. Discovering frequent patterns in such scenes can be useful for creating rules that predict future activities or situations. We present the algorithm MiTemP, which learns frequent patterns based on a time interval-based relational representation. Additionally, the problem has been transferred to a pure relational association rule mining task which can be handled by WARMR. The two approaches are compared in a number of experiments. The experiments show the advantage of avoiding the creation of impossible or redundant patterns with MiTemP. While fewer patterns have to be explored on average with MiTemP, more frequent patterns are found at an earlier refinement level.
Should Optimal Designers Worry About Consideration?
Consideration set formation using non-compensatory screening rules is a vital
component of real purchasing decisions with decades of experimental validation.
Marketers have recently developed statistical methods that can estimate
quantitative choice models that include consideration set formation via
non-compensatory screening rules. But is capturing consideration within models
of choice important for design? This paper reports on a simulation study of a
vehicle portfolio design when households screen over vehicle body style built
to explore the importance of capturing consideration rules for optimal
designers. We generate synthetic market share data, fit a variety of discrete
choice models to the data, and then optimize design decisions using the
estimated models. Model predictive power, design "error", and profitability
relative to ideal profits are compared as the amount of market data available
increases. We find that even when estimated compensatory models provide
relatively good predictive accuracy, they can lead to sub-optimal design
decisions when the population uses consideration behavior; convergence of
compensatory models to non-compensatory behavior is likely to require
unrealistic amounts of data; and modeling heterogeneity in non-compensatory
screening is more valuable than heterogeneity in compensatory trade-offs. This
supports the claim that designers should carefully identify consideration
behaviors before optimizing product portfolios. We also find that higher model
predictive power does not necessarily imply better design decisions; that is,
different model forms can provide "descriptive" rather than "predictive"
information that is useful for design.
Comment: 5 figures, 26 pages. In Press at ASME Journal of Mechanical Design
(as of 3/17/15)
From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back
In this work we establish and investigate connections between causes for
query answers in databases, database repairs wrt. denial constraints, and
consistency-based diagnosis. The first two are relatively new research areas in
databases, and the third one is an established subject in knowledge
representation. We show how to obtain database repairs from causes, and the
other way around. Causality problems are formulated as diagnosis problems, and
the diagnoses provide causes and their responsibilities. The vast body of
research on database repairs can be applied to the newer problems of computing
actual causes for query answers and their responsibilities. These connections,
which are interesting per se, allow us, after a transition (inspired by
consistency-based diagnosis) to computational problems on hitting sets and
vertex covers in hypergraphs, to obtain several new algorithmic and complexity
results for database causality.
Comment: To appear in Theory of Computing Systems. By invitation to special
issue with extended papers from ICDT 2015 (paper arXiv:1412.4311)
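
For readers unfamiliar with the hitting-set problems mentioned above: a hitting set of a hypergraph is a set of vertices that intersects every hyperedge. The brute-force sketch below enumerates all minimum-cardinality hitting sets. It illustrates only the generic combinatorial problem, not the paper's reduction; the function name `minimum_hitting_sets` and the tuple labels are hypothetical, and the exhaustive search is exponential in the number of vertices.

```python
from itertools import combinations


def minimum_hitting_sets(hyperedges):
    """Return every smallest vertex set that intersects all hyperedges.

    hyperedges: list of sets of hashable, sortable vertices.
    Returns [] when no hitting set exists (some hyperedge is empty).
    """
    vertices = sorted(set().union(*hyperedges))
    for k in range(len(vertices) + 1):
        found = [
            set(candidate)
            for candidate in combinations(vertices, k)
            if all(set(candidate) & edge for edge in hyperedges)
        ]
        if found:
            return found  # all hitting sets of the smallest size k
    return []


# Hypothetical example: each hyperedge is a set of tuples that jointly
# violate some constraint.
conflicts = [{"t1", "t2"}, {"t2", "t3"}, {"t3", "t4"}]
print(minimum_hitting_sets(conflicts))
```

Restricting the hyperedges to size two gives the ordinary vertex cover problem on graphs.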
Query Rewriting and Optimization for Ontological Databases
Ontological queries are evaluated against a knowledge base consisting of an
extensional database and an ontology (i.e., a set of logical assertions and
constraints which derive new intensional knowledge from the extensional
database), rather than directly on the extensional database. The evaluation and
optimization of such queries is an intriguing new problem for database
research. In this paper, we discuss two important aspects of this problem:
query rewriting and query optimization. Query rewriting consists of the
compilation of an ontological query into an equivalent first-order query
against the underlying extensional database. We present a novel query rewriting
algorithm for rather general types of ontological constraints which is
well-suited for practical implementations. In particular, we show how a
conjunctive query against a knowledge base, expressed using linear and sticky
existential rules, that is, members of the recently introduced Datalog+/-
family of ontology languages, can be compiled into a union of conjunctive
queries (UCQ) against the underlying database. Ontological query optimization,
in this context, attempts to improve this rewriting process so as to produce
possibly small and cost-effective UCQ rewritings for an input query.
Comment: arXiv admin note: text overlap with arXiv:1312.5914 by other authors
Computing Storyline Visualizations with Few Block Crossings
Storyline visualizations show the structure of a story, by depicting the
interactions of the characters over time. Each character is represented by an
x-monotone curve from left to right, and a meeting is represented by having the
curves of the participating characters run close together for some time. There
have been various approaches to drawing storyline visualizations in an
automated way. In order to keep the visual complexity low, rather than
minimizing pairwise crossings of curves, we count block crossings, that is,
pairs of intersecting bundles of lines.
Partly inspired by the ILP-based approach of Gronemann et al. [GD 2016] for
minimizing the number of pairwise crossings, we model the problem as a
satisfiability problem (since the straightforward ILP formulation becomes more
complicated and harder to solve). Having restricted ourselves to a decision
problem, we can apply powerful SAT solvers to find optimal drawings in
reasonable time. We compare this SAT-based approach with two exact algorithms
for block crossing minimization, using both the benchmark instances of
Gronemann et al. and random instances. We show that the SAT approach is
suitable for real-world instances and identify cases where the other algorithms
are preferable.
Comment: Appears in the Proceedings of the 25th International Symposium on
Graph Drawing and Network Visualization (GD 2017)