CP-nets: A Tool for Representing and Reasoning with Conditional Ceteris Paribus Preference Statements
Information about user preferences plays a key role in automated decision
making. In many domains it is desirable to assess such preferences in a
qualitative rather than quantitative way. In this paper, we propose a
qualitative graphical representation of preferences that reflects conditional
dependence and independence of preference statements under a ceteris paribus
(all else being equal) interpretation. Such a representation is often compact
and arguably quite natural in many circumstances. We provide a formal semantics
for this model, and describe how the structure of the network can be exploited
in several inference tasks, such as determining whether one outcome dominates
(is preferred to) another, ordering a set of outcomes according to the preference
relation, and constructing the best outcome subject to available evidence.
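The last inference task, constructing the best outcome given evidence, can be illustrated for an acyclic network with a forward sweep: visit variables parents-first and give each one its most preferred value given the values already fixed. The sketch below is a minimal illustration, not the paper's implementation; the toy network `CP_NET`, its variable names, and the function `best_outcome` are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical toy CP-net: each variable maps to (parents, table).
# The table maps a tuple of parent values to this variable's values,
# ordered from most to least preferred, all else being equal.
CP_NET = {
    "main": ((), {(): ["fish", "meat"]}),
    "wine": (("main",), {("fish",): ["white", "red"],
                         ("meat",): ["red", "white"]}),
}

def best_outcome(net, evidence=None):
    """Forward sweep over an acyclic CP-net, respecting fixed evidence."""
    outcome = dict(evidence or {})
    # Map each variable to its parents so that parents are visited first.
    graph = {var: set(parents) for var, (parents, _) in net.items()}
    for var in TopologicalSorter(graph).static_order():
        if var in outcome:
            continue  # value fixed by evidence
        parents, table = net[var]
        parent_values = tuple(outcome[p] for p in parents)
        outcome[var] = table[parent_values][0]  # most preferred value
    return outcome

if __name__ == "__main__":
    print(best_outcome(CP_NET))
    print(best_outcome(CP_NET, {"main": "meat"}))
```

With the evidence `{"main": "meat"}`, the sweep keeps the fixed main course and selects the wine preferred in that context.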
Probabilistic abductive logic programming using Dirichlet priors
Probabilistic programming is an area of research that aims to develop general inference algorithms for probabilistic models expressed as probabilistic programs whose execution corresponds to inferring the parameters of those models. In this paper, we introduce a probabilistic programming language (PPL) based on abductive logic programming for performing inference in probabilistic models involving categorical distributions with Dirichlet priors. We encode these models as abductive logic programs enriched with probabilistic definitions and queries, and show how to execute and compile them to Boolean formulas. Using the latter, we perform generalized inference using one of two proposed Markov Chain Monte Carlo (MCMC) sampling algorithms: an adaptation of uncollapsed Gibbs sampling from related work and a novel collapsed Gibbs sampling (CGS). We show that CGS converges faster than the uncollapsed version on a latent Dirichlet allocation (LDA) task using synthetic data. On similar data, we compare our PPL with LDA-specific algorithms and other PPLs. We find that all methods, except one, perform similarly and that the more expressive the PPL, the slower it is. We illustrate applications of our PPL on real data in two variants of LDA models (Seed and Cluster LDA), and in the repeated insertion model (RIM). In the latter, our PPL yields similar conclusions to inference with EM for Mallows models.
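For readers unfamiliar with collapsed Gibbs sampling, the sketch below shows the textbook LDA-specific sampler, in which the Dirichlet-distributed parameters are integrated out and only topic assignments are resampled. It is not the PPL-based CGS algorithm the abstract proposes; the function name `collapsed_gibbs_lda`, its parameters, and the default hyperparameters are assumptions made for illustration.

```python
import random

def collapsed_gibbs_lda(docs, n_topics, vocab_size,
                        alpha=0.1, beta=0.01, sweeps=50, seed=0):
    """Textbook collapsed Gibbs sampler for LDA.

    docs is a list of documents, each a list of word ids in
    range(vocab_size). Returns topic assignments and count tables.
    """
    rng = random.Random(seed)
    # z[d][i] is the topic currently assigned to token i of document d.
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    n_dk = [[0] * n_topics for _ in docs]               # document-topic counts
    n_kw = [[0] * vocab_size for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                                # tokens per topic
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1

    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Unnormalized conditional over topics, with the
                # Dirichlet-distributed parameters integrated out.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Record the new assignment.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return z, n_dk, n_kw
```

An uncollapsed sampler would instead alternate between sampling the topic and word distributions explicitly and sampling the assignments given them.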
Learning what matters - Sampling interesting patterns
In the field of exploratory data mining, local structure in data can be
described by patterns and discovered by mining algorithms. Although many
solutions have been proposed to address the redundancy problems in pattern
mining, most of them either provide succinct pattern sets or take the interests
of the user into account, but not both. Consequently, the analyst has to invest
substantial effort in identifying those patterns that are relevant to her
specific interests and goals. To address this problem, we propose a novel
approach that combines pattern sampling with interactive data mining. In
particular, we introduce the LetSIP algorithm, which builds upon recent
advances in 1) weighted sampling in SAT and 2) learning to rank in interactive
pattern mining. Specifically, it exploits user feedback to directly learn the
parameters of the sampling distribution that represents the user's interests.
We compare the performance of the proposed algorithm to the state-of-the-art in
interactive pattern mining by emulating the interests of a user. The resulting
system allows efficient and interleaved learning and sampling, and thus
user-specific anytime data exploration. Finally, LetSIP demonstrates favourable
trade-offs concerning both quality-diversity and exploitation-exploration when
compared to existing methods. Comment: PAKDD 2017, extended version
Top-k Route Search through Submodularity Modeling of Recurrent POI Features
We consider a practical top-k route search problem: given a collection of
points of interest (POIs) with rated features and traveling costs between POIs,
a user wants to find k routes from a source to a destination, within a
cost budget, that maximally match her needs on feature preferences. One
challenge is dealing with the personalized diversity requirement, where users
make different trade-offs between quantity (the number of POIs with a specified
feature) and variety (the coverage of specified features). Another challenge is
the large scale of the POI map and the great many alternative routes to search.
We model the personalized diversity requirement by the whole class of
submodular functions, and present an optimal solution to the top-k route search
problem through indices for retrieving relevant POIs in both feature and route
spaces and various strategies for pruning the search space using user
preferences and constraints. We also present promising heuristic solutions and
evaluate all the solutions on real-life data. Comment: 11 pages, 7 figures, 2 tables
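To make the quantity versus variety trade-off concrete, the sketch below scores a route with one simple submodular function: a concave transform (square root) of the total rating per feature, weighted by user preference. This is only one member of the class the abstract refers to and is not claimed to be the paper's scoring function; the POI data, the weights, and the name `route_gain` are hypothetical.

```python
import math

# Hypothetical POI ratings per feature (illustrative values only).
POIS = {
    "museum_a": {"art": 0.9},
    "museum_b": {"art": 0.8},
    "park_c": {"nature": 0.7},
}

def route_gain(route, weights):
    """Concave-over-modular score of the set of POIs on a route.

    For each feature, ratings are summed over the distinct POIs on the
    route, and the square root gives diminishing returns for repeated
    visits to the same feature.
    """
    total = 0.0
    for feature, weight in weights.items():
        mass = sum(POIS[p].get(feature, 0.0) for p in set(route))
        total += weight * math.sqrt(mass)
    return total

if __name__ == "__main__":
    prefs = {"art": 1.0, "nature": 1.0}
    base = route_gain(["museum_a"], prefs)
    # Marginal gain of a second art POI versus a first nature POI.
    print(route_gain(["museum_a", "museum_b"], prefs) - base)
    print(route_gain(["museum_a", "park_c"], prefs) - base)
```

A user who values quantity over variety could be modelled with a less sharply concave transform, which shrinks the penalty for repeating a feature.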