Complexity Results and Approximation Strategies for MAP Explanations
MAP is the problem of finding a most probable instantiation of a set of
variables given evidence. MAP has always been perceived to be significantly
harder than the related problems of computing the probability of a variable
instantiation Pr, or the problem of computing the most probable explanation
(MPE). This paper investigates the complexity of MAP in Bayesian networks.
Specifically, we show that MAP is complete for NP^PP and provide further
negative complexity results for algorithms based on variable elimination. We
also show that MAP remains hard even when MPE and Pr become easy. For example,
we show that MAP is NP-complete when the networks are restricted to polytrees,
and even then cannot be effectively approximated. Given the difficulty of
computing MAP exactly, and the difficulty of approximating MAP while providing
useful guarantees on the resulting approximation, we investigate best-effort
approximations. We introduce a generic MAP approximation framework and provide
two instantiations of it: one for networks that are amenable to exact
computation of Pr, and one for networks for which even exact inference is too
hard. This allows MAP approximation on networks that are too complex to even
exactly solve the easier problems, Pr and MPE. Experimental results indicate
that these approximation algorithms provide much better solutions than
standard techniques, and provide accurate MAP estimates in many cases.
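The distinction between MAP and MPE drawn in this abstract can be illustrated with a minimal brute-force sketch. The joint distribution below is hypothetical (the numbers are invented for illustration and do not come from the paper), and the function names `mpe` and `map_a` are assumptions. The sketch only shows that MAP over a subset of variables requires summing out the remaining ones, so its answer can differ from the corresponding part of the MPE assignment.

```python
from itertools import product

# Hypothetical joint distribution Pr(A, B, E) over three binary variables.
# Keys are (a, b, e); the values are invented and sum to 1.
JOINT = {
    (0, 0, 1): 0.20,
    (0, 1, 1): 0.20,
    (1, 0, 1): 0.25,
    (1, 1, 1): 0.05,
    (0, 0, 0): 0.10,
    (0, 1, 0): 0.10,
    (1, 0, 0): 0.05,
    (1, 1, 0): 0.05,
}


def mpe(joint, e):
    """Most probable assignment (a, b) of ALL non-evidence variables given E = e."""
    # Dividing by Pr(E = e) would not change the argmax, so it is skipped.
    return max(
        product((0, 1), repeat=2),
        key=lambda ab: joint[(ab[0], ab[1], e)],
    )


def map_a(joint, e):
    """Most probable value of A alone given E = e, with B summed out."""
    def score(a):
        return sum(joint[(a, b, e)] for b in (0, 1))
    return max((0, 1), key=score)


print("MPE (a, b):", mpe(JOINT, 1))
print("MAP a:", map_a(JOINT, 1))
```

With these invented numbers, the MPE assignment sets A to 1, while the MAP query on A alone returns 0, because the mass for A = 0 is split across two values of B. Exhaustive enumeration like this is only feasible for toy cases; it says nothing about the algorithms proposed in the paper.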
New Results for the MAP Problem in Bayesian Networks
This paper presents new results for the (partial) maximum a posteriori (MAP)
problem in Bayesian networks, which is the problem of querying the most
probable state configuration of some of the network variables given evidence.
First, it is demonstrated that the problem remains hard even in networks with
very simple topology, such as binary polytrees and simple trees (including the
Naive Bayes structure). Such proofs extend previous complexity results for the
problem. Inapproximability results are also derived in the case of trees if the
number of states per variable is not bounded. Although the problem is shown to
be hard and inapproximable even in very simple scenarios, a new exact algorithm
is described that is empirically fast in networks of bounded treewidth and
bounded number of states per variable. The same algorithm is used as the basis
of a Fully Polynomial Time Approximation Scheme for MAP under such assumptions.
Approximation schemes were generally thought to be impossible for this problem,
but we show otherwise for classes of networks that are important in practice.
The algorithms are extensively tested using some well-known networks as well as
randomly generated cases to show their effectiveness.
Comment: A couple of typos were fixed, as well as the notation in part of
section 4, which was misleading. Theoretical and empirical results have not
changed.
Stochastic Local Search Heuristics for Efficient Feature Selection: An Experimental Study
Feature engineering, including feature selection, plays a key role in data science, knowledge discovery, machine learning, and statistics. Recently, much progress has been made in increasing the accuracy of machine learning for complex problems. In part, this is due to improvements in feature engineering, for example by means of deep learning or feature selection. This progress has, to a large extent, come at the cost of dramatic and perhaps unsustainable increases in the computational resources used. Consequently, there is now a need to emphasize not only accuracy but also computational cost in research on and applications of machine learning, including feature selection. With a focus on both the accuracy and the computational cost of feature selection, this paper studies stochastic local search (SLS) methods applied to feature selection. With an eye to containing computational cost, we consider an SLS method for efficient feature selection, SLS4FS. SLS4FS is an amalgamation of several heuristics, including filter and wrapper methods, controlled by hyperparameters. While SLS4FS admits, for certain hyperparameter settings, analysis by means of homogeneous Markov chains, our focus in this paper is on experiments with several real-world datasets. Our experimental study suggests that SLS4FS is competitive with several existing methods, and is useful in settings where one wants to control the computational cost.
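To make the idea of stochastic local search over feature subsets concrete, here is a generic sketch. It is not SLS4FS: the abstract does not specify that method's heuristics or hyperparameters, so the function `sls_feature_selection`, its parameters (`steps`, `noise`, `seed`), and the toy scoring function are all assumptions chosen for illustration. The search state is a boolean mask over features; each step is either a random flip (a noise step) or the single flip that yields the best score (a greedy step).

```python
import random


def sls_feature_selection(n_features, score, steps=200, noise=0.2, seed=0):
    """Generic SLS over feature masks; returns (best_mask, best_score)."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    best, best_score = list(current), score(current)

    for _ in range(steps):
        if rng.random() < noise:
            # Noise step: flip a uniformly random feature.
            i = rng.randrange(n_features)
        else:
            # Greedy step: flip the feature whose flip gives the highest score.
            def flipped_score(j):
                trial = list(current)
                trial[j] = not trial[j]
                return score(trial)
            i = max(range(n_features), key=flipped_score)

        current[i] = not current[i]
        s = score(current)
        if s > best_score:
            best, best_score = list(current), s

    return best, best_score


# Hypothetical score: reward two "useful" features, penalize subset size.
# In practice this would be a filter statistic or a wrapper (model) evaluation.
USEFUL = {0, 3}


def toy_score(mask):
    hits = sum(1 for i in USEFUL if mask[i])
    return hits - 0.1 * sum(mask)


best_mask, best_value = sls_feature_selection(6, toy_score)
print(best_mask, best_value)
```

The number of calls to `score` is bounded by roughly `steps * n_features`, which is one simple way to cap computational cost; whether SLS4FS controls cost this way is not stated in the abstract.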
Anytime Marginal MAP Inference
This paper presents a new anytime algorithm for the marginal MAP problem in
graphical models. The algorithm is described in detail, its complexity and
convergence rate are studied, and relations to previous theoretical results for
the problem are discussed. It is shown that the algorithm runs in
polynomial-time if the underlying graph of the model has bounded tree-width,
and that it provides guarantees to the lower and upper bounds obtained within a
fixed amount of computational resources. Experiments with both real and
synthetically generated models highlight its main characteristics and show that it
compares favorably against Park and Darwiche's systematic search, particularly
in the case of problems with many MAP variables and moderate tree-width.
Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012).