A simple polynomial time algorithm to approximate the permanent within a simply exponential factor
We present a simple randomized polynomial time algorithm to approximate the
mixed discriminant of positive semidefinite matrices within a simply
exponential factor. Consequently, the algorithm allows us to approximate in
randomized polynomial time the permanent of a given non-negative matrix within
a simply exponential factor. When applied to approximating the permanent, the
algorithm turns out to be a simple modification of the well-known
Godsil-Gutman estimator.
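The Godsil-Gutman construction mentioned above is easy to state: replace each entry A_ij of a non-negative matrix by eps_ij * sqrt(A_ij) with independent random signs eps_ij; the squared determinant of the resulting matrix is an unbiased estimator of the permanent. A minimal sketch (the matrix and sample count are illustrative, not from the paper):

```python
import itertools
import numpy as np

def exact_permanent(A):
    """Exact permanent by summing over all permutations (small n only)."""
    n = A.shape[0]
    return sum(
        np.prod([A[i, p[i]] for i in range(n)])
        for p in itertools.permutations(range(n))
    )

def godsil_gutman_estimate(A, num_samples=20000, rng=None):
    """Unbiased Godsil-Gutman estimator: E[det(B)^2] = perm(A),
    where B_ij = eps_ij * sqrt(A_ij) with iid random signs eps_ij."""
    rng = np.random.default_rng(rng)
    sqrtA = np.sqrt(A)
    estimates = []
    for _ in range(num_samples):
        signs = rng.choice([-1.0, 1.0], size=A.shape)
        estimates.append(np.linalg.det(signs * sqrtA) ** 2)
    return float(np.mean(estimates))

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(exact_permanent(A))                 # 1*4 + 2*3 = 10
print(godsil_gutman_estimate(A, rng=0))   # close to 10 on average
```

The estimator is unbiased but its variance can be exponential in the matrix size, which is why the paper's contribution is the approximation guarantee rather than the construction itself.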
A Geometric Variational Approach to Bayesian Inference
We propose a novel Riemannian geometric framework for variational inference
in Bayesian models based on the nonparametric Fisher-Rao metric on the manifold
of probability density functions. Under the square-root density representation,
the manifold can be identified with the positive orthant of the unit
hypersphere in L2, and the Fisher-Rao metric reduces to the standard L2 metric.
Exploiting such a Riemannian structure, we formulate the task of approximating
the posterior distribution as a variational problem on the hypersphere based on
the alpha-divergence. Compared to approaches based on the Kullback-Leibler
divergence, this provides a tighter lower bound on the marginal distribution,
as well as a corresponding upper bound that KL-based approaches do not
offer. We propose a novel
gradient-based algorithm for the variational problem based on Fréchet
derivative operators motivated by the geometry of the Hilbert sphere, and
examine its properties. Through simulations and real-data applications, we
demonstrate the utility of the proposed geometric framework and algorithm on
several Bayesian models.
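The square-root density representation the abstract relies on can be made concrete: mapping a density p to sqrt(p) sends it to the unit sphere in L2, where the Fisher-Rao distance is the arc length arccos of the L2 inner product. A minimal numerical sketch on a discretized grid (grid and example densities are illustrative, not from the paper):

```python
import numpy as np

def fisher_rao_distance(p, q, dx):
    """Fisher-Rao geodesic distance between two densities on a grid.
    Under the square-root map psi = sqrt(p), densities lie on the unit
    sphere in L2, and the distance is arccos(<sqrt(p), sqrt(q)>)."""
    inner = np.sum(np.sqrt(p * q)) * dx
    return float(np.arccos(np.clip(inner, -1.0, 1.0)))

x = np.linspace(-8, 8, 4001)
dx = x[1] - x[0]

def gaussian(mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

p = gaussian(0.0, 1.0)
q = gaussian(1.0, 1.0)
print(fisher_rao_distance(p, p, dx))  # ~0: identical densities
print(fisher_rao_distance(p, q, dx))  # positive, at most pi/2 for densities
```

For two unit-variance Gaussians the inner product is the Bhattacharyya coefficient exp(-(mu1 - mu2)^2 / 8), so the distance above is arccos(exp(-1/8)), roughly 0.49.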
New Results for the MAP Problem in Bayesian Networks
This paper presents new results for the (partial) maximum a posteriori (MAP)
problem in Bayesian networks, which is the problem of querying the most
probable state configuration of some of the network variables given evidence.
First, it is demonstrated that the problem remains hard even in networks with
very simple topology, such as binary polytrees and simple trees (including the
Naive Bayes structure). Such proofs extend previous complexity results for the
problem. Inapproximability results are also derived in the case of trees if the
number of states per variable is not bounded. Although the problem is shown to
be hard and inapproximable even in very simple scenarios, a new exact algorithm
is described that is empirically fast in networks of bounded treewidth and
bounded number of states per variable. The same algorithm is used as the basis of a
Fully Polynomial Time Approximation Scheme for MAP under such assumptions.
Approximation schemes were generally thought to be impossible for this problem,
but we show otherwise for classes of networks that are important in practice.
The algorithms are extensively tested using some well-known networks as well as
randomly generated cases to show their effectiveness.
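For intuition, (partial) MAP can be computed by brute-force enumeration in a tiny network: maximize over the query variable while summing out the unobserved non-query variables. A minimal sketch on a two-feature Naive Bayes with made-up CPTs (illustrative numbers, not from the paper, and exponential in the number of variables, unlike the treewidth-bounded algorithm described above):

```python
# A tiny Naive Bayes network: binary class C with binary features F1, F2.
# CPT entries below are illustrative, not taken from the paper.
p_C = {0: 0.6, 1: 0.4}
p_F_given_C = {
    "F1": {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}},  # p(F1=f | C=c)
    "F2": {0: {0: 0.5, 1: 0.5}, 1: {0: 0.1, 1: 0.9}},  # p(F2=f | C=c)
}

def joint(c, f1, f2):
    """Full joint p(C=c, F1=f1, F2=f2) under the Naive Bayes factorization."""
    return p_C[c] * p_F_given_C["F1"][c][f1] * p_F_given_C["F2"][c][f2]

def partial_map(query_vals, evidence):
    """Exact partial MAP by enumeration: maximize over the query variable C,
    summing out any unobserved non-query variable (here F2)."""
    best_val, best_score = None, -1.0
    for c in query_vals:                 # maximized (MAP) variable
        score = 0.0
        for f2 in (0, 1):                # F2 unobserved -> summed out
            if "F2" in evidence and f2 != evidence["F2"]:
                continue
            score += joint(c, evidence["F1"], f2)
        if score > best_score:
            best_val, best_score = c, score
    return best_val, best_score

print(partial_map((0, 1), {"F1": 1}))  # (1, 0.28): C=1 maximizes p(C, F1=1)
```

The mix of a max over some variables and a sum over the rest is exactly what makes partial MAP harder than either pure inference task, as the hardness results above show even for these simple topologies.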
Revisiting Kernelized Locality-Sensitive Hashing for Improved Large-Scale Image Retrieval
We present a simple but powerful reinterpretation of kernelized
locality-sensitive hashing (KLSH), a general and popular method developed in
the vision community for performing approximate nearest-neighbor searches in an
arbitrary reproducing kernel Hilbert space (RKHS). Our new perspective is based
on viewing the steps of the KLSH algorithm in an appropriately projected space,
and has several key theoretical and practical benefits. First, it eliminates
the conceptual difficulties present in the existing motivation of KLSH.
Second, it yields the first formal retrieval performance
bounds for KLSH. Third, our analysis reveals two techniques for boosting the
empirical performance of KLSH. We evaluate these extensions on several
large-scale benchmark image retrieval data sets, and show that our analysis
leads to improved recall performance of at least 12%, and sometimes much
higher, over the standard KLSH method.
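KLSH builds on the classical random-hyperplane hashing scheme, in which each hash bit is the sign of a random projection and the collision probability decreases with the angle between vectors; KLSH's contribution is mimicking such projections inside an RKHS where vectors are only accessible through kernel evaluations. A minimal sketch of the underlying non-kernelized scheme, with illustrative dimensions:

```python
import numpy as np

def lsh_hash(X, W):
    """Random-hyperplane LSH: each bit is the sign of a random projection.
    For two vectors at angle theta, each bit agrees with probability
    1 - theta/pi, so similar vectors get similar binary codes."""
    return (X @ W.T > 0).astype(np.uint8)

rng = np.random.default_rng(0)
dim, num_bits = 64, 16
W = rng.standard_normal((num_bits, dim))     # random hyperplane normals

x = rng.standard_normal(dim)
near = x + 0.05 * rng.standard_normal(dim)   # a close neighbor of x
far = rng.standard_normal(dim)               # an unrelated vector

hx, hn, hf = (lsh_hash(v[None, :], W)[0] for v in (x, near, far))
# The near neighbor typically differs from x in far fewer bits than
# the unrelated vector does.
print(int(np.sum(hx != hn)), int(np.sum(hx != hf)))
```

Candidate neighbors are then retrieved by comparing short binary codes (e.g. by Hamming distance) instead of exhaustively computing distances in the original space.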
Semi-automatic selection of summary statistics for ABC model choice
A central statistical goal is to choose between alternative explanatory
models of data. In many modern applications, such as population genetics, it is
not possible to apply standard methods based on evaluating the likelihood
functions of the models, as these are numerically intractable. Approximate
Bayesian computation (ABC) is a commonly used alternative for such situations.
ABC simulates data x for many parameter values under each model and compares
them to the observed data x_obs. More weight is placed on models under which
S(x) is close to S(x_obs), where S maps data to a vector of summary statistics.
Previous work has shown the choice of S is crucial to the efficiency and
accuracy of ABC. This paper provides a method to select good summary statistics
for model choice. It uses a preliminary step, simulating many x values from all
models and fitting regressions to these simulations with the model indicator as
the response. The resulting model weight estimators are then used as S in an
ABC analysis. Theoretical
results are given to justify this as approximating low dimensional sufficient
statistics. A substantive application is presented: choosing between competing
coalescent models of demographic growth for Campylobacter jejuni in New Zealand
using multi-locus sequence typing data.
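The ABC model-choice loop itself is short: simulate from each model under a uniform model prior, keep simulations whose summary statistic lands within a tolerance of the observed one, and read off model probabilities from acceptance counts. A minimal rejection-ABC sketch with toy Gaussian models (illustrative, not the paper's coalescent models or its regression-based summary construction):

```python
import numpy as np

def abc_model_choice(x_obs, simulators, summary, eps, n_sims, rng=None):
    """Rejection-ABC model choice: draw a model uniformly, simulate data,
    accept if the summary lies within eps of the observed summary, and
    estimate posterior model probabilities from acceptance counts."""
    rng = np.random.default_rng(rng)
    s_obs = summary(x_obs)
    accepted = np.zeros(len(simulators))
    for _ in range(n_sims):
        m = rng.integers(len(simulators))      # uniform model prior
        x = simulators[m](rng)
        if abs(summary(x) - s_obs) < eps:
            accepted[m] += 1
    return accepted / accepted.sum()

# Toy competing models (illustrative): same variance, different means.
n = 50
sims = [
    lambda rng: rng.normal(0.0, 1.0, size=n),  # model 0
    lambda rng: rng.normal(1.0, 1.0, size=n),  # model 1
]
rng = np.random.default_rng(1)
x_obs = rng.normal(1.0, 1.0, size=n)           # data truly from model 1
weights = abc_model_choice(x_obs, sims, np.mean, eps=0.2, n_sims=5000, rng=2)
print(weights)  # model 1 should receive most of the posterior weight
```

The sample mean is a good summary here by construction; the point of the paper is how to build such low-dimensional, near-sufficient summaries automatically via the preliminary regression step when no obvious choice exists.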