501 research outputs found
Worst-case Optimal Submodular Extensions for Marginal Estimation
Submodular extensions of an energy function can be used to efficiently
compute approximate marginals via variational inference. The accuracy of the
marginals depends crucially on the quality of the submodular extension. To
identify the best possible extension, we show an equivalence between the
submodular extensions of the energy and the objective functions of linear
programming (LP) relaxations for the corresponding MAP estimation problem. This
allows us to (i) establish the worst-case optimality of the submodular
extension for the Potts model used in the literature; (ii) identify the worst-case
optimal submodular extension for the more general class of metric labeling; and
(iii) efficiently compute the marginals for the widely used dense CRF model
with the help of a recently proposed Gaussian filtering method. Using synthetic
and real data, we show that our approach provides comparable upper bounds on
the log-partition function to those obtained using tree-reweighted message
passing (TRW) in cases where the latter is computationally feasible.
Importantly, unlike TRW, our approach provides the first practical algorithm to
compute an upper bound on the dense CRF model. Comment: Accepted to AISTATS 201
Higher-order inference in conditional random fields using submodular functions
Higher-order and dense conditional random fields (CRFs) are expressive graphical
models which have been very successful in low-level computer vision applications
such as semantic segmentation and stereo matching. These models are able to
capture long-range interactions and higher-order image statistics much better
than pairwise CRFs. This expressive power comes at a price though - inference
problems in these models are computationally very demanding. This is a
particular challenge in computer vision, where fast inference is important and
the problem involves millions of pixels.
In this thesis, we look at how submodular functions can help us design
efficient inference methods for higher-order and dense CRFs. Submodular
functions are special discrete functions that have important properties from
an optimisation perspective, and are closely related to convex functions. We
use submodularity in a two-fold manner: (a) to design an efficient MAP inference
algorithm for a robust higher-order model that generalises the widely-used
truncated convex models, and (b) to glean insights into a recently proposed
variational inference algorithm which give us a principled approach for applying
it efficiently to higher-order and dense CRFs.
Near-Optimal Sensor Scheduling for Batch State Estimation: Complexity, Algorithms, and Limits
In this paper, we focus on batch state estimation for linear systems. This
problem is important in applications such as environmental field estimation,
robotic navigation, and target tracking. Its difficulty is that limited
operational resources among the sensors, e.g., shared communication bandwidth
or battery power, constrain the number of sensors that can be active at each
measurement step. As a result, sensor scheduling algorithms must be employed.
Notwithstanding, current sensor scheduling algorithms for batch state
estimation scale poorly with the system size and the time horizon. In addition,
current sensor scheduling algorithms for Kalman filtering, although they scale
better, provide no performance guarantees or approximation bounds for the
minimization of the batch state estimation error. In this paper, one of our
main contributions is to provide an algorithm that enjoys both the estimation
accuracy of the batch state scheduling algorithms and the low time complexity
of the Kalman filtering scheduling algorithms. In particular: 1) our algorithm
is near-optimal: it achieves a solution up to a multiplicative factor 1/2 from
the optimal solution, and this factor is close to the best approximation factor
1/e one can achieve in polynomial time for this problem; 2) our algorithm has
(polynomial) time complexity that is not only lower than that of the current
algorithms for batch state estimation; it is also lower than, or similar to,
that of the current algorithms for Kalman filtering. We achieve these results
by proving two properties for our batch state estimation error metric, which
quantifies the square error of the minimum variance linear estimator of the
batch state vector: a) it is supermodular in the choice of the sensors; b) it
has a sparsity pattern (it involves matrices that are block tri-diagonal) that
facilitates its evaluation at each sensor set. Comment: Correction of typos in proof
Submodularity in Action: From Machine Learning to Signal Processing Applications
Submodularity is a discrete domain functional property that can be
interpreted as mimicking the role of the well-known convexity/concavity
properties in the continuous domain. Submodular functions exhibit strong
structure that leads to efficient optimization algorithms with provable
near-optimality guarantees. These characteristics, namely, efficiency and
provable performance bounds, are of particular interest for signal processing
(SP) and machine learning (ML) practitioners as a variety of discrete
optimization problems are encountered in a wide range of applications.
Conventionally, two general approaches exist to solve discrete problems:
relaxation into the continuous domain to obtain an approximate solution, or
development of a tailored algorithm that applies directly in the
discrete domain. In both approaches, worst-case performance guarantees are
often hard to establish. Furthermore, they are often complex, thus not
practical for large-scale problems. In this paper, we show how certain
scenarios lend themselves to exploiting submodularity so as to construct
scalable solutions with provable worst-case performance guarantees. We
introduce a variety of submodular-friendly applications, and elucidate the
relation of submodularity to convexity and concavity which enables efficient
optimization. With a mixture of theory and practice, we present different
flavors of submodularity accompanying illustrative real-world case studies from
modern SP and ML. In all cases, optimization algorithms are presented, along
with hints on how optimality guarantees can be established.
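The kind of greedy algorithm with a provable guarantee that this abstract alludes to can be sketched as follows. This is a minimal illustration, not code from the paper: the helper names `coverage` and `greedy_max` and the toy data are assumptions. It uses the classical greedy rule for maximizing a monotone submodular function under a cardinality constraint, for which the well-known (1 - 1/e) approximation bound holds.

```python
from typing import Callable, Dict, Hashable, List, Set


def coverage(sets_by_item: Dict[Hashable, Set[int]]) -> Callable[[List[Hashable]], int]:
    """Build f(S) = number of elements covered by the items in S.

    Coverage functions are a standard example of monotone submodular functions.
    """
    def f(chosen: List[Hashable]) -> int:
        covered: Set[int] = set()
        for item in chosen:
            covered |= sets_by_item[item]
        return len(covered)
    return f


def greedy_max(f: Callable[[List[Hashable]], int],
               ground: List[Hashable], k: int) -> List[Hashable]:
    """Pick up to k items, each time adding the item with the largest marginal gain."""
    chosen: List[Hashable] = []
    for _ in range(k):
        base = f(chosen)
        best_item, best_gain = None, 0
        for item in ground:
            if item in chosen:
                continue
            gain = f(chosen + [item]) - base  # marginal gain of adding item
            if gain > best_gain:              # ties keep the earlier item
                best_item, best_gain = item, gain
        if best_item is None:                 # no item improves the objective
            break
        chosen.append(best_item)
    return chosen


# Hypothetical toy instance: four candidate items covering elements 1..7.
toy_sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
f_cov = coverage(toy_sets)
picked = greedy_max(f_cov, list(toy_sets), k=2)
print(picked, f_cov(picked))
```

Each round re-evaluates every remaining item, so this sketch makes O(nk) calls to f; lazy evaluation of marginal gains is a common speed-up that relies on the same diminishing-returns property.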
Sample Complexity Bounds for Influence Maximization
Influence maximization (IM) is the problem of finding, for a given s ≥ 1, a set S of |S|=s nodes in a network with maximum influence. With stochastic diffusion models, the influence of a set S of seed nodes is defined as the expectation of its reachability over simulations, where each simulation specifies a deterministic reachability function. Two well-studied special cases are the Independent Cascade (IC) and the Linear Threshold (LT) models of Kempe, Kleinberg, and Tardos [Kempe et al., 2003]. The influence function in stochastic diffusion is unbiasedly estimated by averaging reachability values over i.i.d. simulations. We study the IM sample complexity: the number of simulations needed to determine a (1-ε)-approximate maximizer with confidence 1-δ. Our main result is a surprising upper bound of O(s τ ε^{-2} ln(n/δ)) for a broad class of models that includes IC and LT models and their mixtures, where n is the number of nodes and τ is the number of diffusion steps. Generally τ ≪ n, so this significantly improves over the generic upper bound of O(s n ε^{-2} ln(n/δ)). Our sample complexity bounds are derived from novel upper bounds on the variance of the reachability that allow for small relative error for influential sets and additive error when influence is small. Moreover, we provide a data-adaptive method that can detect and utilize fewer simulations on models where it suffices. Finally, we provide an efficient greedy design that computes a (1-1/e-ε)-approximate maximizer from simulations and applies to any submodular stochastic diffusion model that satisfies the variance bounds.
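The estimator described in this abstract, averaging reachability over i.i.d. simulations, can be sketched for the Independent Cascade model as follows. This is an illustrative sketch only: the function names, the single uniform edge probability `p`, and the toy graph are assumptions, not details taken from the paper.

```python
import random
from collections import deque
from typing import Dict, Iterable, List


def simulate_ic(graph: Dict[int, List[int]], seeds: Iterable[int],
                p: float, rng: random.Random) -> int:
    """Run one Independent Cascade simulation and return the number of active nodes.

    Each newly activated node gets a single chance to activate each inactive
    out-neighbour, succeeding independently with probability p.
    """
    active = set(seeds)
    frontier = deque(active)
    while frontier:
        u = frontier.popleft()
        for v in graph.get(u, ()):
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return len(active)


def estimate_influence(graph: Dict[int, List[int]], seeds: Iterable[int],
                       p: float, num_sims: int, seed: int = 0) -> float:
    """Average reachability over num_sims independent simulations."""
    rng = random.Random(seed)
    seeds = list(seeds)
    total = sum(simulate_ic(graph, seeds, p, rng) for _ in range(num_sims))
    return total / num_sims


# Hypothetical toy graph: node 0 points to 1 and 2, which both point to 3.
toy_graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(estimate_influence(toy_graph, [0], p=0.5, num_sims=1000))
```

With p = 1 the estimate equals the number of nodes reachable from the seeds, and with p = 0 it equals the number of seeds, which gives two quick sanity checks. The sample complexity question in the abstract concerns how large `num_sims` must be for such estimates to identify a near-optimal seed set.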
Greedy Maximization Framework for Graph-based Influence Functions
The study of graph-based submodular maximization problems was initiated in a
seminal work of Kempe, Kleinberg, and Tardos (2003): An influence
function of subsets of nodes is defined by the graph structure and the aim is
to find subsets of seed nodes with (approximately) optimal tradeoff of size and
influence. Applications include viral marketing, monitoring, and active
learning of node labels. This powerful formulation was studied for
(generalized) coverage functions, where the influence of a seed set on a
node is the maximum utility of a seed item to the node, and for pairwise
utility based on reachability, distances, or reverse ranks.
We define a rich class of influence functions which unifies and extends
previous work beyond coverage functions and specific utility functions. We
present a meta-algorithm for approximate greedy maximization with strong
approximation quality guarantees and worst-case near-linear computation for all
functions in our class. Our meta-algorithm generalizes a recent design by Cohen
et al. (2014) that was specific to distance-based coverage functions. Comment: 8 pages, 1 figure
Structured sparsity-inducing norms through submodular functions
Sparse methods for supervised learning aim at finding good linear predictors
from as few variables as possible, i.e., with small cardinality of their
supports. This combinatorial selection problem is often turned into a convex
optimization problem by replacing the cardinality function by its convex
envelope (tightest convex lower bound), in this case the L1-norm. In this
paper, we investigate more general set-functions than the cardinality, that may
incorporate prior knowledge or structural constraints which are common in many
applications: namely, we show that for nondecreasing submodular set-functions,
the corresponding convex envelope can be obtained from its Lovász extension, a
common tool in submodular analysis. This defines a family of polyhedral norms,
for which we provide generic algorithmic tools (subgradients and proximal
operators) and theoretical results (conditions for support recovery or
high-dimensional inference). By selecting specific submodular functions, we can
give a new interpretation to known norms, such as those based on
rank-statistics or grouped norms with potentially overlapping groups; we also
define new norms, in particular ones that can be used as non-factorial priors
for supervised learning.
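As a small illustration of the construction this abstract describes, the sketch below evaluates the Lovász extension of a set function with the standard sorting formula and applies it to the absolute values of a vector. The function names are hypothetical, and the set function F is assumed to satisfy F(∅) = 0. With F equal to the cardinality, the result is the L1-norm, matching the special case mentioned in the abstract.

```python
from typing import Callable, FrozenSet, Sequence


def lovasz_extension(F: Callable[[FrozenSet[int]], float],
                     w: Sequence[float]) -> float:
    """Evaluate the Lovász extension of F at w.

    Sort the coordinates of w in decreasing order and weight each one by the
    marginal gain of F when its index is added to the growing prefix set.
    """
    order = sorted(range(len(w)), key=lambda i: w[i], reverse=True)
    value = 0.0
    prefix = []
    prev = F(frozenset())
    for i in order:
        prefix.append(i)
        cur = F(frozenset(prefix))
        value += w[i] * (cur - prev)
        prev = cur
    return value


def submodular_norm(F: Callable[[FrozenSet[int]], float],
                    w: Sequence[float]) -> float:
    """Apply the Lovász extension of F to the vector of absolute values |w|."""
    return lovasz_extension(F, [abs(x) for x in w])


w_example = [3.0, -1.0, 2.0]
print(submodular_norm(len, w_example))                       # cardinality
print(submodular_norm(lambda A: 1 if A else 0, w_example))   # 1 on nonempty sets
```

The second call uses the set function that equals 1 on every nonempty set, which yields the largest absolute entry of w. Other nondecreasing submodular choices of F give other polyhedral norms in the same way.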