Linear Programming with Inequality Constraints via Entropic Perturbation
A dual convex programming approach to solving linear programs with inequality constraints through entropic perturbation is derived. The amount of perturbation required depends on the desired accuracy of the optimum. The dual program contains only non-positivity constraints. An ϵ-optimal solution to the linear program can be obtained directly from the optimal solution of the dual program. Since cross-entropy minimization subject to linear inequality constraints is a special case of the perturbed linear program, the duality result is readily applicable. Many standard constrained optimization techniques can be specialized to solve the dual program. Such specializations, made possible by the simplicity of the constraints, significantly reduce the computational effort these methods usually incur. Immediate applications of the theory developed include an entropic path-following approach to solving linear semi-infinite programs with an infinite number of inequality constraints, as well as the widely used entropy optimization models with linear inequality and/or equality constraints.
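To make the perturbation idea concrete, here is a minimal numerical sketch (not the paper's dual algorithm): the linear objective is augmented with an entropy term μ Σ_j x_j log x_j, and the minimizer of the perturbed problem approaches an LP optimum as μ shrinks. The toy problem data, the solver choice (SciPy's SLSQP), and all names are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog, minimize

# Toy LP: minimize c @ x  subject to  A @ x <= b,  x >= 0.
c = np.array([1.0, 2.0])
A = np.array([[-1.0, -1.0]])          # encodes x1 + x2 >= 1
b = np.array([-1.0])

exact = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)

def solve_perturbed(mu):
    """Minimize c @ x + mu * sum(x_j * log x_j) over the same feasible set."""
    obj = lambda x: c @ x + mu * np.sum(x * np.log(x))
    cons = [{"type": "ineq", "fun": lambda x: b - A @ x}]  # A @ x <= b
    res = minimize(obj, np.full(2, 0.5), method="SLSQP",
                   bounds=[(1e-9, None)] * 2, constraints=cons)
    return res.x

for mu in (1.0, 0.1, 0.01):
    x = solve_perturbed(mu)
    print(mu, c @ x - exact.fun)  # optimality gap shrinks as mu -> 0
```

The shrinking gap is the sense in which the amount of perturbation depends on the desired accuracy: a smaller μ yields a tighter ϵ-optimal point at the cost of a more ill-conditioned subproblem.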
Bethe Projections for Non-Local Inference
Many inference problems in structured prediction are naturally solved by augmenting a tractable dependency structure with complex, non-local auxiliary objectives. This includes the mean field family of variational inference algorithms, soft- or hard-constrained inference using Lagrangian relaxation or linear programming, collective graphical models, and forms of semi-supervised learning such as posterior regularization. We present a method to discriminatively learn broad families of inference objectives, capturing powerful non-local statistics of the latent variables, while maintaining tractable and provably fast inference using non-Euclidean projected gradient descent with a distance-generating function given by the Bethe entropy. We demonstrate the performance and flexibility of our method by (1) extracting structured citations from research papers by learning soft global constraints, (2) achieving state-of-the-art results on a widely-used handwriting recognition task using a novel learned non-convex inference procedure, and (3) providing a fast and highly scalable algorithm for the challenging problem of inference in a collective graphical model applied to bird migration.
Comment: minor bug fix to appendix; appeared in UAI 201
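In its simplest special case, the non-Euclidean projected gradient scheme the abstract describes reduces to entropic mirror descent on the probability simplex, where the Shannon entropy plays the role the Bethe entropy plays over local marginals in the paper. A hedged sketch of that special case, with a toy objective and all names assumed:

```python
import numpy as np

def entropic_mirror_descent(grad, x0, eta=0.1, steps=200):
    # Non-Euclidean projected gradient with the (negative Shannon) entropy
    # as distance-generating function: the update is multiplicative, and
    # "projection" onto the simplex is a single renormalization.
    x = x0.copy()
    for _ in range(steps):
        x = x * np.exp(-eta * grad(x))
        x /= x.sum()
    return x

# Toy objective: minimize ||x - q||^2 over the probability simplex.
q = np.array([0.1, 0.5, 3.0])
grad = lambda x: 2.0 * (x - q)
x = entropic_mirror_descent(grad, np.full(3, 1.0 / 3.0))
print(x, x.sum())  # mass concentrates on the coordinate where q is largest
```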
Cluster Variation Method in Statistical Physics and Probabilistic Graphical Models
The cluster variation method (CVM) is a hierarchy of approximate variational techniques for discrete (Ising-like) models in equilibrium statistical mechanics, improving on the mean-field approximation and the Bethe-Peierls approximation, which can be regarded as the lowest level of the CVM. In recent years it has been applied both in statistical physics and to inference and optimization problems formulated in terms of probabilistic graphical models. The foundations of the CVM are briefly reviewed, and the relations with similar techniques are discussed. The main properties of the method are considered, with emphasis on its exactness for particular models and on its asymptotic properties.
The problem of the minimization of the variational free energy, which arises in the CVM, is also addressed, and recent results about both provably convergent and message-passing algorithms are discussed.
Comment: 36 pages, 17 figures
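As a point of reference for the hierarchy, the mean-field approximation that the CVM improves upon already illustrates variational free energy minimization: stationarity of the mean-field free energy gives the self-consistency equations m_i = tanh(β(Σ_j J_ij m_j + h_i)). A small sketch of that baseline (the CVM itself adds cluster corrections beyond this; the example model and all names are assumptions):

```python
import numpy as np

def mean_field_ising(J, h, beta, iters=500, damping=0.5):
    # Damped fixed-point iteration for the naive mean-field equations
    # m_i = tanh(beta * (sum_j J_ij m_j + h_i)), the stationarity
    # condition of the mean-field variational free energy.
    m = np.zeros(len(h))
    for _ in range(iters):
        m_new = np.tanh(beta * (J @ m + h))
        m = damping * m + (1.0 - damping) * m_new
    return m

# Ferromagnetic 4-spin ring with a weak uniform field.
J = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
h = np.full(4, 0.05)
print(mean_field_ising(J, h, beta=0.5))  # per-spin magnetizations
```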
Entropy balancing is doubly robust
Covariate balance is a conventional key diagnostic for methods that estimate causal effects from observational studies. Recently, there has been emerging interest in directly incorporating covariate balance into the estimation. We study a recently proposed entropy maximization method called Entropy Balancing (EB), which exactly matches the covariate moments of the different experimental groups in its optimization problem. We show that EB is doubly robust with respect to linear outcome regression and logistic propensity score regression, and that it reaches the asymptotic semiparametric variance bound when both regressions are correctly specified. This is surprising because the original proposal of EB makes no attempt to model the outcome or the treatment assignment. Our theoretical results and simulations suggest that EB is a very appealing alternative to the conventional weighting estimators that estimate the propensity score by maximum likelihood.
Comment: 23 pages, 6 figures, Journal of Causal Inference 201
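For intuition, the EB optimization described above can be sketched through its dual: maximizing the entropy of the weights subject to exact moment matching yields exponential-tilt weights w_i ~ exp(X_i @ lam), with lam found by an unconstrained convex minimization. A minimal illustration on synthetic data (the data and all names are assumptions, not the authors' code):

```python
import numpy as np
from scipy.optimize import minimize

def entropy_balance(X_ctrl, target):
    # Dual of entropy balancing: weights w_i ~ exp(X_i @ lam), with lam
    # chosen so the reweighted control moments equal `target` exactly.
    def dual(lam):
        return np.log(np.mean(np.exp(X_ctrl @ lam))) - target @ lam
    lam = minimize(dual, np.zeros(X_ctrl.shape[1])).x
    w = np.exp(X_ctrl @ lam)
    return w / w.sum()

rng = np.random.default_rng(0)
X_ctrl = rng.normal(size=(500, 2))   # synthetic control-group covariates
target = np.array([0.3, -0.2])       # treated-group covariate means
w = entropy_balance(X_ctrl, target)
print(w @ X_ctrl)                    # matches `target` up to solver tolerance
```

The exact moment matching visible in the last line is the balance property the paper shows to imply double robustness.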