Global optimization algorithms for image registration and clustering
Global optimization is a classical problem of finding the minimum or maximum value of an objective function. It has applications in many areas, such as biological image analysis, chemistry, mechanical engineering, financial analysis, deep learning and image processing. For practical applications, it is important to understand the efficiency of global optimization algorithms. This dissertation develops and analyzes some new global optimization algorithms and applies them to practical problems, mainly for image registration and data clustering.
First, the dissertation presents a new global optimization algorithm which approximates the optimum using only function values. The basic idea is to use the points at which the function has been evaluated to decompose the domain into a collection of hyper-rectangles. At each step, the algorithm chooses a hyper-rectangle according to a certain criterion and evaluates the function next at the center of that hyper-rectangle. The dissertation includes a proof that the algorithm converges to the global optimum as the number of function evaluations goes to infinity, and also establishes the convergence rate. Standard test functions are used to experimentally evaluate the algorithm.
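As an illustration of the partition idea described above, here is a minimal sketch in Python. It is not the dissertation's algorithm: the selection score (rectangle volume divided by the gap between the center value and the best value found so far), the trisection rule, and the names `partition_minimize` and `gap` are all assumptions made for this example.

```python
import numpy as np

def partition_minimize(f, lower, upper, n_evals=200, gap=1e-3):
    """Derivative-free search that refines a partition of a box into rectangles."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    center = (lower + upper) / 2
    # Each rectangle is stored as (lower corner, upper corner, f at its center).
    rects = [(lower, upper, f(center))]
    best_x, best_f = center, rects[0][2]
    evals = 1
    while evals + 2 <= n_evals:
        # Hypothetical criterion: favor large rectangles with low center values.
        scores = [np.prod(hi - lo) / (fc - best_f + gap) for lo, hi, fc in rects]
        lo, hi, fc = rects.pop(int(np.argmax(scores)))
        # Trisect along the longest side; the middle child keeps the old center.
        k = int(np.argmax(hi - lo))
        third = (hi[k] - lo[k]) / 3
        for j in range(3):
            c_lo, c_hi = lo.copy(), hi.copy()
            c_lo[k] = lo[k] + j * third
            c_hi[k] = lo[k] + (j + 1) * third
            if j == 1:
                child_f = fc  # same center as the parent, so reuse its value
            else:
                x = (c_lo + c_hi) / 2
                child_f = f(x)
                evals += 1
                if child_f < best_f:
                    best_x, best_f = x, child_f
            rects.append((c_lo, c_hi, child_f))
    return best_x, best_f
```

Trisecting (rather than bisecting) means the parent's center is also the center of the middle child, so every stored function value stays attached to a rectangle center and no evaluation is wasted.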
The second part focuses on applying algorithms from the first part to solve some practical problems. Image processing tasks often require optimizing over some set of parameters. In the image registration problem, one attempts to determine the best transformation for aligning similar images. Such problems typically require minimizing a dissimilarity measure with multiple local minima. The dissertation describes a global optimization algorithm and applies it to the problem of identifying the best transformation for aligning two images.
Global optimization algorithms can also be applied to the data clustering problem. The basic purpose of clustering is to categorize data into groups by similarity. The objective cost functions for clustering are usually non-convex. k-means is a popular algorithm that finds local optima quickly but may not reach a global optimum, and different starting points for k-means can yield different local optima. This dissertation describes a global optimization algorithm for approximating the global minimum of the clustering problem.
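The sensitivity of k-means to its starting point can be seen with a short sketch. The code below runs plain Lloyd iterations from several random starts and keeps the lowest-cost result. This multi-start heuristic is a common baseline, not the global optimization algorithm of the dissertation, and the function names are chosen only for this example.

```python
import numpy as np

def lloyd_kmeans(data, k, rng, n_iter=100):
    """Plain Lloyd iterations from a random start; returns (centers, cost)."""
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest center (squared Euclidean distance).
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points; keep it if its cluster is empty.
        new_centers = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, float(dists.min(axis=1).sum())

def restart_kmeans(data, k, n_starts=20, seed=0):
    """Run Lloyd's algorithm from several starts and keep the lowest-cost run."""
    rng = np.random.default_rng(seed)
    runs = [lloyd_kmeans(data, k, rng) for _ in range(n_starts)]
    costs = [cost for _, cost in runs]
    best_centers, best_cost = min(runs, key=lambda run: run[1])
    return best_centers, best_cost, costs
```

Inspecting the returned `costs` list shows whether the starts converged to different local optima. Even the best of many restarts carries no guarantee of being the global minimum.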
The third part of the dissertation presents variations of the proposed algorithm that work with different assumptions on the available information, including a version that uses derivatives
Universal Consistency of Decision Trees in High Dimensions
This paper shows that decision trees constructed with Classification and
Regression Trees (CART) methodology are universally consistent in an additive
model context, even when the number of predictor variables scales exponentially
with the sample size, under certain $\ell_1$-norm sparsity constraints. The
consistency is universal in the sense that there are no a priori assumptions on
the distribution of the predictor variables. Amazingly, this adaptivity to
(approximate or exact) sparsity is achieved with a single tree, as opposed to
what might be expected for an ensemble. Finally, we show that these qualitative
properties of individual trees are inherited by Breiman's random forests.
Another surprise is that consistency holds even when the "mtry" tuning
parameter vanishes as a fraction of the number of predictor variables, thus
speeding up computation of the forest. A key step in the analysis is the
establishment of an oracle inequality, which precisely characterizes the
goodness-of-fit and complexity tradeoff for a misspecified model
Solving, Estimating and Selecting Nonlinear Dynamic Economic Models without the Curse of Dimensionality
A welfare analysis of a risky policy is impossible within a linear or linearized model and its certainty equivalence property. The presented algorithms are designed as a toolbox for a general model class. The computational challenges are considerable and I concentrate on the numerics and statistics for a simple model of dynamic consumption and labor choice. I calculate the optimal policy and estimate the posterior density of structural parameters and the marginal likelihood within a nonlinear state space model. My approach, even in an interpreted language, is twenty times faster than the only alternative compiled approach. The model is estimated on simulated data in order to test the routines against known true parameters. The policy function is approximated by Smolyak Chebyshev polynomials and the rational expectation integral by Smolyak Gaussian quadrature. The Smolyak operator is used to extend univariate approximation and integration operators to many dimensions. It reduces the curse of dimensionality from exponential to polynomial growth. The likelihood integrals are evaluated by a Gaussian quadrature and Gaussian quadrature particle filter. The bootstrap or sequential importance resampling particle filter is used as an accuracy benchmark. The posterior is estimated by the Gaussian filter and a Metropolis-Hastings algorithm. I propose a genetic extension of the standard Metropolis-Hastings algorithm by parallel random walk sequences. This improves the robustness of start values and the global maximization properties. Moreover, it simplifies a cluster implementation, and the random walk variances decision is reduced to only two parameters so that almost no trial sequences are needed.
Finally, the marginal likelihood is calculated as a criterion for nonnested and quasi-true models in order to select between the nonlinear estimates and a first order perturbation solution combined with the Kalman filter.
Keywords: stochastic dynamic general equilibrium model, Chebyshev polynomials, Smolyak operator, nonlinear state space filter, Curse of Dimensionality, posterior of structural parameters, marginal likelihood
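The univariate building block behind the Gaussian quadrature mentioned in this abstract can be sketched as follows. This is only the one-dimensional Gauss-Hermite rule for an expectation under a normal distribution. The Smolyak extension to many dimensions is not shown, and the function name and the default of 10 nodes are assumptions for this example.

```python
import numpy as np

def gauss_hermite_expectation(g, mean, std, n_nodes=10):
    """Approximate E[g(X)] for X ~ N(mean, std^2) with Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # The rule integrates against exp(-t^2). Substituting x = mean + sqrt(2)*std*t
    # turns that weight into the normal density up to a factor of 1/sqrt(pi).
    x = mean + np.sqrt(2.0) * std * nodes
    return float(np.sum(weights * g(x)) / np.sqrt(np.pi))
```

For example, `gauss_hermite_expectation(lambda x: x ** 2, 1.0, 2.0)` approximates the second moment of a N(1, 4) variable, which is 5, and the rule is exact for polynomials of degree up to `2 * n_nodes - 1`.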
MCMC-driven learning
This paper is intended to appear as a chapter for the Handbook of Markov
Chain Monte Carlo. The goal of this chapter is to unify various problems at the
intersection of Markov chain Monte Carlo (MCMC) and machine learning
(including black-box variational inference, adaptive MCMC, normalizing flow
construction and transport-assisted MCMC, surrogate-likelihood MCMC, coreset
construction for MCMC with big data, Markov chain gradient descent, Markovian
score climbing, and more) within one common framework. By doing so, the theory and
methods developed for each may be translated and generalized
SLOPE - Adaptive variable selection via convex optimization
We introduce a new estimator for the vector of coefficients $\beta$ in the
linear model $y = X\beta + z$, where $X$ has dimensions $n \times p$ with $p$
possibly larger than $n$. SLOPE, short for Sorted L-One Penalized Estimation,
is the solution to
\[\min_{b \in \mathbb{R}^p} \frac{1}{2}\|y - Xb\|_{\ell_2}^2 + \lambda_1 |b|_{(1)} + \lambda_2 |b|_{(2)} + \cdots + \lambda_p |b|_{(p)},\]
where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$ and
$|b|_{(1)} \ge |b|_{(2)} \ge \cdots \ge |b|_{(p)}$ are the
decreasing absolute values of the entries of $b$. This is a convex program and
we demonstrate a solution algorithm whose computational complexity is roughly
comparable to that of classical $\ell_1$ procedures such as the Lasso. Here,
the regularizer is a sorted $\ell_1$ norm, which penalizes the regression
coefficients according to their rank: the higher the rank (that is, the stronger
the signal), the larger the penalty. This is similar to the Benjamini and
Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] procedure (BH), which
compares more significant $p$-values with more stringent thresholds. One
notable choice of the sequence $\{\lambda_i\}$ is given by the BH critical
values $\lambda_{\mathrm{BH}}(i) = z(1 - i \cdot q/(2p))$, where $q \in (0,1)$ and
$z(\alpha)$ is the quantile of a standard normal distribution. SLOPE aims to
provide finite sample guarantees on the selected model; of special interest is
the false discovery rate (FDR), defined as the expected proportion of
irrelevant regressors among all selected predictors. Under orthogonal designs,
SLOPE with $\lambda_{\mathrm{BH}}$ provably controls FDR at level $q$.
Moreover, it also appears to have appreciable inferential properties under more
general designs $X$ while having substantial power, as demonstrated in a series
of experiments running on both simulated and real data.
Comment: Published at http://dx.doi.org/10.1214/15-AOAS842 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
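To make the SLOPE penalty concrete, the sketch below computes the BH critical values, taken here as z(1 - i*q/(2p)) for i = 1, ..., p with z the standard normal quantile function, and evaluates the sorted L1 norm of a coefficient vector. It covers only the penalty term, not a solver for the full convex program, and the function names are chosen for this example.

```python
from statistics import NormalDist

def bh_lambdas(p, q):
    """BH critical values lambda_i = z(1 - i*q/(2p)) for i = 1..p."""
    z = NormalDist().inv_cdf  # quantile function of the standard normal
    return [z(1 - i * q / (2 * p)) for i in range(1, p + 1)]

def sorted_l1_norm(b, lambdas):
    """Sum of lambda_i * |b|_(i), where |b|_(1) >= |b|_(2) >= ... ."""
    magnitudes = sorted((abs(v) for v in b), reverse=True)
    return sum(lam * m for lam, m in zip(lambdas, magnitudes))
```

When all weights are equal the sorted L1 norm reduces to a multiple of the ordinary L1 norm, which is why the Lasso appears as a special case of this penalty.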
ESTIMATION-BASED SOLUTIONS TO INCOMPLETE INFORMATION PURSUIT-EVASION GAMES
Differential games are a useful tool both for modeling conflict between autonomous systems and for synthesizing robust control solutions. The traditional study of games has assumed decision agents possess complete information about one another’s strategies and numerical weights. This dissertation relaxes this assumption. Instead, uncertainty in the opponent’s strategy is treated as a symptom of the inevitable gap between modeling assumptions and applications. By combining nonlinear estimation approaches with problem domain knowledge, procedures are developed for acting under uncertainty using established methods that are suitable for applications on embedded systems. The dissertation begins by using nonlinear estimation to account for parametric uncertainty in an opponent’s strategy. A solution is proposed for engagements in which both players use this approach simultaneously. This method is demonstrated on a numerical example of an orbital pursuit-evasion game, and the findings motivate additional developments. First, the solutions of the governing Riccati differential equations are approximated, using automatic differentiation to obtain high-degree Taylor series approximations. Second, constrained estimation is introduced to prevent estimator failures in near-singular engagements. Numerical conditions for nonsingularity are approximated using Chebyshev polynomial basis functions, and applied as constraints to a state estimate. Third and finally, multiple model estimation is suggested as a practical solution for time-critical engagements in which the form of the opponent’s strategy is uncertain. Deceptive opponent strategies are identified as a candidate approach to use against an adaptive player, and a procedure for designing such strategies is proposed. The new developments are demonstrated in a missile interception pursuit-evasion game in which the evader selects from a set of candidate strategies with unknown weights