
    Maximum likelihood estimation of a multivariate log-concave density

    Density estimation is a fundamental statistical problem. Many methods are either sensitive to model misspecification (parametric models) or difficult to calibrate, especially for multivariate data (nonparametric smoothing methods). We propose an alternative approach using maximum likelihood under a qualitative assumption on the shape of the density, specifically log-concavity. The class of log-concave densities includes many common parametric families and has desirable properties. For univariate data, these estimators are relatively well understood, and are gaining in popularity in theory and practice. We discuss extensions for multivariate data, which require different techniques. After establishing existence and uniqueness of the log-concave maximum likelihood estimator for multivariate data, we see that a reformulation allows us to compute it using standard convex optimization techniques. Unlike kernel density estimation, or other nonparametric smoothing methods, this is a fully automatic procedure, and no additional tuning parameters are required. Since the assumption of log-concavity is non-trivial, we introduce a method for assessing the suitability of this shape constraint and apply it to several simulated datasets and one real dataset. Density estimation is often one stage in a more complicated statistical procedure. With this in mind, we show how the estimator may be used for plug-in estimation of statistical functionals. A second important extension is the use of log-concave components in mixture models. We illustrate how we may use an EM-style algorithm to fit mixture models where the number of components is known. Applications to visualization and classification are presented. In the latter case, improvement over a Gaussian mixture model is demonstrated. Performance for density estimation is evaluated in two ways. Firstly, we consider Hellinger convergence (the usual metric of theoretical convergence results for nonparametric maximum likelihood estimators). We prove consistency with respect to this metric and heuristically discuss rates of convergence and model misspecification, supported by empirical investigation. Secondly, we use the mean integrated squared error to demonstrate favourable performance compared with kernel density estimates using a variety of bandwidth selectors, including sophisticated adaptive methods. Throughout, we emphasise the development of stable numerical procedures able to handle the additional complexity of multivariate data.
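
    A minimal sketch of the convex-optimization route in one dimension (the thesis treats the multivariate case): discretize the log-density on a grid, maximize the average log-likelihood minus the integral of the exponentiated log-density (a standard device that makes the optimum integrate to one), and impose concavity through second differences. The grid, the nearest-grid-point placement of observations and the Riemann-sum integral are simplifications of this sketch, not the thesis's algorithm.

```python
import numpy as np
import cvxpy as cp

# toy data from a log-concave density (standard normal)
rng = np.random.default_rng(1)
x = np.sort(rng.normal(size=200))

# grid values of the log-density phi
g = np.linspace(x[0] - 1, x[-1] + 1, 300)
h = g[1] - g[0]
phi = cp.Variable(g.size)

# place each observation at its nearest grid point (crude but simple)
idx = np.searchsorted(g, x)

# maximize mean log-likelihood minus the Riemann sum of exp(phi);
# the penalized-integral form drives the maximizer to integrate to ~1
objective = cp.sum(phi[idx]) / x.size - h * cp.sum(cp.exp(phi))

# log-concavity: nonpositive second differences of phi
constraints = [phi[2:] - 2 * phi[1:-1] + phi[:-2] <= 0]

cp.Problem(cp.Maximize(objective), constraints).solve()
f_hat = np.exp(phi.value)   # estimated density on the grid
```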

    Spacecraft Trajectory Optimization Suite (STOpS): Optimization of Multiple Gravity Assist Spacecraft Trajectories Using Modern Optimization Techniques

    In trajectory optimization, a common objective is to minimize propellant mass via multiple gravity assist (MGA) maneuvers. Some computer programs have been developed to analyze MGA trajectories. One of these programs, Parallel Global Multiobjective Optimization (PaGMO), uses an interesting technique known as the Island Model Paradigm. This work provides the community with a MATLAB optimizer, STOpS, that utilizes this same Island Model Paradigm with five different optimization algorithms. STOpS allows optimization of a weighted combination of many parameters. This work contains a study on optimization algorithm performance and how each algorithm is affected by its available settings. STOpS successfully found optimal trajectories for the Mariner 10 mission and the Voyager 2 mission that were similar to the actual missions flown. STOpS did not necessarily find better trajectories than those actually flown, but instead demonstrated the capability to quickly and successfully analyze/plan trajectories. The analysis for each of these missions took 2-3 days. The final program is a robust tool that has taken existing techniques and applied them to the specific problem of trajectory optimization, so it can repeatedly and reliably solve these types of problems.
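
    To make the Island Model Paradigm concrete, here is a minimal, hypothetical Python sketch (STOpS itself is a MATLAB suite): several populations evolve independently under a toy mutation-and-selection loop, and the islands periodically exchange their best members. The cost function and every parameter below are placeholders, not trajectory costs or STOpS settings.

```python
import numpy as np

rng = np.random.default_rng(2)

def cost(v):
    # toy objective standing in for, e.g., total delta-v of a trajectory
    return float(np.sum(v ** 2))

def evolve(pop, n_steps=20, sigma=0.1):
    """One island: mutate every member, keep the fittest half."""
    for _ in range(n_steps):
        children = pop + rng.normal(0.0, sigma, pop.shape)
        both = np.vstack([pop, children])
        order = np.argsort([cost(v) for v in both])
        pop = both[order[: len(pop)]]
    return pop

# four islands of 20 candidate solutions in a 3-parameter search space
islands = [rng.uniform(-5, 5, (20, 3)) for _ in range(4)]

for epoch in range(10):
    islands = [evolve(pop) for pop in islands]
    # migration: each island's best replaces the next island's worst
    for i, pop in enumerate(islands):
        islands[(i + 1) % len(islands)][-1] = pop[0]

best = min((pop[0] for pop in islands), key=cost)
```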

    Energy Minimization

    The energetic state of a protein is one of the most important representative parameters of its stability. The energy of a protein can be defined as a function of its atomic coordinates. This energy function consists of several components: 1. bond and angle energy, representing the covalent bonds and bond angles; 2. dihedral energy, due to the dihedral angles; 3. a van der Waals term (also called the Lennard-Jones potential) to ensure that atoms do not have steric clashes; 4. electrostatic energy accounting for Coulomb's law in protein structure, i.e. the long-range forces between charged and partially charged atoms. All these quantitative terms have been parameterized and are collectively referred to as the 'force field', e.g. CHARMM, AMBER, OPLS and GROMOS. The goal of energy minimization is to find a set of coordinates representing the minimum energy conformation for the given structure. Various algorithms have been formulated by varying the use of derivatives. Three common algorithms used for this optimization are steepest descent, conjugate gradient and Newton–Raphson. Although energy minimization is a tool to achieve the nearest local minimum, it is also an indispensable tool in correcting structural anomalies, viz. bad stereochemistry and short contacts. An efficient optimization protocol could be devised from these methods in conjunction with a larger space exploration algorithm, e.g. molecular dynamics.
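
    As a toy illustration of the steepest-descent option, the sketch below minimizes only a Lennard-Jones term for a five-atom cluster with a fixed step size; a real force field adds the bond, angle, dihedral and electrostatic terms listed above, and production minimizers use line searches, conjugate gradients or Newton–Raphson steps. All parameters here are illustrative.

```python
import numpy as np

def lj_energy_grad(pos, eps=1.0, sigma=1.0):
    """Total 12-6 Lennard-Jones energy and its Cartesian gradient."""
    n = len(pos)
    energy, grad = 0.0, np.zeros_like(pos)
    for i in range(n):
        for j in range(i + 1, n):
            r_vec = pos[i] - pos[j]
            r = np.linalg.norm(r_vec)
            sr6 = (sigma / r) ** 6
            energy += 4 * eps * (sr6 ** 2 - sr6)
            dE_dr = 4 * eps * (-12 * sr6 ** 2 + 6 * sr6) / r
            g = dE_dr * r_vec / r          # chain rule to coordinates
            grad[i] += g
            grad[j] -= g
    return energy, grad

# five atoms near (but not at) a low-energy arrangement
rng = np.random.default_rng(3)
pos = np.array([[0, 0, 0], [1.1, 0, 0], [0, 1.1, 0],
                [0, 0, 1.1], [1.1, 1.1, 0]], dtype=float)
pos += 0.05 * rng.normal(size=pos.shape)

step = 1e-3                                # fixed step size
for _ in range(2000):                      # steepest-descent iterations
    e, g = lj_energy_grad(pos)
    pos -= step * g                        # move downhill along -gradient
```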

    Computer code for controller partitioning with IFPC application: A user's manual

    A user's manual for the computer code for partitioning a centralized controller into decentralized subcontrollers with applicability to Integrated Flight/Propulsion Control (IFPC) is presented. Partitioning of a centralized controller into two subcontrollers is described, and the algorithm on which the code is based is discussed. The algorithm uses parameter optimization of a cost function, which is described. The major data structures and functions are described, and specific usage instructions are given. The user is led through an example of an IFPC application.

    Bayesian Inference for Multivariate Monotone Densities

    We consider a nonparametric Bayesian approach to estimation and testing for a multivariate monotone density. Instead of following the conventional Bayesian route of putting a prior distribution complying with the monotonicity restriction, we put a prior on the step heights through binning and a Dirichlet distribution. An arbitrary piecewise constant probability density is converted to a monotone one by a projection map, taking its L1-projection onto the space of monotone functions, which is subsequently normalized to integrate to one. We construct consistent Bayesian tests of multivariate monotonicity of a probability density based on the L1-distance to the class of monotone functions. The test is shown to have a size going to zero and high power against alternatives sufficiently separated from the null hypothesis. To obtain a Bayesian credible interval for the value of the density function at an interior point with guaranteed asymptotic frequentist coverage, we consider a posterior quantile interval of an induced map transforming the function value to its value optimized over certain blocks. The limiting coverage is explicitly calculated and is seen to be higher than the credibility level used in the construction. By exploring the asymptotic relationship between the coverage and the credibility, we show that a desired asymptotic coverage can be obtained exactly by starting with an appropriate credibility level.
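
    A one-dimensional caricature of the prior construction may help: draw Dirichlet weights over bins, project the resulting histogram onto decreasing functions, and renormalize. For simplicity this sketch substitutes scikit-learn's isotonic regression (an L2 projection) for the paper's L1 projection; the bin count and Dirichlet parameters are arbitrary.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(4)
J = 25                                   # bins on [0, 1]
width = 1.0 / J

# Dirichlet draw on bin probabilities -> piecewise-constant density
heights = rng.dirichlet(np.ones(J)) / width

# projection onto monotone (decreasing) step functions;
# isotonic regression stands in for the paper's L1 projection
mono = IsotonicRegression(increasing=False).fit_transform(
    np.arange(J), heights)

# renormalize so the projected density integrates to one
mono /= mono.sum() * width
```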

    Optimization Methods for Training Feedforward Neural Networks


    A Taylor polynomial expansion line search for large-scale optimization

    In trying to cope with the Big Data deluge, the landscape of distributed computing has changed. Large commodity hardware clusters, typically operating in some form of MapReduce framework, are becoming prevalent for organizations that require both tremendous storage capacity and fault tolerance. However, the high cost of communication can dominate the computation time in large-scale optimization routines in these frameworks. This thesis considers the problem of how to efficiently conduct univariate line searches in commodity clusters in the context of gradient-based batch optimization algorithms, like the staple limited-memory BFGS (LBFGS) method. In it, a new line search technique is proposed for cases where the underlying objective function is analytic, as in logistic regression and low-rank matrix factorization. The technique approximates the objective function by a truncated Taylor polynomial along a fixed search direction. The coefficients of this polynomial may be computed efficiently in parallel with far less communication than needed to transmit the high-dimensional gradient vector, after which the polynomial may be minimized with high accuracy in a neighbourhood of the expansion point without distributed operations. This Polynomial Expansion Line Search (PELS) may be invoked iteratively until the expansion point and minimum are sufficiently accurate, and can provide substantial savings in time and communication costs when multiple iterations in the line search procedure are required. Three applications of the PELS technique are presented herein for important classes of analytic functions: (i) logistic regression (LR), (ii) low-rank matrix factorization (MF) models, and (iii) the feedforward multilayer perceptron (MLP). In addition, for LR and MF, implementations of PELS in the Apache Spark framework for fault-tolerant cluster computing are provided. These implementations conferred significant convergence enhancements to their respective algorithms, and will be of interest to Spark and Hadoop practitioners. For instance, the Spark PELS technique reduced the number of iterations and time required by LBFGS to reach terminal training accuracies for LR models by factors of 1.8-2. Substantial acceleration was also observed for the Nonlinear Conjugate Gradient algorithm for MLP models, which is an interesting case for future study in optimization for neural networks. The PELS technique is applicable to a broad class of models for Big Data processing and large-scale optimization, and can be a useful component of batch optimization routines.
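
    A single-machine sketch of the idea for logistic regression (labels assumed in {-1, +1}): the coefficients of a degree-4 Taylor model of the loss along a search direction are plain data averages, which in a cluster would be one cheap aggregation rather than a transfer of the full gradient vector; the quartic model is then minimized exactly on a trust interval. The fixed degree, interval and names below are choices of this sketch, not the thesis implementation.

```python
import numpy as np

def softplus_derivs(t):
    """First four derivatives of softplus(t) = log(1 + exp(t))."""
    s = 1.0 / (1.0 + np.exp(-t))                 # sigmoid
    d2 = s * (1 - s)
    return s, d2, d2 * (1 - 2 * s), d2 * (1 - 6 * s + 6 * s * s)

def taylor_line_search(X, y, w, d, radius=1.0):
    """Minimize the degree-4 Taylor model of the mean LR loss along d."""
    t = -y * (X @ w)                             # loss_i = softplus(t_i)
    s = -y * (X @ d)                             # dt_i / dalpha
    d1, d2, d3, d4 = softplus_derivs(t)
    # Taylor coefficients a_k = g^(k)(0) / k! (constant term irrelevant);
    # each is a single average over the data
    a1 = np.mean(d1 * s)
    a2 = np.mean(d2 * s ** 2) / 2
    a3 = np.mean(d3 * s ** 3) / 6
    a4 = np.mean(d4 * s ** 4) / 24
    # stationary points of the quartic model, restricted to (0, radius]
    roots = np.roots([4 * a4, 3 * a3, 2 * a2, a1])
    cands = [r.real for r in roots
             if abs(r.imag) < 1e-10 and 0.0 < r.real < radius]
    cands += [radius]    # for a descent direction, a1 < 0, so alpha > 0
    model = lambda a: a1 * a + a2 * a ** 2 + a3 * a ** 3 + a4 * a ** 4
    return min(cands, key=model)

# toy usage with a plain gradient direction in place of an LBFGS step
rng = np.random.default_rng(5)
X = rng.normal(size=(500, 10))
y = np.sign(X @ rng.normal(size=10) + 0.1 * rng.normal(size=500))
w = np.zeros(10)
grad = X.T @ (-y * softplus_derivs(-y * (X @ w))[0]) / len(y)
alpha = taylor_line_search(X, y, w, -grad)
w += alpha * -grad
```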

    Essays in Microeconometrics

    Get PDF
    This dissertation comprises three individual papers on various topics in microeconometrics. The first chapter, which is joint work with Christoph Breunig, studies a semi-/nonparametric regression model with a general form of nonclassical measurement error in the outcome variable. We provide conditions under which the regression function is identifiable under appropriate normalizations, propose a novel sieve rank estimator for the regression function, and establish its rate of convergence. The second chapter deals with the estimation of conditional random coefficient (RC) models. Here I propose a two-stage sieve estimation procedure: first, a closed-form sieve approximation of the conditional RC density is derived; second, sieve coefficients are estimated with generic machine learning procedures under appropriate sample-splitting rules. I derive the L2-convergence rate of the conditional RC-density estimator and also provide a result on pointwise asymptotic normality. The third chapter presents a novel and simple approach to estimating a class of semi(non)parametric discrete choice models imposing shape constraints on the infinite-dimensional and unknown link function parameter. I study multiple-index discrete choice models where the link function is known to be bounded between zero and one and is (partly) monotonic. The paper presents an easy-to-implement and computationally efficient sieve GLS estimation approach using a sieve space of constrained I- and B-spline basis functions. The estimator is shown to be consistent, and imposing shape constraints is shown to speed up the convergence rate of the estimator in a weak Fisher-like norm. The asymptotic normality of relevant smooth functionals of model parameters is derived, and I illustrate that the necessary assumptions are milder if shape constraints are imposed.
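
    A rough one-dimensional illustration of the constrained sieve idea: I-splines (cumulative integrals of a B-spline basis, scaled to rise from 0 to 1) are monotone, so nonnegative coefficients yield a monotone fit, and capping the coefficient sum at one would additionally bound the fit in [0, 1]. The sketch builds the I-splines numerically and fits by bounded least squares; the data, knots and solver are placeholders, not the chapter's GLS estimator.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.integrate import cumulative_trapezoid
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-3, 3, 400))
y = 1 / (1 + np.exp(-x)) + rng.normal(0, 0.05, x.size)  # noisy monotone link

# cubic B-spline basis on a clamped knot vector over [-3, 3]
k = 3
inner = np.linspace(-3, 3, 8)
t = np.r_[[-3.0] * k, inner, [3.0] * k]
n_basis = len(t) - k - 1
B = BSpline(t, np.eye(n_basis), k)(x)     # (n_obs, n_basis) design matrix

# I-splines: cumulative integrals of each column, scaled to end at 1
I = cumulative_trapezoid(B, x, axis=0, initial=0.0)
I /= I[-1]

# nonnegative coefficients => monotone nondecreasing fitted link
coef = lsq_linear(I, y, bounds=(0.0, np.inf)).x
fit = I @ coef                            # monotone fit at the data points
```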

    Matrix Nearness Problems with Bregman Divergences
