LASSO ISOtone for High Dimensional Additive Isotonic Regression

Authors: Fang Zhou, Meinshausen Nicolai
Publication date: 01/01/2010

Additive isotonic regression attempts to determine the relationship between a multi-dimensional observation variable and a response, under the constraint that the estimate is the additive sum of univariate component effects that are monotonically increasing. In this article, we present a new method for such regression called LASSO Isotone (LISO). LISO adapts ideas from sparse linear modelling to additive isotonic regression. It is therefore viable in many situations with high-dimensional predictor variables, where selection of significant versus insignificant variables is required. We suggest an algorithm involving a modification of the backfitting algorithm CPAV. We give a numerical convergence result and examine some of the method's properties through simulations. We also suggest some possible extensions that improve performance and allow calculation to be carried out when the direction of the monotonicity is unknown.
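
For orientation, the classical building blocks that the abstract refers to can be sketched as follows. This is a minimal illustration of unpenalized backfitting with the pool-adjacent-violators (PAV) step, not the LISO algorithm itself: the lasso-style penalty and the authors' modification of CPAV are not reproduced here. The function names `pava` and `backfit_isotonic`, the fixed number of sweeps, and the assumption that each covariate has distinct values are all choices made for this sketch.

```python
def pava(y):
    """Pool Adjacent Violators: least-squares nondecreasing fit to the sequence y."""
    blocks = []  # each block is [sum of values, number of values]
    for v in y:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last block mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def backfit_isotonic(X, y, n_sweeps=50):
    """Cyclic backfitting: fit y ~ intercept + sum_k f_k(x_k), each f_k nondecreasing.

    X is a list of rows; returns the intercept and, for each covariate k,
    the fitted component values f[k][i] at the observations.
    Assumes (for this sketch) that each covariate has distinct values.
    """
    n, p = len(y), len(X[0])
    intercept = sum(y) / n
    f = [[0.0] * n for _ in range(p)]
    for _ in range(n_sweeps):
        for k in range(p):
            # Partial residual: remove the intercept and all other components.
            r = [y[i] - intercept - sum(f[j][i] for j in range(p) if j != k)
                 for i in range(n)]
            order = sorted(range(n), key=lambda i: X[i][k])
            fitted = pava([r[i] for i in order])
            mean_fit = sum(fitted) / n  # centre each component for identifiability
            for rank, i in enumerate(order):
                f[k][i] = fitted[rank] - mean_fit
    return intercept, f


print(pava([1, 3, 2, 4]))  # -> [1.0, 2.5, 2.5, 4.0]
```

Each pass over a covariate is an exact least-squares update for that component, so the residual sum of squares never increases from one step to the next. A sparsity-inducing variant such as LISO would replace the plain PAV step with a penalized one.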

arXiv.org e-Print Archive

CiteSeerX

Oxford University Research Archive

Data clustering based on Langevin annealing with a self-consistent potential

Authors: Lafata Kyle, Liu Jian-Guo, Yin Fang-Fang, Zhou Zhennan
Publication date: 20/06/2018

This paper introduces a novel data clustering algorithm based on Langevin dynamics, where the associated potential is constructed directly from the data. To introduce a self-consistent potential, we adopt the potential model from the established Quantum Clustering method. The first step is to use a radial basis function to construct a density distribution from the data. A potential function is then constructed such that this density distribution is the ground state solution to the time-independent Schrödinger equation. The second step is to use this potential function with the Langevin dynamics at sub-critical temperature to avoid ergodicity. The Langevin equations take a classical Gibbs distribution as the invariant measure, where the peaks of the distribution coincide with minima of the potential surface. The time dynamics of individual data points lead to different metastable states, which are interpreted as cluster centers. Clustering is therefore achieved when subsets of the data aggregate, as a result of the Langevin dynamics for a moderate period of time, in the neighborhood of a particular potential minimum. While the data points are pushed towards potential minima by the potential gradient, Brownian motion allows them to effectively tunnel through local potential barriers and escape saddle points into locations of the potential surface otherwise forbidden. The algorithm's feasibility is first established based on several illustrative examples and theoretical analyses, followed by a stricter evaluation using a standard benchmark dataset.
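
The two steps described in the abstract can be illustrated with a small one-dimensional sketch. It rests on several assumptions that are not taken from the paper: a Gaussian radial basis function of width `sigma`, the Quantum Clustering potential in its usual closed form (up to an additive constant), an overdamped Langevin equation integrated with the Euler-Maruyama scheme, a finite-difference gradient, and illustrative values for the temperature, step size and number of steps. The names `qc_potential` and `langevin_cluster` are hypothetical.

```python
import math
import random


def qc_potential(x, data, sigma):
    """Quantum Clustering potential at x, up to an additive constant.

    With psi(x) = sum_i exp(-(x - x_i)^2 / (2 sigma^2)), requiring psi to solve
    the time-independent Schrodinger equation gives, in one dimension,
    V(x) = const + sum_i (x - x_i)^2 w_i / (2 sigma^2 psi), w_i the Gaussian weights.
    """
    w = [math.exp(-(x - xi) ** 2 / (2 * sigma ** 2)) for xi in data]
    psi = sum(w)
    return sum((x - xi) ** 2 * wi for xi, wi in zip(data, w)) / (2 * sigma ** 2 * psi)


def langevin_cluster(data, sigma=0.5, temperature=1e-3, dt=0.01,
                     n_steps=2000, seed=0):
    """Move a copy of each data point by overdamped Langevin dynamics in V.

    Euler-Maruyama update: x <- x - V'(x) dt + sqrt(2 T dt) * N(0, 1).
    The potential is built once from the original data and stays fixed.
    """
    rng = random.Random(seed)
    h = 1e-4  # step for the central-difference gradient (sketch assumption)
    noise = math.sqrt(2 * temperature * dt)
    pos = list(data)
    for _ in range(n_steps):
        for i, x in enumerate(pos):
            grad = (qc_potential(x + h, data, sigma)
                    - qc_potential(x - h, data, sigma)) / (2 * h)
            pos[i] = x - grad * dt + noise * rng.gauss(0.0, 1.0)
    return pos


# Two well-separated groups; points should settle near one potential minimum per group.
data = [-0.3, -0.1, 0.0, 0.2, 0.3, 4.7, 4.9, 5.0, 5.1, 5.3]
final = langevin_cluster(data)
print([round(x, 2) for x in final])
```

At this low temperature the noise term is small, so each point mostly follows the potential gradient and fluctuates around the minimum of its own well. Raising `temperature` would let points cross barriers more often, which is the regime the abstract describes as needing to stay sub-critical.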

arXiv.org e-Print Archive

DukeSpace (Duke Univ.)