LASSO ISOtone for High Dimensional Additive Isotonic Regression
Additive isotonic regression attempts to determine the relationship between a
multi-dimensional observation variable and a response, under the constraint
that the estimate is the additive sum of univariate component effects that are
monotonically increasing. In this article, we present a new method for such
regression called LASSO Isotone (LISO). LISO adapts ideas from sparse linear
modelling to additive isotonic regression. Thus, it is viable in many
situations with high dimensional predictor variables, where selection of
significant versus insignificant variables is required. We suggest an
algorithm involving a modification of the backfitting algorithm CPAV. We give a
numerical convergence result, and finally examine some of its properties
through simulations. We also suggest some possible extensions that improve
performance, and allow calculation to be carried out when the direction of the
monotonicity is unknown.
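As a rough illustration of the backfitting idea mentioned in the abstract, the sketch below fits an additive isotonic model by cycling the pool adjacent violators (PAV) step over the coordinates. It is plain cyclic PAV only: the sparsity-inducing modification that defines LISO is not reproduced here, and the function names `pav` and `cyclic_pav` and the parameter `n_sweeps` are hypothetical choices for this example, not names taken from the article.

```python
import numpy as np


def pav(y):
    """Pool adjacent violators: least-squares nondecreasing fit to the sequence y."""
    means, counts = [], []
    for value in y:
        means.append(float(value))
        counts.append(1)
        # Merge the last two blocks while they violate the monotone order.
        while len(means) > 1 and means[-2] > means[-1]:
            total = counts[-2] + counts[-1]
            merged = (means[-2] * counts[-2] + means[-1] * counts[-1]) / total
            means[-2:] = [merged]
            counts[-2:] = [total]
    return np.repeat(means, counts)


def cyclic_pav(X, y, n_sweeps=20):
    """Backfitting for an additive isotonic model (no LISO penalty, assumption)."""
    n, p = X.shape
    intercept = y.mean()
    fits = np.zeros((n, p))  # fits[:, j] holds component j evaluated at X[:, j]
    for _ in range(n_sweeps):
        for j in range(p):
            # Partial residual: remove the intercept and all other components.
            partial = y - intercept - fits.sum(axis=1) + fits[:, j]
            order = np.argsort(X[:, j], kind="stable")
            fit_j = np.empty(n)
            fit_j[order] = pav(partial[order])
            # Center each component so the intercept stays identifiable.
            fits[:, j] = fit_j - fit_j.mean()
    return intercept, fits
```

Each inner step is an ordinary univariate isotonic regression on the partial residuals, so every fitted component is nondecreasing in its own covariate. A sparse variant in the spirit of LISO would replace that univariate step with a penalised one.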
Data clustering based on Langevin annealing with a self-consistent potential
This paper introduces a novel data clustering algorithm based on Langevin
dynamics, where the associated potential is constructed directly from the data.
To introduce a self-consistent potential, we adopt the potential model from the
established Quantum Clustering method. The first step is to use a radial basis
function to construct a density distribution from the data. A potential
function is then constructed such that this density distribution is the ground
state solution to the time-independent Schrödinger equation. The second step is
to use this potential function with the Langevin dynamics at sub-critical
temperature to avoid ergodicity. The Langevin equations take a classical Gibbs
distribution as the invariant measure, where the peaks of the distribution
coincide with minima of the potential surface. The time dynamics of individual
data points lead to different metastable states, which are interpreted as
cluster centers. Clustering is therefore achieved when subsets of the data
aggregate - as a result of the Langevin dynamics for a moderate period of time
- in the neighborhood of a particular potential minimum. While the data points
are pushed towards potential minima by the potential gradient, Brownian motion
allows them to effectively tunnel through local potential barriers and escape
saddle points into locations of the potential surface otherwise forbidden. The
algorithm's feasibility is first established based on several illustrative
examples and theoretical analyses, followed by a stricter evaluation using a
standard benchmark dataset.
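The two steps described in this abstract can be sketched as follows, under several assumptions that are not stated in the abstract itself. The potential uses the standard Quantum Clustering form, V(x) = (1 / (2 sigma^2 psi(x))) * sum_i |x - x_i|^2 exp(-|x - x_i|^2 / (2 sigma^2)), with psi(x) = sum_i exp(-|x - x_i|^2 / (2 sigma^2)) and with additive constants dropped. The dynamics are an overdamped Langevin (Euler-Maruyama) update, which may differ from the scheme used in the paper. The gradient is taken by central finite differences for simplicity. All function names and parameter values (`sigma`, `temperature`, `dt`, `n_steps`) are hypothetical.

```python
import numpy as np


def qc_potential(points, data, sigma):
    """Quantum Clustering potential at each row of points (constants dropped)."""
    diff = points[:, None, :] - data[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))  # Gaussian radial basis terms
    psi = weights.sum(axis=1)                          # density estimate psi(x)
    return (sq_dist * weights).sum(axis=1) / (2.0 * sigma ** 2 * psi)


def qc_gradient(points, data, sigma, h=1e-4):
    """Central finite-difference gradient of the potential (assumption: not analytic)."""
    grad = np.zeros_like(points)
    for k in range(points.shape[1]):
        step = np.zeros(points.shape[1])
        step[k] = h
        upper = qc_potential(points + step, data, sigma)
        lower = qc_potential(points - step, data, sigma)
        grad[:, k] = (upper - lower) / (2.0 * h)
    return grad


def langevin_cluster(data, sigma=0.5, temperature=0.01, dt=0.01,
                     n_steps=500, seed=0):
    """Move one walker per data point with overdamped Langevin dynamics."""
    rng = np.random.default_rng(seed)
    walkers = np.array(data, dtype=float)
    noise_scale = np.sqrt(2.0 * temperature * dt)
    for _ in range(n_steps):
        # Drift down the potential gradient, then add Brownian noise.
        walkers = walkers - dt * qc_gradient(walkers, data, sigma)
        walkers = walkers + noise_scale * rng.standard_normal(walkers.shape)
    return walkers
```

After the run, walkers that end up near the same potential minimum can be grouped into one cluster, for example by thresholding pairwise distances between final positions. A low `temperature` keeps walkers trapped near a minimum, while a larger value lets them cross small barriers.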
- …