Exploratory Analysis of Functional Data via Clustering and Optimal Segmentation
We propose in this paper an exploratory analysis algorithm for functional
data. The method partitions a set of functions into clusters and represents
each cluster by a simple prototype (e.g., piecewise constant). The total number
of segments in the prototypes is chosen by the user and optimally
distributed among the clusters via two dynamic programming algorithms. The
practical relevance of the method is shown on two real-world datasets.
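
The abstract above does not give the details of its two dynamic programming algorithms. As a rough illustration of the general idea only, the sketch below finds the best piecewise-constant approximation of a single sampled function with a fixed number of segments, using a squared-error cost. The function name `optimal_segmentation`, the cost, and the single-function setting are assumptions for illustration. The clustering step and the distribution of segments among clusters are not covered.

```python
def optimal_segmentation(values, n_segments):
    """Best piecewise-constant fit of `values` with exactly `n_segments` pieces.

    Returns (total squared error, list of half-open (start, end) index ranges).
    Illustrative sketch only, not the algorithm of the paper.
    """
    n = len(values)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(values)")

    # Prefix sums of values and squared values give O(1) segment costs.
    s1, s2 = [0.0], [0.0]
    for v in values:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        # Squared error of values[i:j] around its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal error when values[:j] is covered by k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    cut = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + cost(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    cut[k][j] = i

    # Walk the stored cut points backwards to recover the segments.
    segments, j = [], n
    for k in range(n_segments, 0, -1):
        i = cut[k][j]
        segments.append((i, j))
        j = i
    segments.reverse()
    return best[n_segments][n], segments


error, segments = optimal_segmentation([1, 1, 1, 5, 5, 5, 9, 9], 3)
print(error, segments)
```

The triple loop costs on the order of K·n² operations for n samples and K segments, which is acceptable for short curves but would need refinement for long ones.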
A new bandwidth selection criterion for using SVDD to analyze hyperspectral data
This paper presents a method for hyperspectral image classification that uses
support vector data description (SVDD) with the Gaussian kernel function. SVDD
has been a popular machine learning technique for single-class classification,
but selecting the proper Gaussian kernel bandwidth to achieve the best
classification performance is always a challenging problem. This paper proposes
a new automatic, unsupervised Gaussian kernel bandwidth selection approach
which is used with a multiclass SVDD classification scheme. The performance of
the multiclass SVDD classification scheme is evaluated on three frequently used
hyperspectral data sets, and preliminary results show that the proposed method
can achieve better performance than published results on these data sets.
Information-based objective functions for active data selection
Learning can be made more efficient if we can actively select particularly salient data points. Within a Bayesian learning framework, objective functions are discussed that measure the expected informativeness of candidate measurements. Three alternative specifications of what we want to gain information about lead to three different criteria for data selection. All these criteria depend on the assumption that the hypothesis space is correct, which may prove to be their main weakness.
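
The abstract above does not spell out its three criteria. As a generic illustration of what "expected informativeness of a candidate measurement" can mean, the sketch below computes the expected reduction in entropy over a small discrete set of hypotheses. The discrete setting, the function names, and the example likelihood tables are all assumptions for illustration, not the formulation of the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def expected_information_gain(prior, likelihoods):
    """Expected entropy reduction from observing one candidate measurement.

    prior[h]          : prior probability of hypothesis h
    likelihoods[h][y] : probability of outcome y under hypothesis h
    """
    n_hyp = len(prior)
    n_out = len(likelihoods[0])
    gain = entropy(prior)
    for y in range(n_out):
        # Marginal probability of outcome y under the prior.
        p_y = sum(prior[h] * likelihoods[h][y] for h in range(n_hyp))
        if p_y == 0:
            continue
        # Bayes update for this outcome, weighted by how likely it is.
        posterior = [prior[h] * likelihoods[h][y] / p_y for h in range(n_hyp)]
        gain -= p_y * entropy(posterior)
    return gain


prior = [0.5, 0.5]
candidates = {
    "decisive": [[1.0, 0.0], [0.0, 1.0]],       # outcome reveals the hypothesis
    "uninformative": [[0.5, 0.5], [0.5, 0.5]],  # outcome ignores the hypothesis
}
best = max(candidates, key=lambda c: expected_information_gain(prior, candidates[c]))
print(best)
```

Like the criteria in the abstract, this quantity is only meaningful if the true hypothesis is among those listed, which matches the weakness the abstract points out.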
A concave pairwise fusion approach to subgroup analysis
An important step in developing individualized treatment strategies is to
correctly identify subgroups of a heterogeneous population, so that specific
treatment can be given to each subgroup. In this paper, we consider the
situation with samples drawn from a population consisting of subgroups with
different means, along with certain covariates. We propose a penalized approach
for subgroup analysis based on a regression model, in which heterogeneity is
driven by unobserved latent factors and thus can be represented by using
subject-specific intercepts. We apply concave penalty functions to pairwise
differences of the intercepts. This procedure automatically divides the
observations into subgroups. We develop an alternating direction method of
multipliers algorithm with concave penalties to implement the proposed approach
and demonstrate its convergence. We also establish the theoretical properties
of our proposed estimator and determine the order requirement of the minimal
difference of signals between groups in order to recover them. These results
provide a sound basis for making statistical inference in subgroup analysis.
Our proposed method is further illustrated by simulation studies and analysis
of the Cleveland heart disease dataset.
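
The abstract above does not say which concave penalties are used. The sketch below uses the minimax concave penalty (MCP) as one common example of a concave penalty and sums it over all pairwise differences of subject-specific intercepts. The function names and the parameter values are illustrative assumptions, and the ADMM solver mentioned in the abstract is not shown.

```python
def mcp(t, lam, gamma):
    """Minimax concave penalty evaluated at t, with lam > 0 and gamma > 1."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    # Beyond gamma * lam the penalty is constant, so large differences
    # between intercepts are not shrunk any further.
    return 0.5 * gamma * lam * lam


def pairwise_fusion_penalty(intercepts, lam, gamma):
    """Sum of MCP penalties over all pairs of subject-specific intercepts."""
    total = 0.0
    n = len(intercepts)
    for i in range(n):
        for j in range(i + 1, n):
            total += mcp(intercepts[i] - intercepts[j], lam, gamma)
    return total


# Two well-separated groups: only between-group pairs are penalized,
# and each of those pays the same capped amount.
print(pairwise_fusion_penalty([0.0, 0.0, 5.0, 5.0], lam=1.0, gamma=3.0))
```

Because the penalty is flat for large differences, intercepts in clearly different subgroups are left alone while nearby intercepts are pulled together, which is what lets the observations fall into subgroups automatically.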