Exploratory Analysis of Functional Data via Clustering and Optimal Segmentation
We propose in this paper an exploratory analysis algorithm for functional
data. The method partitions a set of functions into clusters and represents
each cluster by a simple prototype (e.g., piecewise constant). The total number
of segments in the prototypes is chosen by the user and optimally
distributed among the clusters via two dynamic programming algorithms. The
practical relevance of the method is shown on two real-world datasets.
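The abstract above mentions piecewise constant prototypes fitted by dynamic programming. As a rough illustration only, the sketch below segments a single sampled function into `k` constant pieces so that the squared error is minimal. It is not the paper's algorithm: the clustering step and the distribution of segments among clusters are omitted, and the function name `optimal_segmentation` and the squared-error criterion are assumptions made for this example.

```python
from itertools import accumulate


def optimal_segmentation(y, k):
    """Split y into k contiguous segments, each approximated by its mean,
    minimising the total squared error (hypothetical helper, not from the paper)."""
    n = len(y)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(y)")

    # Prefix sums let us compute any segment's squared error in O(1).
    s1 = [0.0] + list(accumulate(float(v) for v in y))
    s2 = [0.0] + list(accumulate(float(v) * float(v) for v in y))

    def sse(i, j):
        # Squared error of y[i:j] around its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[m][j]: best error for y[:j] using exactly m segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = cost[m - 1][i] + sse(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers to recover the segment boundaries.
    bounds = [n]
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        bounds.append(j)
    bounds.reverse()
    return bounds, cost[k][n]
```

The returned `bounds` list holds the start index of each segment followed by `len(y)`, so consecutive pairs delimit the pieces. The triple loop costs O(k n^2) time, which is acceptable for short curves.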
Nonparametric Hierarchical Clustering of Functional Data
In this paper, we deal with the problem of curve clustering. We propose a
nonparametric method which partitions the curves into clusters and discretizes
the dimensions of the curve points into intervals. The cross-product of these
partitions forms a data-grid which is obtained using a Bayesian model selection
approach while making no assumptions regarding the curves. Finally, a
post-processing technique, aiming at reducing the number of clusters in order
to improve the interpretability of the clustering, is proposed. It consists in
optimally merging the clusters step by step, which corresponds to an
agglomerative hierarchical classification whose dissimilarity measure is the
variation of the criterion. Interestingly, this measure is none other than the
sum of the Kullback-Leibler divergences between cluster distributions before
and after the merges. The practical interest of the approach for functional
data exploratory analysis is presented and compared with an alternative
approach on an artificial and a real-world data set.
Analysis and automatic annotation of singer's postures during concert and rehearsal
Bodily movement of music performers is widely acknowledged
to be a means of communication with the audience.
For singers, where the necessity of movement for sound
production is limited, postures, i.e. static positions of the
body, may be relevant in addition to actual movements. In
this study, we present the results of an analysis of a singer's
postures, focusing on differences in postures between a
dress rehearsal without audience and a concert with audience.
We provide an analysis based on manual annotation
of postures and propose and evaluate methods for
automatic annotation of postures based on motion sensing
data, showing that automatic annotation is a viable alternative
to manual annotation. Results furthermore suggest
that the presence of an audience leads the singer to use
more "open" postures, and differentiate more between different
postures. Also, speed differences of transitions from
one posture to another are more pronounced in concert than
during rehearsal.
An Application of Clustering Analysis to International Private Indebtedness
This paper presents a procedure for clustering analysis that combines Kohonen's Self-Organizing Feature Map (SOFM) and statistical schemes. The idea is to cluster the data in two stages: run SOFM and then minimize the segmentation dispersion. The advantages of the proposed procedure are illustrated through a synthetic experiment and a real macroeconomic problem. The procedure is then used to explore the relationship between private indebtedness and some macroeconomic variables commonly used to measure macroeconomic performance. The experiences of thirty-nine countries in the early nineties are analyzed. The procedure outperformed other clustering techniques in identifying groups of countries that are consistent from both the economic and statistical viewpoints. It found similarities among countries in their levels of private indebtedness when these were combined with well-accepted measures of macroeconomic performance. Keywords: Vector quantization, Clustering, Self-Organizing Feature Map, Macroeconomic Performance, Private Indebtedness.
The application of clustering analysis to international private indebtedness
The main goal of this paper is to apply a combination of statistical and connectionist schemes to examine, via clustering analysis, private indebtedness in different countries. Thirty-nine such experiences are used. The relationship between private debts and some macroeconomic variables is discussed in some detail. The clustering performance is improved by taking advantage of the specific properties and capacities of each method. The procedures are also applied to a controlled numerical example.
Batch and median neural gas
Neural Gas (NG) constitutes a very robust clustering algorithm for
Euclidean data which does not suffer from the problem of local minima like
simple vector quantization, or topological restrictions like the
self-organizing map. Based on the cost function of NG, we introduce a batch
variant of NG which shows much faster convergence and which can be interpreted
as an optimization of the cost function by the Newton method. This formulation
has the additional benefit that, based on the notion of the generalized median
in analogy to Median SOM, a variant for non-vectorial proximity data can be
introduced. We prove convergence of batch and median versions of NG, SOM, and
k-means in a unified formulation, and we investigate the behavior of the
algorithms in several experiments. Comment: In Special Issue after WSOM 05 Conference, 5-8 September, 2005, Pari
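To make the batch variant described above more concrete, here is a minimal sketch of a batch Neural Gas loop in plain Python. It follows the commonly described scheme in which every prototype is recomputed as a rank-weighted mean of all data points; the function name `batch_neural_gas`, the exponential neighbourhood `exp(-rank / lam)`, and the annealing schedule for `lam` are assumptions of this example, not details taken from the abstract.

```python
import math
import random


def batch_neural_gas(data, n_protos, n_epochs=50, lam_end=0.01, seed=0):
    """Batch Neural Gas sketch: each epoch recomputes every prototype as a
    mean of all points, weighted by the prototype's distance rank."""
    rng = random.Random(seed)
    # Initialise prototypes on randomly chosen data points.
    protos = [list(p) for p in rng.sample(list(data), n_protos)]
    dim = len(data[0])
    lam_start = n_protos / 2.0

    for t in range(n_epochs):
        # Shrink the neighbourhood range geometrically from lam_start to lam_end.
        frac = t / max(1, n_epochs - 1)
        lam = lam_start * (lam_end / lam_start) ** frac

        num = [[0.0] * dim for _ in range(n_protos)]
        den = [0.0] * n_protos
        for x in data:
            # Rank prototypes by squared Euclidean distance to x.
            dists = [sum((a - b) ** 2 for a, b in zip(x, w)) for w in protos]
            order = sorted(range(n_protos), key=dists.__getitem__)
            for rank, i in enumerate(order):
                h = math.exp(-rank / lam)
                den[i] += h
                for d in range(dim):
                    num[i][d] += h * x[d]

        # Keep the old prototype if its weights underflowed to zero.
        protos = [
            [v / den[i] for v in num[i]] if den[i] > 0.0 else protos[i]
            for i in range(n_protos)
        ]
    return protos
```

Because every point contributes to every prototype while `lam` is large, a prototype that starts in the wrong region is still pulled toward the data; as `lam` shrinks, the update approaches a plain k-means step.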
Analyzing and clustering neural data
This thesis aims to analyze neural data in an overall effort by the Charles Stark
Draper Laboratory to determine an underlying pattern in brain activity in healthy
individuals versus patients with a degenerative brain disorder. The neural data comes from ECoG (electrocorticography) applied to either humans or primates. Each ECoG array has electrodes that measure voltage variations, which neuroscientists claim correlate with neurons transmitting signals to one another. ECoG differs from the less invasive technique of EEG (electroencephalography) in that EEG electrodes are placed on a patient's scalp, while ECoG involves drilling small holes in the skull to allow electrodes to be closer to the brain. Because of this, ECoG boasts an exceptionally high signal-to-noise ratio and less susceptibility to artifacts than EEG [6]. While wearing the ECoG caps, the patients are asked to perform a range of different tasks.
The tasks performed by patients are partitioned into different levels of mental stress
i.e. how much concentration is presumably required. The specific dataset used in
this thesis is derived from cognitive behavior experiments performed on primates at
MGH (Massachusetts General Hospital).
The content of this thesis can be thought of as a pipelined process. First the
data is collected from the ECoG electrodes, then the data is pre-processed via signal processing techniques and finally the data is clustered via unsupervised learning techniques. For both the pre-processing and the clustering steps, different techniques are applied and then compared against one another. The focus of this thesis is to evaluate clustering techniques when applied to neural data.
For the pre-processing step, two types of bandpass filters, a Butterworth Filter
and a Chebyshev Filter were applied. For the clustering step three techniques were
applied to the data: K-means Clustering, Spectral Clustering, and Self-Tuning Spectral Clustering. We conclude that for pre-processing the results from both filters are very similar and thus either filter is sufficient. For clustering we conclude that K-means has the lowest amount of overlap between clusters. K-means is also the most time-efficient of the three techniques and is thus the ideal choice for this application.
2016-10-27