39,660 research outputs found
On clustering procedures and nonparametric mixture estimation
This paper deals with nonparametric estimation of conditional den-sities in
mixture models in the case when additional covariates are available. The
proposed approach consists of performing a prelim-inary clustering algorithm on
the additional covariates to guess the mixture component of each observation.
Conditional densities of the mixture model are then estimated using kernel
density estimates ap-plied separately to each cluster. We investigate the
expected L 1 -error of the resulting estimates and derive optimal rates of
convergence over classical nonparametric density classes provided the
clustering method is accurate. Performances of clustering algorithms are
measured by the maximal misclassification error. We obtain upper bounds of this
quantity for a single linkage hierarchical clustering algorithm. Lastly,
applications of the proposed method to mixture models involving elec-tricity
distribution data and simulated data are presented
Clustering Via Nonparametric Density Estimation: the R Package pdfCluster
The R package pdfCluster performs cluster analysis based on a nonparametric
estimate of the density of the observed variables. After summarizing the main
aspects of the methodology, we describe the features and the usage of the
package, and finally illustrate its working with the aid of two datasets
Modeling heterogeneity in random graphs through latent space models: a selective review
We present a selective review on probabilistic modeling of heterogeneity in
random graphs. We focus on latent space models and more particularly on
stochastic block models and their extensions that have undergone major
developments in the last five years
A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and further discover interesting and useful knowledge about unusual events within numerous
applications domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees to select the most suitable technique based on data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework
- …