Consensus clustering approach to group brain connectivity matrices
A novel approach rooted in the notion of consensus clustering, a strategy
developed for community detection in complex networks, is proposed to cope with
the heterogeneity that characterizes connectivity matrices in health and
disease. The method can be summarized as follows:
(i) define, for each node, a distance matrix for the set of subjects by
comparing the connectivity pattern of that node in all pairs of subjects; (ii)
cluster the distance matrix for each node; (iii) build the consensus network
from the corresponding partitions; (iv) extract groups of subjects by finding
the communities of the consensus network thus obtained.
Unlike previous implementations of consensus clustering, we
use the consensus strategy to combine the information arising
from the connectivity patterns of each node. The proposed approach may be seen
either as an exploratory technique or as an unsupervised pre-training step to
help the subsequent construction of a supervised classifier. Applications to a
toy model and two real data sets show the effectiveness of the proposed
methodology, which represents the heterogeneity of a set of subjects in terms of a
weighted network, the consensus matrix.
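Steps (i) to (iv) above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which clustering or community-detection algorithms are used, so the two-medoid split in `two_medoid_partition` and the thresholded connected components in `groups_from_consensus` are stand-in assumptions, as are all function names, the Euclidean distance, and the toy data.

```python
import math
from itertools import combinations

def node_distance_matrix(conn, node):
    """Step (i): distance between every pair of subjects for one node's row."""
    rows = [subject[node] for subject in conn]
    n = len(rows)
    return [[math.dist(rows[s], rows[t]) for t in range(n)] for s in range(n)]

def two_medoid_partition(dist):
    """Step (ii), simplified (assumption): seed two medoids with the farthest
    pair of subjects and assign every subject to the nearer medoid."""
    n = len(dist)
    a, b = max(combinations(range(n), 2), key=lambda p: dist[p[0]][p[1]])
    return [0 if dist[s][a] <= dist[s][b] else 1 for s in range(n)]

def consensus_matrix(conn):
    """Step (iii): fraction of nodes for which two subjects share a cluster."""
    n_subj, n_nodes = len(conn), len(conn[0])
    counts = [[0] * n_subj for _ in range(n_subj)]
    for node in range(n_nodes):
        labels = two_medoid_partition(node_distance_matrix(conn, node))
        for s in range(n_subj):
            for t in range(n_subj):
                if labels[s] == labels[t]:
                    counts[s][t] += 1
    return [[c / n_nodes for c in row] for row in counts]

def groups_from_consensus(cons, threshold=0.5):
    """Step (iv), simplified (assumption): connected components of the
    consensus network after dropping edges at or below `threshold`."""
    n = len(cons)
    group = [-1] * n
    current = 0
    for start in range(n):
        if group[start] != -1:
            continue
        group[start] = current
        stack = [start]
        while stack:
            s = stack.pop()
            for t in range(n):
                if group[t] == -1 and cons[s][t] > threshold:
                    group[t] = current
                    stack.append(t)
        current += 1
    return group

# Toy data (hypothetical): four subjects, three nodes, two connectivity patterns.
subjects = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 0.9, 0], [0.9, 0, 0.1], [0, 0.1, 0]],
    [[0, 0, 1], [0, 0, 1], [1, 1, 0]],
    [[0, 0, 1.1], [0, 0, 0.9], [1.1, 0.9, 0]],
]
cons = consensus_matrix(subjects)
groups = groups_from_consensus(cons)
print(groups)  # [0, 0, 1, 1]
```

The consensus matrix `cons` is itself the weighted network the abstract refers to; its entries can be inspected directly before any groups are extracted.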
Maximum Margin Clustering for State Decomposition of Metastable Systems
When studying a metastable dynamical system, a prime concern is how to
decompose the phase space into a set of metastable states. Unfortunately, the
metastable state decomposition based on simulation or experimental data is
still a challenge. The most popular and simplest approach is geometric
clustering, which builds on classical clustering techniques.
However, the prerequisites of this approach are: (1) data are obtained from
simulations or experiments which are in global equilibrium and (2) the
coordinate system is appropriately selected. Recently, the kinetic clustering
approach based on phase space discretization and transition probability
estimation has drawn much attention due to its applicability to more general
cases, but the choice of discretization policy is a difficult task. In this
paper, a new decomposition method designated as maximum margin metastable
clustering is proposed, which converts the problem of metastable state
decomposition to a semi-supervised learning problem so that the large margin
technique can be utilized to search for the optimal decomposition without phase
space discretization. Moreover, several simulation examples are given to
illustrate the effectiveness of the proposed method.
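For context, the kinetic clustering ingredient mentioned above (phase space discretization followed by transition probability estimation) can be sketched as below. This illustrates the approach the abstract contrasts itself with, not the proposed maximum margin method. The function names, the one-dimensional trajectory, the bin edge, and the lag are all hypothetical choices made for the example.

```python
def discretize(trajectory, edges):
    """Map each sample to the index of the bin it falls in.
    `edges` are the interior bin boundaries, in increasing order."""
    return [sum(1 for e in edges if x >= e) for x in trajectory]

def transition_matrix(states, n_states, lag=1):
    """Estimate transition probabilities by counting state pairs that are
    `lag` steps apart and normalizing each row of the count matrix."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states[:-lag], states[lag:]):
        counts[a][b] += 1
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs

# Hypothetical trajectory hopping between two wells near -1 and +1.
trajectory = [-1.0, -0.9, -1.1, -0.8, 0.9, 1.0, 1.1, 0.9, -1.0, -0.9]
states = discretize(trajectory, edges=[0.0])
P = transition_matrix(states, n_states=2)
print(states)  # [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
print(P)       # [[0.8, 0.2], [0.25, 0.75]]
```

The estimate depends directly on where the bin edges are placed, which is the discretization difficulty the abstract points to.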
A survey of outlier detection methodologies
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
Deep Metric Learning via Facility Location
Learning the representation and the similarity metric in an end-to-end
fashion with deep networks has demonstrated outstanding results for clustering
and retrieval. However, these recent approaches still suffer from the
performance degradation stemming from the local metric training procedure which
is unaware of the global structure of the embedding space.
We propose a global metric learning scheme for optimizing the deep metric
embedding with the learnable clustering function and the clustering metric
(NMI) in a novel structured prediction framework.
Our experiments on the CUB200-2011, Cars196, and Stanford Online Products
datasets show state-of-the-art performance on both the clustering and retrieval
tasks, measured by the NMI and Recall@K evaluation metrics. Comment: Submission accepted at CVPR 201
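The Recall@K retrieval metric named above can be sketched as follows, under the common definition that a query counts as a hit when at least one of its K nearest neighbours (excluding the query itself) shares its label. The abstract does not spell out the definition, so this reading, the Euclidean distance, the function name, and the toy embeddings are assumptions.

```python
import math

def recall_at_k(embeddings, labels, k):
    """Fraction of queries with at least one same-label item among their
    k nearest neighbours, the query itself excluded."""
    n = len(embeddings)
    hits = 0
    for q in range(n):
        others = [i for i in range(n) if i != q]
        others.sort(key=lambda i: math.dist(embeddings[q], embeddings[i]))
        if any(labels[i] == labels[q] for i in others[:k]):
            hits += 1
    return hits / n

# Hypothetical 2-D embeddings; the last point sits closer to the wrong class.
embeddings = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.4, 0.4)]
labels = [0, 0, 1, 1, 1]
r1 = recall_at_k(embeddings, labels, k=1)
r3 = recall_at_k(embeddings, labels, k=3)
print(r1, r3)  # 0.8 1.0
```

Recall@K can only stay equal or grow as K increases, which is why results are usually reported for several values of K.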
Training from a Better Start Point: Active Self-Semi-Supervised Learning for Few Labeled Samples
Training with fewer annotations is a key issue for applying deep models to
various practical domains. To date, semi-supervised learning has achieved great
success in training with few annotations. However, confirmation bias increases
dramatically as the number of annotations decreases, making it difficult to
continue reducing the number of annotations. Based on the observation that the
quality of pseudo-labels early in semi-supervised training plays an important
role in mitigating confirmation bias, in this paper we propose an active
self-semi-supervised learning (AS3L) framework. AS3L bootstraps semi-supervised
models with prior pseudo-labels (PPL), where PPL is obtained by label
propagation over self-supervised features. We illustrate that the accuracy of
PPL is not only affected by the quality of features, but also by the selection
of the labeled samples. We develop active learning and label propagation
strategies to obtain better PPL. Consequently, our framework can significantly
improve the performance of models in the case of few annotations while reducing
the training time. Experiments on four semi-supervised learning benchmarks
demonstrate the effectiveness of the proposed methods. Our method outperforms
the baseline method by an average of 7% on the four datasets and outperforms
the baseline method in accuracy while taking about 1/3 of the training time. Comment: 12 pages, 8 figures
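Label propagation over fixed features, the mechanism the abstract names for obtaining prior pseudo-labels, can be sketched as below. This is a generic graph-based propagation scheme with a Gaussian affinity and symmetric normalization, not the AS3L procedure itself: the abstract gives no algorithmic details, so the function name, the parameters `sigma`, `alpha` and `n_iter`, and the toy features are assumptions. The active selection of labeled samples is not covered.

```python
import math

def propagate_labels(features, seed_labels, n_classes,
                     sigma=1.0, alpha=0.9, n_iter=50):
    """Spread a few seed labels over all samples through a similarity graph.
    `seed_labels` maps sample index -> class for the labeled samples."""
    n = len(features)
    # Gaussian affinities with a zero diagonal.
    w = [[0.0 if i == j else
          math.exp(-math.dist(features[i], features[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in w]
    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    s = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # One-hot seed matrix; unlabeled rows are all zeros.
    y = [[1.0 if seed_labels.get(i) == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    f = [row[:] for row in y]
    for _ in range(n_iter):
        # F <- alpha * S F + (1 - alpha) * Y
        f = [[alpha * sum(s[i][j] * f[j][c] for j in range(n))
              + (1 - alpha) * y[i][c]
              for c in range(n_classes)] for i in range(n)]
    return [max(range(n_classes), key=lambda c: f[i][c]) for i in range(n)]

# Hypothetical features: two well-separated groups, one labeled sample each.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
pseudo_labels = propagate_labels(features, {0: 0, 3: 1}, n_classes=2)
print(pseudo_labels)  # [0, 0, 0, 1, 1, 1]
```

In this toy case any choice of one seed per group gives the same result; on real features the choice of seeds matters, which is the dependence on labeled-sample selection that the abstract highlights.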