Accuracy evaluation of overlapping and multi-resolution clustering algorithms on large datasets
Performance of clustering algorithms is evaluated with the help of accuracy
metrics. There is a great diversity of clustering algorithms, which are key
components of many data analysis and exploration systems. However, only a few
metrics exist for measuring the accuracy of overlapping and
multi-resolution clustering algorithms on large datasets. In this paper, we
first discuss existing metrics, how they satisfy a set of formal constraints,
and how they can be applied to specific cases. Then, we propose several
optimizations and extensions of these metrics. More specifically, we introduce
a new indexing technique to reduce both the runtime and the memory complexity
of the Mean F1 score evaluation. Our technique can be applied to large datasets
and it is faster on a single CPU than state-of-the-art implementations running
on high-performance servers. In addition, we propose several extensions of the
discussed metrics to improve their effectiveness and satisfaction of formal
constraints without affecting their efficiency. All the metrics discussed in
this paper are implemented in C++ and are available for free as open-source
packages that can be used either as stand-alone tools or as part of a
benchmarking system to compare various clustering algorithms.
Comment: The application executable and sources:
https://github.com/eXascaleInfolab/xmeasure
Spike sorting for large, dense electrode arrays
Developments in microfabrication technology have enabled the production of neural electrode arrays with hundreds of closely spaced recording sites, and electrodes with thousands of sites are under development. These probes in principle allow the simultaneous recording of very large numbers of neurons. However, use of this technology requires the development of techniques for decoding the spike times of the recorded neurons from the raw data captured from the probes. Here we present a set of tools to solve this problem, implemented in a suite of practical, user-friendly, open-source software. We validate these methods on data from the cortex, hippocampus and thalamus of rat, mouse, macaque and marmoset, demonstrating error rates as low as 5%.
Hierarchical Metric Learning for Optical Remote Sensing Scene Categorization
We address the problem of scene classification from optical remote sensing
(RS) images based on the paradigm of hierarchical metric learning. Ideally,
supervised metric learning strategies learn a projection from a set of training
data points so as to minimize intra-class variance while maximizing inter-class
separability to the class label space. However, standard metric learning
techniques do not incorporate the class interaction information in learning the
transformation matrix, which is often considered to be a bottleneck while
dealing with fine-grained visual categories. As a remedy, we propose to
organize the classes in a hierarchical fashion by exploring their visual
similarities and subsequently learn separate distance metric transformations
for the classes present at the non-leaf nodes of the tree. We employ an
iterative max-margin clustering strategy to obtain the hierarchical
organization of the classes. Experimental results obtained on the large-scale
NWPU-RESISC45 and the popular UC-Merced datasets demonstrate the efficacy of
the proposed hierarchical metric learning based RS scene recognition strategy
in comparison to the standard approaches.
Comment: Undergoing revision in GRS