The density connectivity information bottleneck
Clustering with the agglomerative Information Bottleneck (aIB) algorithm suffers from a sub-optimality problem: it cannot guarantee to preserve as much relevant information as possible. To handle this problem, we introduce a density connectivity chain, by which we consider not only the information between two data elements but also the information among the neighbors of a data element. Based on this idea, we propose DCIB, a Density Connectivity Information Bottleneck algorithm that applies the Information Bottleneck method to quantify the relevant information during the clustering procedure. As a hierarchical algorithm, DCIB produces a pruned clustering tree and yields clusterings of different sizes in a single execution. Experimental results on document clustering indicate that the DCIB algorithm preserves more relevant information and achieves higher precision than the aIB algorithm.
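The sub-optimality this abstract targets comes from aIB's greedy merge rule: at each step it merges the pair of clusters whose merge loses the least relevant information I(T;Y). A toy sketch of that rule (an illustration of plain aIB under my own naming, not the paper's DCIB implementation):

```python
import numpy as np

def mutual_info(p_ty):
    # I(T;Y) computed from a joint distribution over (cluster, relevance variable)
    p_t = p_ty.sum(axis=1, keepdims=True)
    p_y = p_ty.sum(axis=0, keepdims=True)
    mask = p_ty > 0
    return float((p_ty[mask] * np.log(p_ty[mask] / (p_t @ p_y)[mask])).sum())

def agglomerative_ib(p_xy, n_clusters):
    """Greedy aIB sketch: start with one cluster per element, then repeatedly
    merge the pair of clusters whose merge loses the least I(T;Y).
    Each cluster is represented by its summed joint row p(t, y)."""
    clusters = [p_xy[i:i + 1].copy() for i in range(p_xy.shape[0])]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                trial = [c for k, c in enumerate(clusters) if k not in (i, j)]
                trial.append(clusters[i] + clusters[j])
                loss = -mutual_info(np.vstack(trial))  # smaller loss = more I(T;Y) kept
                if best is None or loss < best[0]:
                    best = (loss, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return np.vstack(clusters)
```

Because each merge is locally optimal but never revisited, an early bad merge is locked in for good; that is the gap the density connectivity chain is meant to close.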
Combination of agglomerative and sequential clustering for speaker diarization
This paper investigates the use of sequential clustering for speaker diarization. Conventional diarization systems are based on parametric models and agglomerative clustering. In our previous work we proposed a non-parametric method based on the agglomerative Information Bottleneck for very fast diarization. Here we consider the combination of sequential and agglomerative clustering for avoiding local maxima of the objective function and for purification. Experiments are run on the RT06 eval data. Sequential clustering with oracle model selection reduces the speaker error w.r.t. agglomerative clustering. When model selection is based on the Normalized Mutual Information criterion, a relative improvement is obtained by combining agglomerative and sequential clustering.
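The escape-from-local-maxima idea can be sketched as one sequential sweep on top of an agglomerative initialization: each element is drawn out of its cluster and re-inserted wherever the objective I(T;Y) is highest. A minimal illustration (hypothetical helper names, not the diarization system's code):

```python
import numpy as np

def mi(p_ty):
    # I(T;Y) from a joint (cluster, relevance variable) distribution
    p_t = p_ty.sum(axis=1, keepdims=True)
    p_y = p_ty.sum(axis=0, keepdims=True)
    m = p_ty > 0
    return float((p_ty[m] * np.log(p_ty[m] / (p_t @ p_y)[m])).sum())

def cluster_joint(assign, p_xy, k):
    # p(t, y) induced by a hard assignment of elements to k clusters
    p_ty = np.zeros((k, p_xy.shape[1]))
    for x, t in enumerate(assign):
        p_ty[t] += p_xy[x]
    return p_ty

def sequential_pass(assign, p_xy, k):
    """One sweep of sequential clustering: move each element to the cluster
    that maximizes the global objective. The objective never decreases, so
    sweeps can only improve on the agglomerative starting point."""
    assign = np.asarray(assign).copy()
    for x in range(len(assign)):
        best_t, best_val = assign[x], -np.inf
        for t in range(k):
            assign[x] = t
            val = mi(cluster_joint(assign, p_xy, k))
            if val > best_val:
                best_t, best_val = t, val
        assign[x] = best_t
    return assign
```

Repeating such sweeps until no element moves gives the purification step: elements mis-merged by the greedy agglomeration get a chance to migrate to a better cluster.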
An information theoretic approach to the functional classification of neurons
A population of neurons typically exhibits a broad diversity of responses to
sensory inputs. The intuitive notion of functional classification is that cells
can be clustered so that most of the diversity is captured in the identity of
the clusters rather than by individuals within clusters. We show how this
intuition can be made precise using information theory, without any need to
introduce a metric on the space of stimuli or responses. Applied to the retinal
ganglion cells of the salamander, this approach recovers classical results, but
also provides clear evidence for subclasses beyond those identified previously.
Further, we find that each of the ganglion cells is functionally unique, and
that even within the same subclass only a few spikes are needed to reliably
distinguish between cells.
Comment: 13 pages, 4 figures. To appear in Advances in Neural Information Processing Systems (NIPS) 1
Machine learning of hierarchical clustering to segment 2D and 3D images
We aim to improve segmentation through the use of machine learning tools
during region agglomeration. We propose an active learning approach for
performing hierarchical agglomerative segmentation from superpixels. Our method
combines multiple features at all scales of the agglomerative process, works
for data with an arbitrary number of dimensions, and scales to very large
datasets. We advocate the use of variation of information to measure
segmentation accuracy, particularly in 3D electron microscopy (EM) images of
neural tissue, and using this metric demonstrate an improvement over competing
algorithms in EM and natural images.
Comment: 15 pages, 8 figures
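The variation of information advocated here as a segmentation-accuracy measure is VI(A, B) = H(A|B) + H(B|A) between the two labelings; it is zero exactly when the segmentations agree up to relabeling, and it is a metric on partitions. A minimal sketch for flat label arrays (a hypothetical helper, not the paper's evaluation code):

```python
import numpy as np

def variation_of_information(seg_a, seg_b):
    """VI(A, B) = H(A|B) + H(B|A) between two labelings of the same pixels."""
    a = np.asarray(seg_a).ravel()
    b = np.asarray(seg_b).ravel()
    n = a.size
    # contingency table of joint label frequencies
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    joint = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(joint, (ai, bi), 1.0 / n)
    pa = joint.sum(axis=1)
    pb = joint.sum(axis=0)
    nz = joint > 0
    h_ab = -(joint[nz] * np.log(joint[nz])).sum()      # H(A, B)
    h_a = -(pa[pa > 0] * np.log(pa[pa > 0])).sum()     # H(A)
    h_b = -(pb[pb > 0] * np.log(pb[pb > 0])).sum()     # H(B)
    return (h_ab - h_a) + (h_ab - h_b)                 # H(A|B) + H(B|A)
```

Unlike pixel accuracy, VI penalizes both over-segmentation (split errors inflate H(A|B)) and under-segmentation (merge errors inflate H(B|A)), which is what makes it attractive for evaluating region agglomeration.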
Privacy-Constrained Remote Source Coding
We consider the problem of revealing/sharing data in an efficient and secure
way via a compact representation. The representation should ensure reliable
reconstruction of the desired features/attributes while still preserving privacy
of the secret parts of the data. The problem is formulated as a remote lossy
source coding with a privacy constraint where the remote source consists of
public and secret parts. Inner and outer bounds for the optimal tradeoff region
of compression rate, distortion, and privacy leakage rate are given and shown
to coincide for some special cases. When specializing the distortion measure to
a logarithmic loss function, the resulting rate-distortion-leakage tradeoff for
the case of identical side information forms an optimization problem which
corresponds to the "secure" version of the so-called information bottleneck.
Comment: 10 pages, 1 figure, to be presented at ISIT 201
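The closing claim rests on a standard identity: under logarithmic loss, the optimal reconstruction distortion of the relevant variable $Y$ from the representation $T$ equals the conditional entropy $H(Y \mid T)$, so trading rate against distortion reproduces the information-bottleneck Lagrangian. The notation below is the generic IB formulation, not the paper's own bounds:

```latex
% Log-loss distortion achieved by the posterior reconstruction:
\mathbb{E}\!\left[d\big(Y,\hat{Y}(T)\big)\right] = H(Y \mid T) = H(Y) - I(T;Y),
% so minimizing the rate I(X;T) while constraining this distortion is
% equivalent (for some multiplier \beta > 0) to the IB trade-off
\min_{p(t \mid x)} \; I(X;T) - \beta\, I(T;Y).
```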