6,043 research outputs found
Partitioning Relational Matrices of Similarities or Dissimilarities using the Value of Information
In this paper, we provide an approach to clustering relational matrices whose
entries correspond to either similarities or dissimilarities between objects.
Our approach is based on the value of information, a parameterized,
information-theoretic criterion that measures the change in costs associated
with changes in information. Optimizing the value of information yields a
deterministic annealing style of clustering with many benefits. For instance,
investigators avoid needing to a priori specify the number of clusters, as the
partitions naturally undergo phase changes, during the annealing process,
whereby the number of clusters changes in a data-driven fashion. The
global-best partition can also often be identified.Comment: Submitted to the IEEE International Conference on Acoustics, Speech,
and Signal Processing (ICASSP
Unsupervised extraction of recurring words from infant-directed speech
To date, most computational models of infant word segmentation have worked from phonemic or phonetic input, or have used toy datasets. In this paper, we present an algorithm for word extraction that works directly from naturalistic acoustic input: infant-directed speech from the CHILDES corpus. The algorithm identifies recurring acoustic patterns that are candidates for identification as words or phrases, and then clusters together the most similar patterns. The recurring patterns are found in a single pass through the corpus using an incremental method, where only a small number of utterances are considered at once. Despite this limitation, we show that the algorithm is able to extract a number of recurring words, including some that infants learn earliest, such as Mommy and the childâs name. We also introduce a novel information-theoretic evaluation measure
The Burbea-Rao and Bhattacharyya centroids
We study the centroid with respect to the class of information-theoretic
Burbea-Rao divergences that generalize the celebrated Jensen-Shannon divergence
by measuring the non-negative Jensen difference induced by a strictly convex
and differentiable function. Although those Burbea-Rao divergences are
symmetric by construction, they are not metric since they fail to satisfy the
triangle inequality. We first explain how a particular symmetrization of
Bregman divergences called Jensen-Bregman distances yields exactly those
Burbea-Rao divergences. We then proceed by defining skew Burbea-Rao
divergences, and show that skew Burbea-Rao divergences amount in limit cases to
compute Bregman divergences. We then prove that Burbea-Rao centroids are
unique, and can be arbitrarily finely approximated by a generic iterative
concave-convex optimization algorithm with guaranteed convergence property. In
the second part of the paper, we consider the Bhattacharyya distance that is
commonly used to measure overlapping degree of probability distributions. We
show that Bhattacharyya distances on members of the same statistical
exponential family amount to calculate a Burbea-Rao divergence in disguise.
Thus we get an efficient algorithm for computing the Bhattacharyya centroid of
a set of parametric distributions belonging to the same exponential families,
improving over former specialized methods found in the literature that were
limited to univariate or "diagonal" multivariate Gaussians. To illustrate the
performance of our Bhattacharyya/Burbea-Rao centroid algorithm, we present
experimental performance results for -means and hierarchical clustering
methods of Gaussian mixture models.Comment: 13 page
Optimal Kullback-Leibler Aggregation via Information Bottleneck
In this paper, we present a method for reducing a regular, discrete-time
Markov chain (DTMC) to another DTMC with a given, typically much smaller number
of states. The cost of reduction is defined as the Kullback-Leibler divergence
rate between a projection of the original process through a partition function
and a DTMC on the correspondingly partitioned state space. Finding the reduced
model with minimal cost is computationally expensive, as it requires an
exhaustive search among all state space partitions, and an exact evaluation of
the reduction cost for each candidate partition. Our approach deals with the
latter problem by minimizing an upper bound on the reduction cost instead of
minimizing the exact cost; The proposed upper bound is easy to compute and it
is tight if the original chain is lumpable with respect to the partition. Then,
we express the problem in the form of information bottleneck optimization, and
propose using the agglomerative information bottleneck algorithm for searching
a sub-optimal partition greedily, rather than exhaustively. The theory is
illustrated with examples and one application scenario in the context of
modeling bio-molecular interactions.Comment: 13 pages, 4 figure
- âŚ