Partitioning Relational Matrices of Similarities or Dissimilarities using the Value of Information
In this paper, we provide an approach to clustering relational matrices whose
entries correspond to either similarities or dissimilarities between objects.
Our approach is based on the value of information, a parameterized,
information-theoretic criterion that measures the change in costs associated
with changes in information. Optimizing the value of information yields a
deterministic annealing style of clustering with many benefits. For instance,
investigators need not specify the number of clusters a priori, because the
partitions naturally undergo phase changes during the annealing process,
whereby the number of clusters changes in a data-driven fashion. The
global-best partition can also often be identified.
Comment: Submitted to the IEEE International Conference on Acoustics, Speech,
and Signal Processing (ICASSP
Tracking Cell Signals in Fluorescent Images
In this paper we present techniques for tracking cell signals in GFP (Green Fluorescent Protein) images of growing cell colonies. We use such tracking for both data extraction and dynamic modeling of intracellular processes. The techniques are based on the optimization of energy functions, which simultaneously determines cell correspondences and estimates the mapping functions. In addition to spatial mappings such as affine and thin-plate spline mappings, the cell growth and cell division histories must be estimated as well. Different levels of joint optimization are discussed. The most unusual tracking feature addressed in this paper is the possibility of one-to-two correspondences caused by cell division. A novel extended softassign algorithm for solving one-to-many correspondence problems is detailed in this paper. The techniques are demonstrated on three sets of data: growing Bacillus subtilis and E. coli colonies and a developing plant shoot apical meristem. The techniques are currently used by biologists for data extraction and hypothesis formation.
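As a rough illustration of the baseline idea behind softassign, the sketch below turns a cost matrix into a soft one-to-one assignment by exponentiating the negated costs and then alternately normalizing rows and columns (Sinkhorn balancing). It is not the extended one-to-many algorithm from the abstract, and the function name `softassign`, the parameter `beta`, and the toy cost matrix are all assumptions made for illustration.

```python
import math

def softassign(cost, beta=5.0, sweeps=100):
    """Soft one-to-one assignment for a square cost matrix (illustrative sketch).

    Hypothetical helper: beta controls how sharply low costs are preferred.
    """
    n = len(cost)
    # Low cost -> large positive entry; higher beta pushes the result toward 0/1.
    m = [[math.exp(-beta * cost[i][j]) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        # Normalize each row so it sums to one.
        for i in range(n):
            s = sum(m[i])
            m[i] = [v / s for v in m[i]]
        # Normalize each column so it sums to one.
        for j in range(n):
            s = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= s
    return m

match = softassign([[0.0, 1.0], [1.0, 0.0]])
```

In practice `beta` is usually increased gradually so that the soft matrix hardens into a permutation; handling cell division would additionally require relaxing the one-to-one row or column constraints, which this sketch does not attempt.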
Statistical physics, mixtures of distributions, and the EM algorithm
We show that there are strong relationships between approaches to optimization and learning based on statistical physics or mixtures of experts. In particular, the EM algorithm can be interpreted as converging either to a local maximum of the mixtures model or to a saddle point solution to the statistical physics system. An advantage of the statistical physics approach is that it naturally gives rise to a heuristic continuation method, deterministic annealing, for finding good solutions.
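The link between EM and deterministic annealing can be sketched as an EM loop whose responsibilities are tempered by an inverse temperature that is raised in stages. The example below is a minimal sketch under strong assumptions (one-dimensional data, unit variance, equal mixing weights); the function name `annealed_em_means`, the `betas` schedule, and the small symmetry-breaking nudge are illustrative choices, not details taken from the paper.

```python
import math

def annealed_em_means(xs, k, betas=(0.01, 0.1, 1.0), iters=50):
    """Deterministic-annealing EM for component means (illustrative sketch).

    Assumes 1-D data, unit variance, and equal mixing weights.
    """
    lo, hi = min(xs), max(xs)
    # Deterministic start: spread the means evenly over the data range.
    mus = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    for beta in betas:
        # Tiny nudge so means that merged at high temperature can split again.
        mus = [m + 1e-3 * (j - (k - 1) / 2.0) for j, m in enumerate(mus)]
        for _ in range(iters):
            # E-step: tempered responsibilities, proportional to exp(-beta * d^2 / 2).
            resp = []
            for x in xs:
                logits = [-beta * (x - m) ** 2 / 2.0 for m in mus]
                top = max(logits)
                ws = [math.exp(l - top) for l in logits]
                total = sum(ws)
                resp.append([w / total for w in ws])
            # M-step: each mean becomes the responsibility-weighted average.
            for j in range(k):
                mass = sum(r[j] for r in resp)
                if mass > 0:
                    mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / mass
    return sorted(mus)

means = annealed_em_means([-0.2, 0.0, 0.2, 9.8, 10.0, 10.2], 2)
```

At small `beta` the means collapse toward the overall data mean; as `beta` grows they separate, and at `beta = 1` the update coincides with ordinary EM for this simplified model.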
How Many Dissimilarity/Kernel Self Organizing Map Variants Do We Need?
In many application contexts, data are too rich and too complex to be
represented by numerical vectors. A general approach to extending machine learning
and data mining techniques to such data is to rely on a dissimilarity or on a
kernel that measures how different or similar two objects are. This approach
has been used to define several variants of the Self Organizing Map (SOM). This
paper reviews those variants using a common set of notations in order to
outline differences and similarities between them. It discusses the advantages
and drawbacks of the variants, as well as the actual relevance of the
dissimilarity/kernel SOM for practical applications.