6,633 research outputs found

    Partitioning Relational Matrices of Similarities or Dissimilarities using the Value of Information

    In this paper, we provide an approach to clustering relational matrices whose entries correspond to either similarities or dissimilarities between objects. Our approach is based on the value of information, a parameterized, information-theoretic criterion that measures the change in costs associated with changes in information. Optimizing the value of information yields a deterministic-annealing style of clustering with many benefits. For instance, investigators need not specify the number of clusters a priori: the partitions naturally undergo phase changes during the annealing process, so the number of clusters changes in a data-driven fashion. The global-best partition can also often be identified. Comment: Submitted to the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP

    Tracking Cell Signals in Fluorescent Images

    In this paper we present techniques for tracking cell signals in GFP (Green Fluorescent Protein) images of growing cell colonies. We use such tracking for both data extraction and dynamic modeling of intracellular processes. The techniques are based on the optimization of energy functions, which simultaneously determines cell correspondences and estimates the mapping functions. In addition to spatial mappings such as affine and thin-plate spline mappings, the cell growth and cell division histories must be estimated as well. Different levels of joint optimization are discussed. The most unusual tracking feature addressed in this paper is the possibility of one-to-two correspondences caused by cell division. A novel extended softassign algorithm for solving one-to-many correspondences is detailed in this paper. The techniques are demonstrated on three sets of data: growing Bacillus subtilis and E. coli colonies and a developing plant shoot apical meristem. The techniques are currently used by biologists for data extraction and hypothesis formation.

    Statistical physics, mixtures of distributions, and the EM algorithm

    We show that there are strong relationships between approaches to optimization and learning based on statistical physics or mixtures of experts. In particular, the EM algorithm can be interpreted as converging either to a local maximum of the mixtures model or to a saddle point solution to the statistical physics system. An advantage of the statistical physics approach is that it naturally gives rise to a heuristic continuation method, deterministic annealing, for finding good solutions.

    How Many Dissimilarity/Kernel Self Organizing Map Variants Do We Need?

    In numerous application contexts, data are too rich and too complex to be represented by numerical vectors. A general approach to extending machine learning and data mining techniques to such data is to rely on a dissimilarity or on a kernel that measures how different or similar two objects are. This approach has been used to define several variants of the Self Organizing Map (SOM). This paper reviews those variants using a common set of notations in order to outline the differences and similarities between them. It discusses the advantages and drawbacks of the variants, as well as the actual relevance of the dissimilarity/kernel SOM for practical applications.

    Methods for fast and reliable clustering
