
    SOM-VAE: Interpretable Discrete Representation Learning on Time Series

    High-dimensional time series are common in many domains. Since human cognition is not optimized to work well in high-dimensional spaces, these areas could benefit from interpretable low-dimensional representations. However, most representation learning algorithms for time series data are difficult to interpret. This is due to non-intuitive mappings from data features to salient properties of the representation and non-smoothness over time. To address this problem, we propose a new representation learning framework building on ideas from interpretable discrete dimensionality reduction and deep generative modeling. This framework allows us to learn discrete representations of time series, which give rise to smooth and interpretable embeddings with superior clustering performance. We introduce a new way to overcome the non-differentiability in discrete representation learning and present a gradient-based version of the traditional self-organizing map algorithm that is more performant than the original. Furthermore, to allow for a probabilistic interpretation of our method, we integrate a Markov model in the representation space. This model uncovers the temporal transition structure, improves clustering performance even further and provides additional explanatory insights as well as a natural representation of uncertainty. We evaluate our model in terms of clustering performance and interpretability on static (Fashion-)MNIST data, a time series of linearly interpolated (Fashion-)MNIST images, a chaotic Lorenz attractor system with two macro states, as well as on a challenging real-world medical time series application on the eICU data set. Our learned representations compare favorably with competitor methods and facilitate downstream tasks on the real-world data. Comment: Accepted for publication at the Seventh International Conference on Learning Representations (ICLR 2019)

    Winner-relaxing and winner-enhancing Kohonen maps: Maximal mutual information from enhancing the winner

    The magnification behaviour of a generalized family of self-organizing feature maps, the Winner Relaxing and Winner Enhancing Kohonen algorithms, is analyzed by the magnification law in the one-dimensional case, which can be obtained analytically. The Winner-Enhancing case makes it possible to achieve a magnification exponent of one and therefore provides optimal mapping in the sense of information theory. A numerical verification of the magnification law is included, and the ordering behaviour is analyzed. Compared to the original Self-Organizing Map and some other approaches, the generalized Winner Enforcing Algorithm requires minimal extra computations per learning step and is conveniently easy to implement. Comment: 6 pages, 5 figures. For an extended version refer to cond-mat/0208414 (Neural Computation 17, 996-1009)

    Winner-Relaxing Self-Organizing Maps

    A new family of self-organizing maps, the Winner-Relaxing Kohonen Algorithm, is introduced as a generalization of a variant given by Kohonen in 1991. The magnification behaviour is calculated analytically. For the original variant a magnification exponent of 4/7 is derived; the generalized version allows the magnification to be steered over the wide range from exponent 1/2 to 1 in the one-dimensional case, and thus provides optimal mapping in the sense of information theory. The Winner Relaxing Algorithm requires minimal extra computations per learning step and is conveniently easy to implement. Comment: 14 pages (6 figs included). To appear in Neural Computation

    Probabilistic estimation of microarray data reliability and underlying gene expression

    Background: The availability of high throughput methods for measurement of mRNA concentrations makes the reliability of conclusions drawn from the data and global quality control of samples and hybridization important issues. We address these issues by an information theoretic approach, applied to discretized expression values in replicated gene expression data. Results: Our approach yields a quantitative measure of two important parameter classes: First, the probability P(σ | S) that a gene is in the biological state σ in a certain variety, given its observed expression S in the samples of that variety. Second, sample specific error probabilities which serve as consistency indicators of the measured samples of each variety. The method and its limitations are tested on gene expression data for developing murine B-cells and a t-test is used as reference. On a set of known genes it performs better than the t-test despite the crude discretization into only two expression levels. The consistency indicators, i.e. the error probabilities, correlate well with variations in the biological material and thus prove efficient. Conclusions: The proposed method is effective in determining differential gene expression and sample reliability in replicated microarray data. Already at two discrete expression levels in each sample, it gives a good explanation of the data and is comparable to standard techniques. Comment: 11 pages, 4 figures

    Background modeling by shifted tilings of stacked denoising autoencoders

    The effective processing of visual data without interruption is currently of supreme importance. For that purpose, the analysis system must adapt to events that may affect the data quality and maintain its performance level over time. This paper presents a methodology for background modeling and foreground detection whose main characteristic is its robustness against stationary noise. The system is based on a stacked denoising autoencoder which extracts a set of significant features for each patch of several shifted tilings of the video frame. A probabilistic model is learned for each patch. All the distinct patches that include a particular pixel are considered when classifying that pixel. The experiments show that classical methods from the literature suffer drastic performance drops when noise is present in the video sequences, whereas the proposed one is only slightly affected. This corroborates the robustness of our proposal, in addition to its usefulness for the processing and analysis of continuous data during uninterrupted periods of time. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Minimum Energy Information Fusion in Sensor Networks

    In this paper we consider how to organize the sharing of information in a distributed network of sensors and data processors so as to provide explanations for sensor readings with minimal expenditure of energy. We point out that the Minimum Description Length principle provides an approach to information fusion that is more naturally suited to energy minimization than traditional Bayesian approaches. In addition we show that for networks consisting of a large number of identical sensors Kohonen self-organization provides an exact solution to the problem of combining the sensor outputs into minimal description length explanations. Comment: postscript, 8 pages. Paper 65 in Proceedings of The 2nd International Conference on Information Fusion

    Measuring concept similarities in multimedia ontologies: analysis and evaluations

    The recent development of large-scale multimedia concept ontologies has provided new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving the visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing.