Information-theoretic analysis of multivariate single-cell signaling responses using SLEMI
Mathematical methods of information theory constitute essential tools to
describe how stimuli are encoded in activities of signaling effectors.
Exploring the information-theoretic perspective, however, remains conceptually,
experimentally and computationally challenging. Specifically, existing
computational tools enable efficient analysis of relatively simple systems,
usually with one input and output only. Moreover, their robust and readily
applicable implementations are missing. Here, we propose a novel algorithm to
analyze signaling data within the framework of information theory. Our approach
enables robust as well as statistically and computationally efficient analysis
of signaling systems with high-dimensional outputs and a large number of input
values. Analysis of NF-κB single-cell signaling responses to TNF-α
uniquely reveals that NF-κB signaling dynamics improve discrimination of
high concentrations of TNF-α with only a modest impact on discrimination of low
concentrations. Our readily applicable R package, SLEMI (Statistical Learning
based Estimation of Mutual Information), allows the approach to be used by
computational biologists with only elementary knowledge of information theory.
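The core idea, estimating the mutual information between a discrete stimulus and a high-dimensional response by learning the conditional input distribution with a classifier, can be sketched in a few lines. The snippet below is a minimal NumPy illustration of that idea, not the SLEMI package itself; the tiny gradient-descent logistic regression and all names are our own assumptions.

```python
import numpy as np

def fit_posterior(Y, x, lr=0.2, steps=500):
    """Tiny binary logistic regression via gradient descent; returns the
    learned posterior P(x = 1 | y) for each row of Y. (A stand-in for the
    regularized classifiers a real analysis would use.)"""
    Yb = np.hstack([Y, np.ones((len(Y), 1))])  # append a bias column
    w = np.zeros(Yb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Yb @ w))
        w -= lr * Yb.T @ (p - x) / len(x)
    return 1.0 / (1.0 + np.exp(-Yb @ w))

def mi_bits(x, Y):
    """Plug-in estimate of I(X; Y) in bits for a binary input x: the
    average log-ratio of the learned posterior to the empirical prior."""
    p1 = fit_posterior(Y, x)
    post = np.clip(np.where(x == 1, p1, 1.0 - p1), 1e-12, None)
    prior = np.where(x == 1, x.mean(), 1.0 - x.mean())
    return float(np.mean(np.log2(post / prior)))

# Toy "signaling" data: two stimulus levels, 2-D responses with shifted means.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, 400)
Y = rng.normal(0.0, 1.0, (400, 2)) + 3.0 * x[:, None]
mi = mi_bits(x, Y)  # should approach 1 bit as the responses separate
```

For well-separated responses the estimate approaches the 1 bit carried by a binary stimulus; overlapping responses pull it toward zero.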
Ensemble estimation of multivariate f-divergence
f-divergence estimation is an important problem in the fields of information
theory, machine learning, and statistics. While several divergence estimators
exist, relatively few of their convergence rates are known. We derive the MSE
convergence rate for a density plug-in estimator of f-divergence. Then by
applying the theory of optimally weighted ensemble estimation, we derive a
divergence estimator with a convergence rate of O(1/T) that is simple to
implement and performs well in high dimensions. We validate our theoretical
results with experiments.
Comment: 14 pages, 6 figures. A condensed version of this paper was accepted
to ISIT 2014. Version 2: moved the proofs of the theorems from the main body
to appendices at the end.
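The two ingredients can be made concrete with a hedged sketch: a simple plug-in divergence estimator (here a k-NN KL estimator in the Wang–Kulkarni–Verdú style, standing in for the paper's KDE plug-in estimators) and closed-form minimum-norm ensemble weights that cancel the leading bias terms. The basis functions l**(i/d) and all names are illustrative assumptions, not the paper's exact optimization.

```python
import numpy as np

def knn_kl(X, Y, k=3):
    """k-NN plug-in estimate of KL(P || Q) from samples X ~ P, Y ~ Q
    (brute-force distances; a stand-in for a KDE plug-in estimator)."""
    n, d = X.shape
    m = Y.shape[0]
    dxx = np.linalg.norm(X[:, None] - X[None], axis=-1)
    np.fill_diagonal(dxx, np.inf)
    rho = np.sort(dxx, axis=1)[:, k - 1]      # k-th NN distance within X
    dxy = np.linalg.norm(X[:, None] - Y[None], axis=-1)
    nu = np.sort(dxy, axis=1)[:, k - 1]       # k-th NN distance into Y
    return d * float(np.mean(np.log(nu / rho))) + np.log(m / (n - 1))

def ensemble_weights(ks, d):
    """Minimum-norm weights w with sum(w) = 1 and sum_l w_l * l**(i/d) = 0
    for i = 1..d-1, so leading bias terms of the base estimators cancel."""
    L = np.asarray(ks, float)
    A = np.vstack([np.ones_like(L)] + [L ** (i / d) for i in range(1, d)])
    b = np.zeros(A.shape[0]); b[0] = 1.0
    return A.T @ np.linalg.solve(A @ A.T, b)  # KKT solution of the QP

# Weighted ensemble of base estimators over several k values.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, (500, 2))
Y = rng.normal(0.0, 1.0, (500, 2)) + np.array([1.0, 0.0])
ks = [2, 3, 4, 5, 6]
w = ensemble_weights(ks, d=2)
est = sum(wi * knn_kl(X, Y, k) for wi, k in zip(w, ks))
```

Because the weights sum to one while zeroing the bias basis, the combination keeps the base estimators' mean while cancelling their slowest-decaying bias terms, which is what buys the parametric O(1/T) MSE rate.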
Finite-Sample Analysis of Fixed-k Nearest Neighbor Density Functional Estimators
We provide finite-sample analysis of a general framework for using k-nearest
neighbor statistics to estimate functionals of a nonparametric continuous
probability density, including entropies and divergences. Rather than plugging
a consistent density estimate (which requires k → ∞ as the sample size
n → ∞) into the functional of interest, the estimators we consider fix
k and perform a bias correction. This is more efficient computationally, and,
as we show in certain cases, statistically, leading to faster convergence
rates. Our framework unifies several previous estimators, and for most of them
ours are the first finite-sample guarantees.
Comment: 16 pages, 0 figures
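The canonical member of this family is the Kozachenko–Leonenko estimator: fix k, measure the distance ε_i from each point to its k-th nearest neighbor, and correct the bias with digamma terms. A minimal NumPy sketch (brute-force distances; ψ(n) − ψ(k) is computed via the harmonic-sum identity for integer arguments):

```python
import math
import numpy as np

def kl_entropy(X, k=3):
    """Fixed-k Kozachenko-Leonenko entropy estimate (nats):
    H_hat = psi(n) - psi(k) + log(c_d) + (d/n) * sum_i log(eps_i),
    where eps_i is the distance from X[i] to its k-th nearest neighbor
    and c_d is the volume of the d-dimensional unit ball."""
    n, d = X.shape
    dist = np.linalg.norm(X[:, None] - X[None], axis=-1)
    np.fill_diagonal(dist, np.inf)
    eps = np.sort(dist, axis=1)[:, k - 1]
    log_cd = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    psi_diff = sum(1.0 / j for j in range(k, n))  # psi(n) - psi(k), integer args
    return psi_diff + log_cd + d * float(np.mean(np.log(eps)))

# Sanity check: a standard 1-D Gaussian has H = 0.5 * log(2*pi*e) ~ 1.419 nats.
rng = np.random.default_rng(0)
est = kl_entropy(rng.normal(0.0, 1.0, (1000, 1)))
```

Note that k stays fixed as n grows: the ψ(n) − ψ(k) term is exactly the bias correction that lets the estimator converge without a consistent density estimate.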
Big Variates: Visualizing and identifying key variables in a multivariate world
Big Data involves not only a large number of events but also many variables. This
paper will concentrate on the challenge presented by the large number of
variables in a Big Dataset. It will start with a brief review of exploratory
data visualisation for large dimensional datasets and the use of parallel
coordinates. This motivates the use of information theoretic ideas to
understand multivariate data. Two key information-theoretic statistics
(Similarity Index and Class Distance Indicator) will be described which are
used to identify the key variables and then guide the user in a subsequent
machine learning analysis. Key to the approach is a novel algorithm to
histogram data which quantifies the information content of the data. The Class
Distance Indicator also sets a limit on the classification performance of
machine learning algorithms for the specific dataset.
Comment: 16 pages, 7 figures. Pre-print from a talk at ULITIMA 2018, Argonne
National Laboratory, 11-14 September 2018
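The exact definitions of the Similarity Index and Class Distance Indicator are not given here, so the sketch below uses a plain histogram estimate of the mutual information between each variable and the class label, the generic version of this style of information-theoretic variable ranking; function names and the bin count are illustrative assumptions.

```python
import numpy as np

def hist_mi(x, labels, bins=16):
    """Histogram (plug-in) estimate, in bits, of the mutual information
    between one variable and a class label. A generic stand-in for the
    paper's Similarity Index / Class Distance Indicator."""
    classes = np.unique(labels)
    edges = np.histogram_bin_edges(x, bins=bins)
    bx = np.clip(np.digitize(x, edges[1:-1]), 0, bins - 1)
    joint = np.zeros((bins, len(classes)))
    for c, lab in enumerate(classes):
        joint[:, c] = np.bincount(bx[labels == lab], minlength=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal over bins
    pc = p.sum(axis=0, keepdims=True)   # marginal over classes
    mask = p > 0
    return float((p[mask] * np.log2(p[mask] / (px @ pc)[mask])).sum())

# Rank two variables: one shifted by the class label, one pure noise.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 600)
informative = 2.0 * labels + rng.normal(0.0, 1.0, 600)
noise = rng.normal(0.0, 1.0, 600)
```

A variable carrying class information scores well above a noise variable, which is the property a ranking statistic needs before handing the shortlist to a downstream learner.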
Measuring the Discrepancy between Conditional Distributions: Methods, Properties and Applications
We propose a simple yet powerful test statistic to quantify the discrepancy
between two conditional distributions. The new statistic avoids explicit
estimation of the underlying distributions in high-dimensional space and
operates on the cone of symmetric positive semidefinite (SPS) matrices using the
Bregman matrix divergence. Moreover, it inherits the merits of the correntropy
function to explicitly incorporate high-order statistics in the data. We
present the properties of our new statistic and illustrate its connections to
prior art. We finally show the applications of our new statistic on three
different machine learning problems, namely the multi-task learning over
graphs, the concept drift detection, and the information-theoretic feature
selection, to demonstrate its utility and advantage. Code of our statistic is
available at https://bit.ly/BregmanCorrentropy.
Comment: manuscript accepted at IJCAI 20; added additional notes on
computational complexity and the auto-differentiable property; code is available
at https://github.com/SJYuCNEL/Bregman-Correntropy-Conditional-Divergenc
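Two building blocks of such a statistic can be sketched directly: a Gaussian-kernel (correntropy-style) Gram matrix, and the Bregman matrix divergence generated by the von Neumann entropy, evaluated on trace-normalized SPS matrices. This is a minimal illustration of those ingredients only, not the paper's full conditional-discrepancy test; the kernel width, eigenvalue floor, and toy comparison are our assumptions.

```python
import numpy as np

def gram(y, sigma=1.0):
    """Trace-normalized Gaussian-kernel Gram matrix of a 1-D sample."""
    d2 = (y[:, None] - y[None]) ** 2
    G = np.exp(-d2 / (2.0 * sigma ** 2))
    return G / np.trace(G)

def von_neumann_div(A, B, floor=1e-10):
    """Bregman matrix divergence generated by the von Neumann entropy:
    D(A || B) = tr(A log A - A log B - A + B), for SPS matrices A, B."""
    def logm(M):
        w, V = np.linalg.eigh(M)
        return V @ np.diag(np.log(np.clip(w, floor, None))) @ V.T
    return float(np.trace(A @ logm(A) - A @ logm(B) - A + B))

# Gram matrices of a sample, a slightly perturbed copy, and an unrelated sample.
rng = np.random.default_rng(0)
y = rng.normal(0.0, 1.0, 80)
A = gram(y)
B_close = gram(y + 0.01 * rng.normal(0.0, 1.0, 80))
B_far = gram(rng.normal(0.0, 3.0, 80))
```

Working on Gram matrices sidesteps density estimation entirely: the divergence compares spectra of kernel matrices rather than estimated densities, which is what makes the statistic usable in high-dimensional spaces.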
K nearest neighbor equality: giving equal chance to all existing classes
The nearest neighbor classification method assigns an unclassified point to the class of the nearest case in a set of previously classified points. This rule is independent of the underlying joint distribution of the sample points and their classifications. An extension of this approach is the k-NN method, in which the classification of the unclassified point is made by a voting criterion among the k nearest points. The method we present here extends the k-NN idea: it searches in each class for the k points nearest to the unclassified point, and classifies it in the class which minimizes the mean distance between the unclassified point and those k nearest points. As all classes can take part in the final selection process, we have called the new approach k Nearest Neighbor Equality (k-NNE). The experimental results we obtained show the suitability of the k-NNE algorithm, and its effectiveness suggests that it could be added to the current list of distance-based classifiers.
This work has been supported by the Basque Country University and by the Basque Government under the research team grant program.
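The classification rule described above is easy to state in code. A minimal sketch with brute-force distances (function and variable names are ours):

```python
import numpy as np

def knne_predict(X_train, y_train, X_test, k=3):
    """k Nearest Neighbor Equality: for every class, take the k training
    points nearest to the query, then assign the class whose k nearest
    points have the smallest mean distance to the query."""
    classes = np.unique(y_train)
    preds = []
    for q in X_test:
        dists = np.linalg.norm(X_train - q, axis=1)
        mean_k = [np.sort(dists[y_train == c])[:k].mean() for c in classes]
        preds.append(classes[int(np.argmin(mean_k))])
    return np.array(preds)

# Two well-separated 2-D clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(4.0, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
queries = np.array([[0.2, -0.1], [3.8, 4.1]])
preds = knne_predict(X, y, queries)  # -> [0 1]
```

Unlike plain k-NN voting, every class contributes its own k nearest points, so a minority class is never outvoted before it is even compared.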
k-Nearest Neighbor Based Consistent Entropy Estimation for Hyperspherical Distributions
A consistent entropy estimator for hyperspherical data is proposed based on the k-nearest neighbor (knn) approach. The asymptotic unbiasedness and consistency of the estimator are proved. Moreover, cross-entropy and Kullback-Leibler (KL) divergence estimators are also discussed. Simulation studies are conducted to assess the performance of the estimators for models including uniform and von Mises-Fisher distributions. The proposed knn entropy estimator is compared with its moment-based counterpart via simulations. The results show that the two methods are comparable.
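As a one-dimensional illustration of the idea (not the paper's estimator for general hyperspheres), the fixed-k construction carries over to the unit circle by replacing Euclidean distance with geodesic (arc-length) distance; a ball of radius ε on the circle is an arc of length 2ε. For angles drawn uniformly on [0, 2π), the true entropy is log(2π) ≈ 1.838 nats.

```python
import math
import numpy as np

def circle_entropy_knn(theta, k=3):
    """k-NN entropy estimate (nats) for angles on the unit circle, using
    geodesic distance; a ball of radius eps is an arc of length 2*eps."""
    n = len(theta)
    diff = np.abs(theta[:, None] - theta[None])
    dist = np.minimum(diff, 2.0 * math.pi - diff)  # arc-length distance
    np.fill_diagonal(dist, np.inf)
    eps = np.sort(dist, axis=1)[:, k - 1]
    psi_diff = sum(1.0 / j for j in range(k, n))   # psi(n) - psi(k)
    return psi_diff + float(np.mean(np.log(2.0 * eps)))

rng = np.random.default_rng(0)
est = circle_entropy_knn(rng.uniform(0.0, 2.0 * math.pi, 1000))
```

Because the circle is locally a line and has no boundary, the standard fixed-k bias correction applies unchanged; only the distance and the ball volume are adapted to the manifold.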