Max-Sliced Mutual Information

Authors: Goldfeld Ziv, Greenewald Kristjan, Tsur Dor
Publication date: 28/09/2023

Quantifying the dependence between high-dimensional random variables is central to statistical learning and inference. Two classical methods are canonical correlation analysis (CCA), which identifies maximally correlated projected versions of the original variables, and Shannon's mutual information, which is a universal dependence measure that also captures high-order dependencies. However, CCA only accounts for linear dependence, which may be insufficient for certain applications, while mutual information is often infeasible to compute/estimate in high dimensions. This work proposes a middle ground in the form of a scalable information-theoretic generalization of CCA, termed max-sliced mutual information (mSMI). mSMI equals the maximal mutual information between low-dimensional projections of the high-dimensional variables, which reduces back to CCA in the Gaussian case. It enjoys the best of both worlds: capturing intricate dependencies in the data while being amenable to fast computation and scalable estimation from samples. We show that mSMI retains favorable structural properties of Shannon's mutual information, like variational forms and identification of independence. We then study statistical estimation of mSMI, propose an efficiently computable neural estimator, and couple it with formal non-asymptotic error bounds. We present experiments that demonstrate the utility of mSMI for several tasks, encompassing independence testing, multi-view representation learning, algorithmic fairness, and generative modeling. We observe that mSMI consistently outperforms competing methods with little-to-no computational overhead.

Comment: Accepted at NeurIPS 202

arXiv.org e-Print Archive
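
The abstract of "Max-Sliced Mutual Information" states that mSMI reduces to CCA in the Gaussian case. The sketch below illustrates only that special case, assuming one-dimensional projections: for jointly Gaussian scalars with correlation rho, the mutual information is -0.5 * log(1 - rho^2), so the maximizing projections are the top canonical directions. It is not the neural estimator proposed in the paper, and the function names (`top_canonical_correlation`, `gaussian_projection_mi`, `make_gaussian_pair`) are hypothetical, chosen for this illustration.

```python
import numpy as np


def top_canonical_correlation(x, y, ridge=1e-8):
    """Largest sample canonical correlation between the columns of x and y."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    # Sample covariance blocks; the small ridge keeps Cholesky stable.
    cxx = x.T @ x / (n - 1) + ridge * np.eye(x.shape[1])
    cyy = y.T @ y / (n - 1) + ridge * np.eye(y.shape[1])
    cxy = x.T @ y / (n - 1)
    lx = np.linalg.cholesky(cxx)
    ly = np.linalg.cholesky(cyy)
    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}.
    m = np.linalg.solve(lx, cxy)
    m = np.linalg.solve(ly, m.T).T
    # Singular values of M are the canonical correlations.
    rho = np.linalg.svd(m, compute_uv=False)[0]
    return float(min(rho, 1.0 - 1e-12))


def gaussian_projection_mi(rho):
    """Mutual information (nats) of two jointly Gaussian scalars with correlation rho."""
    return float(-0.5 * np.log1p(-rho ** 2))


def make_gaussian_pair(n, dim, r, seed=0):
    """Synthetic data (an assumption for this demo): only the first coordinates are correlated."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, dim))
    y = rng.standard_normal((n, dim))
    y[:, 0] = r * x[:, 0] + np.sqrt(1.0 - r ** 2) * y[:, 0]
    return x, y


if __name__ == "__main__":
    x, y = make_gaussian_pair(n=20000, dim=3, r=0.8)
    rho = top_canonical_correlation(x, y)
    print("top canonical correlation:", rho)
    print("Gaussian MI of best 1-D projections (nats):", gaussian_projection_mi(rho))
```

For non-Gaussian data this closed form no longer applies, which is the setting where an estimator of the kind described in the abstract would be needed.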

New computational and statistical characterizations of neural network learning

Author: Gollakota Aravind
Publication date: 13/01/2024

A foundational goal of machine learning theory is to characterize the inherent computational and statistical complexity of some of the most basic tasks in machine learning. In this thesis, we present new results concerning two such tasks in neural network learning and beyond. First, we study the question of when efficient algorithms can achieve high test accuracy on labeled data known to be consistent with a simple neural network. We present a set of results establishing the surprising computational intractability of this problem even in the benign setting where the inputs are drawn from a Gaussian, and the labels are perfectly consistent with a simple two-hidden-layer or even one-hidden-layer neural network. These hardness results illuminate what types of problem assumptions are necessary for efficient algorithms for this problem to be possible at all. Next, we investigate the problem of testing whether a learning algorithm has fit the data as well as its guarantee claims. This is a serious issue for agnostic supervised learning (i.e., supervised learning with no assumptions on the labels), where most efficient algorithms make simplifying distributional assumptions such as Gaussianity. But such assumptions can be hard to verify, meaning it can be hard to check whether the learner has actually succeeded. The recent elegant model of testable learning addresses this issue by replacing such hard-to-verify distributional assumptions with efficiently testable ones. We present both a broad algorithmic framework and a full statistical characterization of this model.

Computer Science

Texas ScholarWorks

Privacy Preserving Domain Adaptation for Semantic Segmentation of Medical Images

Authors: Rostami Mohammad, Stan Serban
Publication date: 10/07/2021

Convolutional neural networks (CNNs) have led to significant improvements in tasks involving semantic segmentation of images. In biomedical image segmentation, however, CNNs are vulnerable to domain shift, which arises from the distributional gap between source and target domains with different data modalities. Domain shift makes data annotations in new modalities necessary because models must be retrained from scratch. Unsupervised domain adaptation (UDA) has been proposed to adapt a model to new modalities using solely unlabeled target domain data. Common UDA algorithms require access to data points in the source domain, which may not be feasible in medical imaging due to privacy concerns. In this work, we develop an algorithm for UDA in a privacy-constrained setting, where the source domain data is inaccessible. Our idea is based on encoding the information from the source samples into a prototypical distribution that is used as an intermediate distribution for aligning the target domain distribution with the source domain distribution. We demonstrate the effectiveness of our algorithm by comparing it to state-of-the-art medical image semantic segmentation approaches on two medical image semantic segmentation datasets.

arXiv.org e-Print Archive