Computing the Unique Information
Given a pair of predictor variables and a response variable, how much
information do the predictors have about the response, and how is this
information distributed between unique, redundant, and synergistic components?
Recent work has proposed to quantify the unique component of the decomposition
as the minimum value of the conditional mutual information over a constrained
set of information channels. We present an efficient iterative divergence
minimization algorithm to solve this optimization problem with convergence
guarantees and evaluate its performance against other techniques.Comment: To appear in 2018 IEEE International Symposium on Information Theory
(ISIT); 18 pages; 4 figures, 1 Table; Github link to source code:
https://github.com/infodeco/computeU
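The quantity being minimized over the constrained set of channels is a conditional mutual information. As an illustration only (this is not the paper's iterative algorithm), a minimal sketch of computing I(X;Y|Z) from a discrete joint distribution, the objective such an optimization evaluates at each step:

```python
import numpy as np

def conditional_mutual_information(p):
    """I(X;Y|Z) in bits for a joint pmf p[x, y, z]."""
    p = p / p.sum()
    p_z = p.sum(axis=(0, 1))   # p(z)
    p_xz = p.sum(axis=1)       # p(x, z)
    p_yz = p.sum(axis=0)       # p(y, z)
    cmi = 0.0
    for x in range(p.shape[0]):
        for y in range(p.shape[1]):
            for z in range(p.shape[2]):
                if p[x, y, z] > 0:
                    cmi += p[x, y, z] * np.log2(
                        p[x, y, z] * p_z[z] / (p_xz[x, z] * p_yz[y, z])
                    )
    return cmi

# XOR example: Z = X xor Y with independent uniform X, Y.
# X and Y are marginally independent, yet coupled given Z.
p = np.zeros((2, 2, 2))
for x in (0, 1):
    for y in (0, 1):
        p[x, y, x ^ y] = 0.25
print(conditional_mutual_information(p))  # 1.0 bit
```

The XOR joint is the standard example of purely synergistic information, which is why the conditional mutual information is nonzero even though X and Y are independent.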
Causal Graphs Underlying Generative Models: Path to Learning with Limited Data
Training generative models that capture rich semantics of the data and
interpreting the latent representations encoded by such models are very
important problems in unsupervised learning. In this work, we provide a simple
algorithm that relies on perturbation experiments on latent codes of a
pre-trained generative autoencoder to uncover a causal graph that is implied by
the generative model. We leverage pre-trained attribute classifiers and perform
perturbation experiments to check for influence of a given latent variable on a
subset of attributes. Given this, we show that one can fit an effective causal
graph that models a structural equation model between latent codes taken as
exogenous variables and attributes taken as observed variables. One interesting
aspect is that a single latent variable controls multiple overlapping subsets
of attributes, unlike conventional approaches that try to impose full
independence. Using a pre-trained RNN-based generative autoencoder trained on a
dataset of peptide sequences, we demonstrate that the learnt causal graph from
our algorithm between various attributes and latent codes can be used to
predict a specific property for unseen sequences. We compare
prediction models trained on either all available attributes or only the ones
in the Markov blanket, and empirically show that in both the unsupervised and
supervised regimes the predictor that relies on Markov blanket attributes
typically generalizes better for out-of-distribution sequences.
Unique Informations and Deficiencies
Given two channels that convey information about the same random variable, we
introduce two measures of the unique information of one channel with respect to
the other. The two quantities are based on the notion of generalized weighted
Le Cam deficiencies and differ on whether one channel can approximate the other
by a randomization at either its input or output. We relate the proposed
quantities to an existing measure of unique information which we call the
minimum-synergy unique information. We give an operational interpretation of
the latter in terms of an upper bound on the one-way secret key rate and
discuss the role of the unique informations in the context of nonnegative
mutual information decompositions into unique, redundant and synergistic
components.
Comment: 13 pages, 2 figures. The material in this manuscript was presented at
the 56th Annual Allerton Conference on Communication, Control, and Computing,
2018. This manuscript contains some corrections: most notably, Lemma 18 was
removed and Proposition 28 was corrected. The numbering of equations and
results in this version agrees with the numbering of the published version.
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
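Two of the challenges named above, missing data and data heterogeneity, already appear in the simplest integration strategy. As a hedged sketch (not any method from the review): early, concatenation-based integration with per-feature median imputation and per-block standardization so that no modality dominates by scale alone:

```python
import numpy as np

def integrate(blocks):
    """Early (concatenation-based) multi-omics integration:
    impute missing values with per-feature medians, z-score each
    block so no modality dominates by scale, then concatenate."""
    out = []
    for X in blocks:
        X = X.astype(float).copy()
        med = np.nanmedian(X, axis=0)          # per-feature medians
        rows, cols = np.where(np.isnan(X))
        X[rows, cols] = np.take(med, cols)     # fill missing entries
        mu, sd = X.mean(axis=0), X.std(axis=0)
        out.append((X - mu) / np.where(sd == 0, 1.0, sd))
    return np.hstack(out)

# Toy "omics" blocks for three samples, with missing entries.
genome = np.array([[1.0, np.nan], [2.0, 4.0], [3.0, 6.0]])
proteome = np.array([[100.0], [200.0], [np.nan]])
X = integrate([genome, proteome])
print(X.shape)  # (3, 3)
```

Real multi-omics pipelines use far more careful imputation and fusion, but this makes concrete why heterogeneity (different scales and feature counts per modality) must be handled before any joint model is fit.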
ENInst: Enhancing Weakly-supervised Low-shot Instance Segmentation
We address weakly-supervised low-shot instance segmentation, an
annotation-efficient training setting for dealing with novel classes effectively.
Since it is an under-explored problem, we first investigate the difficulty of
the problem and identify the performance bottleneck by conducting systematic
analyses of model components and individual sub-tasks with a simple baseline
model. Based on the analyses, we propose ENInst with sub-task enhancement
methods: instance-wise mask refinement for enhancing pixel localization quality
and novel classifier composition for improving classification accuracy. Our
proposed method lifts the overall performance by enhancing the performance of
each sub-task. We demonstrate that our ENInst is 7.5 times more efficient in
achieving performance comparable to existing fully-supervised few-shot
models, and even outperforms them at times.
Comment: Accepted at Pattern Recognition (PR).
Target-oriented Domain Adaptation for Infrared Image Super-Resolution
Recent efforts have explored leveraging visible light images to enrich
texture details in infrared (IR) super-resolution. However, this direct
adaptation approach often becomes a double-edged sword, as it improves texture
at the cost of introducing noise and blurring artifacts. To address these
challenges, we propose the Target-oriented Domain Adaptation SRGAN (DASRGAN),
an innovative framework specifically engineered for robust IR super-resolution
model adaptation. DASRGAN operates on the synergy of two key components: 1)
Texture-Oriented Adaptation (TOA) to refine texture details meticulously, and
2) Noise-Oriented Adaptation (NOA), dedicated to minimizing noise transfer.
Specifically, TOA uniquely integrates a specialized discriminator,
incorporating a prior extraction branch, and employs a Sobel-guided adversarial
loss to align texture distributions effectively. Concurrently, NOA utilizes a
noise adversarial loss to distinctly separate the generative and Gaussian noise
pattern distributions during adversarial training. Our extensive experiments
confirm DASRGAN's superiority. Comparative analyses against leading methods
across multiple benchmarks and upsampling factors reveal that DASRGAN sets new
state-of-the-art performance standards. Code is available at
\url{https://github.com/yongsongH/DASRGAN}.
Comment: 11 pages, 9 figures.
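The Sobel-guided adversarial loss aligns texture distributions via image gradients. A minimal sketch of the Sobel ingredient only (the adversarial machinery is omitted, and `edge_l1_loss` is a hypothetical name): an L1 distance between Sobel edge maps of a super-resolved image and its reference:

```python
import numpy as np

# Sobel kernels for horizontal and vertical gradients.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
KY = KX.T

def conv2d(img, k):
    """'Valid' 2-D correlation of a single-channel image with a 3x3 kernel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = (img[i:i + 3, j:j + 3] * k).sum()
    return out

def sobel_edges(img):
    """Gradient-magnitude edge map."""
    return np.hypot(conv2d(img, KX), conv2d(img, KY))

def edge_l1_loss(sr, hr):
    """L1 distance between Sobel edge maps -- a texture-alignment
    term in the spirit of a Sobel-guided loss."""
    return np.abs(sobel_edges(sr) - sobel_edges(hr)).mean()

hr = np.zeros((8, 8)); hr[:, 4:] = 1.0          # sharp vertical edge
blurry = np.zeros((8, 8)); blurry[:, 3:6] = 0.5; blurry[:, 6:] = 1.0
print(edge_l1_loss(hr, hr))          # 0.0 -- identical edge maps
print(edge_l1_loss(blurry, hr) > 0)  # True -- blur weakens the edge response
```

Operating on gradients rather than raw pixels is what lets such a term reward sharp texture without also rewarding the noise that direct visible-to-IR adaptation tends to transfer.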
Anomaly Detection with Complex Data Structures
Identifying anomalies with complex patterns differs from the conventional anomaly detection problem. First, in cross-modal anomaly detection, a large portion of data instances in a multi-modal context are not anomalous when viewed separately in each modality, but present abnormal patterns or behaviors when multiple sources of information are jointly considered and analyzed. Second, in attributed-network anomaly detection, the definition of anomaly becomes more complicated and obscure: apart from anomalous nodes whose attributes differ markedly from those of the majority of reference nodes from a global perspective, nodes whose attributes deviate remarkably from those of their communities are also considered anomalies. Third, given a specific task and data structure, building a suitable, high-quality deep learning-based outlier detection system still relies heavily on human expertise and laborious trials, so it is also necessary to automatically search for suitable outlier detection models for different tasks. In this dissertation, we make a series of contributions that enable advanced anomaly detection techniques for complex data structures and discuss how to automatically design anomaly detection frameworks for various data structures.
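The cross-modal phenomenon described first, instances that look normal per modality but anomalous jointly, can be demonstrated with a deliberately simple detector. This is an illustrative Mahalanobis-distance sketch, not a method from the dissertation; the two correlated columns stand in for two modalities:

```python
import numpy as np

rng = np.random.default_rng(1)

# "Normal" training data: two modalities that are strongly correlated.
x = rng.normal(size=500)
train = np.column_stack([x, x + 0.1 * rng.normal(size=500)])

mu = train.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(train, rowvar=False))

def mahalanobis(v):
    """Distance from the joint training distribution."""
    d = v - mu
    return float(np.sqrt(d @ cov_inv @ d))

normal_pt = np.array([1.0, 1.0])     # consistent across modalities
cross_modal = np.array([1.0, -1.0])  # each coordinate is marginally typical,
                                     # but the pair violates the joint structure
# The cross-modal point scores far higher than the consistent one,
# even though neither coordinate is extreme on its own.
print(mahalanobis(normal_pt), mahalanobis(cross_modal))
```

A per-modality detector (thresholding each coordinate separately) would pass `cross_modal`, which is exactly why the joint view is needed for this class of anomalies.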