NSOTree: Neural Survival Oblique Tree
Survival analysis is a statistical method employed to scrutinize the duration
until a specific event of interest transpires, known as time-to-event
information characterized by censorship. Recently, deep learning-based methods
have dominated this field due to their representational capacity and
state-of-the-art performance. However, the black-box nature of the deep neural
network hinders its interpretability, which is desired in real-world survival
applications but has been largely neglected by previous works. In contrast,
conventional tree-based methods are advantageous with respect to
interpretability, while consistently grappling with an inability to approximate
the global optima due to greedy expansion. In this paper, we leverage the
strengths of both neural networks and tree-based methods, capitalizing on their
ability to approximate intricate functions while maintaining interpretability.
To this end, we propose a Neural Survival Oblique Tree (NSOTree) for survival
analysis. Specifically, the NSOTree was derived from the ReLU network and can
be easily incorporated into existing survival models in a plug-and-play
fashion. Evaluations on both simulated and real survival datasets demonstrated
the effectiveness of the proposed method in terms of performance and
interpretability.
Comment: 12 pages
Data-Driven Approaches to Solve Inverse Problems
The main purpose of this thesis is to discuss data-driven approaches to solving inverse problems in image reconstruction. In the Bayesian framework, the image prior serves as a regularizer in the computation of a maximum a posteriori estimate of the reconstructed image. Classical image priors include Gaussian random space (e.g. Tikhonov regularization) or Besov prior (e.g. Total Variation regularization). Inspired by generative adversarial networks, a critic (discriminator) can serve as a regularizer, because it can distinguish the distribution of the ground-truth images from the distribution of images naively reconstructed with a classical regularization functional. Another data-driven approach, regularization by denoising (RED), provides a flexible and effective way to combine state-of-the-art denoisers and model-based methods with a variety of optimization strategies to solve the inverse problem. Unlike traditional hand-crafted regularizers, data-driven regularization has the potential to learn an optimal regularizer from the data. In this thesis, we consider two widely used linear forward models and two data-driven approaches to solving inverse problems: the adversarial regularizer and regularization by denoising.
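As a rough illustration of the regularization-by-denoising idea mentioned in this abstract, the sketch below runs plain gradient descent on a RED-style energy, 0.5*||Ax - y||^2 + (lam/2) * x^T (x - f(x)). It is not taken from the thesis: the 3-tap moving-average denoiser `box_denoiser`, the function names, and the values of `lam` and `step` are all assumptions chosen for the example. The gradient expression `x - f(x)` holds for this symmetric linear denoiser; for general denoisers it relies on conditions discussed in the RED literature.

```python
import numpy as np

def box_denoiser(x):
    # Hypothetical stand-in denoiser: 3-tap moving average with zero-padded ends.
    # Any denoiser could be plugged in here; this one is linear and symmetric.
    return np.convolve(x, np.ones(3) / 3.0, mode="same")

def red_energy(x, y, A, lam, denoise):
    # Data-fidelity term plus the RED regularizer (lam / 2) * x^T (x - f(x)).
    return 0.5 * np.sum((A @ x - y) ** 2) + 0.5 * lam * x @ (x - denoise(x))

def red_gradient_descent(y, A, lam=0.5, step=0.2, n_iter=100, denoise=box_denoiser):
    # Start from the back-projection of the measurements.
    x = A.T @ y
    for _ in range(n_iter):
        # Gradient of the data term plus the RED term lam * (x - f(x)).
        grad = A.T @ (A @ x - y) + lam * (x - denoise(x))
        x = x - step * grad
    return x
```

With an identity forward model this reduces to a simple denoising problem, for example `red_gradient_descent(y, np.eye(len(y)))`; a blur or subsampling matrix can be passed as `A` instead.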
SC-VAE: Sparse Coding-based Variational Autoencoder
Learning rich data representations from unlabeled data is a key challenge
towards applying deep learning algorithms in downstream supervised tasks.
Several variants of variational autoencoders have been proposed to learn
compact data representations by encoding high-dimensional data in a lower
dimensional space. Two main classes of VAE methods may be distinguished
depending on the characteristics of the meta-priors that are enforced in the
representation learning step. The first class of methods derives a continuous
encoding by assuming a static prior distribution in the latent space. The
second class of methods learns instead a discrete latent representation using
vector quantization (VQ) along with a codebook. However, both classes of
methods suffer from certain challenges, which may lead to suboptimal image
reconstruction results. The first class of methods suffers from posterior
collapse, whereas the second class of methods suffers from codebook collapse.
To address these challenges, we introduce a new VAE variant, termed SC-VAE
(sparse coding-based VAE), which integrates sparse coding within the variational
autoencoder framework. Instead of learning a continuous or discrete latent
representation, the proposed method learns a sparse data representation that
consists of a linear combination of a small number of learned atoms. The sparse
coding problem is solved using a learnable version of the iterative shrinkage
thresholding algorithm (ISTA). Experiments on two image datasets demonstrate
that our model can achieve improved image reconstruction results compared to
state-of-the-art methods. Moreover, the use of learned sparse code vectors
allows us to perform downstream tasks such as coarse image segmentation through
clustering image patches.
Comment: 15 pages, 11 figures, and 3 tables
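For readers unfamiliar with the iterative shrinkage thresholding algorithm named in this abstract, here is a minimal sketch of classic, non-learned ISTA for the sparse coding problem min_z 0.5*||Dz - x||^2 + lam*||z||_1. It is not the learnable variant used in SC-VAE; the dictionary `D`, the function names, and the default `lam` and `n_iter` are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(D, x, z, lam):
    # Sparse coding objective: reconstruction error plus an L1 penalty on the code.
    return 0.5 * np.sum((D @ z - x) ** 2) + lam * np.sum(np.abs(z))

def ista(D, x, lam=0.1, n_iter=200):
    # Step size 1/L, where L is the Lipschitz constant of the gradient of the
    # quadratic term (the squared spectral norm of D).
    L = np.linalg.norm(D, 2) ** 2
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)                    # gradient of 0.5 * ||Dz - x||^2
        z = soft_threshold(z - grad / L, lam / L)   # gradient step, then shrinkage
    return z
```

Learned variants of ISTA typically unroll a fixed number of these iterations and train quantities such as the step size and threshold; the fixed values above are only the classical starting point.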
PDL: Regularizing Multiple Instance Learning with Progressive Dropout Layers
Multiple instance learning (MIL) is a weakly supervised learning approach
that seeks to assign binary class labels to collections of instances known as
bags. However, due to their weakly supervised nature, MIL methods are
susceptible to overfitting and require assistance in developing comprehensive
representations of target instances. While regularization typically combats
overfitting effectively, its integration with the MIL model has been frequently
overlooked in prior studies. Meanwhile, current regularization methods for MIL
have shown limitations in their capacity to uncover a diverse array of
representations. In this study, we delve into the realm of regularization
within the MIL model, presenting a novel approach in the form of a Progressive
Dropout Layer (PDL). We aim not only to address overfitting but also to empower
the MIL model to uncover intricate and impactful feature representations.
The proposed method is orthogonal to existing MIL methods and can be easily
integrated into them to boost performance. Our extensive evaluation across a
range of MIL benchmark datasets demonstrates that the incorporation of the PDL
into multiple MIL methods not only elevates their classification performance
but also augments their potential for weakly supervised feature localization.
Comment: The code is available at https://github.com/ChongQingNoSubway/PD
TetCNN: Convolutional Neural Networks on Tetrahedral Meshes
Convolutional neural networks (CNNs) have been broadly studied on images,
videos, graphs, and triangular meshes. However, they have seldom been studied on
tetrahedral meshes. Given the merits of using volumetric meshes in applications
like brain image analysis, we introduce a novel interpretable graph CNN
framework for the tetrahedral mesh structure. Inspired by ChebyNet, our model
exploits the volumetric Laplace-Beltrami Operator (LBO) to define filters in place of the
commonly used graph Laplacian, which lacks the Riemannian metric information of
3D manifolds. For pooling adaptation, we introduce new objective functions for
localized minimum cuts in the Graclus algorithm based on the LBO. We employ a
piece-wise constant approximation scheme that uses the clustering assignment
matrix to estimate the LBO on sampled meshes after each pooling. Finally,
adapting the Gradient-weighted Class Activation Mapping algorithm for
tetrahedral meshes, we use the obtained heatmaps to visualize discovered
regions-of-interest as biomarkers. We demonstrate the effectiveness of our
model on cortical tetrahedral meshes from patients with Alzheimer's disease, as
there is scientific evidence showing the correlation of cortical thickness to
neurodegenerative disease progression. Our results show the superiority of our
LBO-based convolution layer and adapted pooling over the conventionally used
unitary cortical thickness, graph Laplacian, and point cloud representation.
Comment: Accepted as a conference paper to Information Processing in Medical
Imaging (IPMI 2023) conference
NNMobile-Net: Rethinking CNN Design for Deep Learning-Based Retinopathy Research
Retinal diseases (RD) are the leading cause of severe vision loss or
blindness. Deep learning-based automated tools play an indispensable role in
assisting clinicians in diagnosing and monitoring RD in modern medicine.
Recently, an increasing number of works in this field have taken advantage of
Vision Transformer to achieve state-of-the-art performance with more parameters
and higher model complexity compared to Convolutional Neural Networks (CNNs).
Such sophisticated and task-specific model designs, however, are prone to
overfitting, which hinders their generalizability. In this work, we argue that a
channel-aware and well-calibrated CNN model may overcome these problems. To
this end, we empirically studied CNN's macro and micro designs and its training
strategies. Based on the investigation, we proposed a no-new-MobileNet
(nn-MobileNet) developed for retinal diseases. In our experiments, our generic,
simple and efficient model outperformed most current state-of-the-art methods on
four public datasets for multiple tasks, including diabetic retinopathy
grading, fundus multi-disease detection, and diabetic macular edema
classification. Our work may provide novel insights into deep learning
architecture design and advance retinopathy research.
Comment: Code will be published soon:
https://github.com/Retinal-Research/NNMOBILE-NE
Optimized Live 4K Video Multicast
4K videos are becoming increasingly popular. However, despite advances in
wireless technology, streaming 4K videos over mmWave to multiple users is
facing significant challenges arising from directional communication,
unpredictable channel fluctuation and high bandwidth requirements. This paper
develops a novel 4K layered video multicast system. We (i) develop a video
quality model for layered video coding, (ii) optimize resource allocation,
scheduling, and beamforming based on the channel conditions of different users,
and (iii) put forward a streaming strategy that uses fountain code to avoid
redundancy across multicast groups and a Leaky-Bucket-based congestion control.
We realize an end-to-end system on commodity-off-the-shelf (COTS) WiGig
devices. We demonstrate the effectiveness of our system with extensive testbed
experiments and emulation.
OTRE: Where Optimal Transport Guided Unpaired Image-to-Image Translation Meets Regularization by Enhancing
Non-mydriatic retinal color fundus photography (CFP) is widely available because
it does not require pupillary dilation; however, it is prone to poor
quality due to operators, systemic imperfections, or patient-related causes.
Optimal retinal image quality is mandated for accurate medical diagnoses and
automated analyses. Herein, we leveraged the Optimal Transport (OT) theory to
propose an unpaired image-to-image translation scheme for mapping low-quality
retinal CFPs to high-quality counterparts. Furthermore, to improve the
flexibility, robustness, and applicability of our image enhancement pipeline in
the clinical practice, we generalized a state-of-the-art model-based image
reconstruction method, regularization by denoising, by plugging in priors
learned by our OT-guided image-to-image translation network. We named it
regularization by enhancing (RE). We validated the integrated framework, OTRE,
on three publicly available retinal image datasets by assessing the quality
after enhancement and their performance on various downstream tasks, including
diabetic retinopathy grading, vessel segmentation, and diabetic lesion
segmentation. The experimental results demonstrated the superiority of our
proposed framework over some state-of-the-art unsupervised competitors and a
state-of-the-art supervised method.
Comment: Accepted as a conference paper to The 28th biennial international
conference on Information Processing in Medical Imaging (IPMI 2023)
Reconstructing growth and dynamic trajectories from single-cell transcriptomics data.
Time-series single-cell RNA sequencing (scRNA-seq) datasets provide unprecedented opportunities to learn dynamic processes of cellular systems. Due to the destructive nature of sequencing, it remains challenging to link the scRNA-seq snapshots sampled at different time points. Here we present TIGON, a dynamic, unbalanced optimal transport algorithm that reconstructs dynamic trajectories and population growth simultaneously, as well as the underlying gene regulatory network, from multiple snapshots. To tackle the high-dimensional optimal transport problem, we introduce a deep learning method using a dimensionless formulation based on the Wasserstein-Fisher-Rao (WFR) distance. TIGON is evaluated on simulated data and compared with existing methods for its robustness and accuracy in predicting cell state transition and cell population growth. Using three scRNA-seq datasets, we show the importance of growth in the temporal inference, TIGON's capability in reconstructing gene expression at unmeasured time points, and its applications to temporal gene regulatory networks and cell-cell communication inference.