Efficient Scopeformer: Towards Scalable and Rich Feature Extraction for Intracranial Hemorrhage Detection
The quality and richness of the feature maps extracted by convolutional neural
networks (CNNs) and vision Transformers (ViTs) directly determine how robust
the model's performance is. In medical computer vision, these information-rich features
are crucial for detecting rare cases within large datasets. This work presents
the "Scopeformer," a novel multi-CNN-ViT model for intracranial hemorrhage
classification in computed tomography (CT) images. The Scopeformer architecture
is scalable and modular, which allows utilizing various CNN architectures as
the backbone with diversified output features and pre-training strategies. We
propose effective feature projection methods to reduce redundancies among
CNN-generated features and to control the input size of ViTs. Extensive
experiments with various Scopeformer models show that the model performance is
proportional to the number of convolutional blocks employed in the feature
extractor. Using multiple strategies, including diversifying the pre-training
paradigms for CNNs, different pre-training datasets, and style transfer
techniques, we demonstrate an overall improvement in the model performance at
various computational budgets. Later, we propose smaller compute-efficient
Scopeformer versions with three different types of input and output ViT
configurations. Efficient Scopeformers use four different pre-trained CNN
architectures as feature extractors to increase feature richness. Our best
Efficient Scopeformer model achieved an accuracy of 96.94\% and a weighted
logarithmic loss of 0.083 with an eight times reduction in the number of
trainable parameters compared to the base Scopeformer. Another version of the
Efficient Scopeformer model further reduced the parameter space by almost 17
times with negligible performance reduction. Hybrid CNN-ViT architectures may
provide the feature richness needed to develop accurate medical computer vision
models.
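The feature-projection step described above can be sketched as follows: feature maps from several pretrained CNN backbones are concatenated along the channel dimension and projected with a 1x1 convolution (a per-pixel linear map) to control the ViT input size. The backbone channel counts, grid size, projection dimension, and random weights below are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def project_features(feature_maps, proj_dim, rng=None):
    """Fuse feature maps from several CNN backbones and project them into
    ViT-ready tokens via a 1x1 convolution (a linear map over channels).

    feature_maps: list of arrays, each of shape (C_i, H, W), one per backbone.
    Returns tokens of shape (H*W, proj_dim), one token per spatial location.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    fused = np.concatenate(feature_maps, axis=0)              # (sum C_i, H, W)
    c, h, w = fused.shape
    weight = rng.standard_normal((proj_dim, c)) / np.sqrt(c)  # 1x1 conv kernel
    tokens = weight @ fused.reshape(c, h * w)                 # (proj_dim, H*W)
    return tokens.T

# Hypothetical example: two backbones with 64 and 128 channels on a 7x7 grid.
maps = [np.zeros((64, 7, 7)), np.zeros((128, 7, 7))]
tokens = project_features(maps, proj_dim=96)
print(tokens.shape)  # (49, 96)
```

The projection both caps the ViT sequence length (one token per spatial location) and compresses redundant channels shared across backbones into a common embedding dimension.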
The Importance of Robust Features in Mitigating Catastrophic Forgetting
Continual learning (CL) is an approach to address catastrophic forgetting,
which refers to forgetting previously learned knowledge by neural networks when
trained on new tasks or data distributions. Prior work on adversarial
robustness has decomposed features into robust and non-robust types and
demonstrated that models trained on robust features exhibit significantly
improved adversarial robustness. However, no study has examined the efficacy of
robust features, through the lens of a CL model, in mitigating catastrophic
forgetting. In this
paper, we introduce the CL robust dataset and train four baseline models on
both the standard and CL robust datasets. Our results demonstrate that the CL
models trained on the CL robust dataset experienced less catastrophic
forgetting of the previously learned tasks than when trained on the standard
dataset. Our observations highlight the significance of the features provided
to the underlying CL models, showing that CL robust features can alleviate
catastrophic forgetting.
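Catastrophic forgetting of the kind measured above is commonly quantified with an average-forgetting metric: for each task, the best accuracy ever achieved minus the accuracy after training on the final task. The abstract does not state which metric the paper uses, so the function below is a generic illustration with synthetic numbers.

```python
import numpy as np

def average_forgetting(acc):
    """acc[t, j]: accuracy on task j after training on task t (valid for j <= t).
    Forgetting for task j: best accuracy ever achieved minus final accuracy.
    Returns the mean over all tasks except the last one.
    """
    acc = np.asarray(acc, dtype=float)
    T = acc.shape[0]
    drops = [acc[:T - 1, j].max() - acc[T - 1, j] for j in range(T - 1)]
    return float(np.mean(drops))

# Hypothetical 3-task run: task 0 drops from 0.9 to 0.6, task 1 from 0.8 to 0.7.
acc = np.array([[0.9, 0.0, 0.0],
                [0.7, 0.8, 0.0],
                [0.6, 0.7, 0.9]])
print(round(average_forgetting(acc), 6))  # 0.2
```

A model trained on robust features would, by the paper's claim, show a smaller average forgetting value than the same model trained on the standard dataset.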
Exploring Robustness of Neural Networks through Graph Measures
Motivated by graph theory, artificial neural networks (ANNs) are
traditionally structured as layers of neurons (nodes), which learn useful
information by the passage of data through interconnections (edges). In the
machine learning realm, graph structures (i.e., neurons and connections) of
ANNs have recently been explored using various graph-theoretic measures linked
to their predictive performance. On the other hand, in network science
(NetSci), certain graph measures including entropy and curvature are known to
provide insight into the robustness and fragility of real-world networks. In
this work, we use these graph measures to explore the robustness of various
ANNs to adversarial attacks. To this end, we (1) explore the design space of
inter-layer and intra-layer connectivity regimes of ANNs in the graph domain
and record their predictive performance after training under different types of
adversarial attacks, (2) use graph representations for both inter-layer and
intra-layer connectivity regimes to calculate various graph-theoretic
measures, including curvature and entropy, and (3) analyze the relationship
between these graph measures and the adversarial performance of ANNs. We show
that curvature and entropy, while operating in the graph domain, can quantify
the robustness of ANNs without having to train them. Our results suggest
that real-world networks, including brain networks, financial networks, and
social networks, may provide important clues for the neural architecture search
for robust ANNs. We propose a search strategy that efficiently finds robust
ANNs among a set of well-performing ANNs without the need to train all of them.
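The graph measures named above can be computed directly from a network's adjacency matrix. The abstract does not specify which curvature or entropy definitions the paper uses, so the sketch below uses two common network-science choices as an illustration: the Shannon entropy of the degree distribution, and Forman-Ricci edge curvature, which for a simple unweighted graph is 4 - deg(u) - deg(v) for edge (u, v).

```python
import numpy as np

def degree_entropy(adj):
    """Shannon entropy of the normalized degree distribution."""
    deg = np.asarray(adj).sum(axis=1)
    p = deg / deg.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def forman_curvature(adj):
    """Forman-Ricci curvature of each edge (u, v): 4 - deg(u) - deg(v)."""
    adj = np.asarray(adj)
    deg = adj.sum(axis=1)
    n = adj.shape[0]
    return {(u, v): 4 - int(deg[u]) - int(deg[v])
            for u in range(n) for v in range(u + 1, n) if adj[u, v]}

# A triangle: every node has degree 2, so every edge has curvature 0.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(forman_curvature(triangle))  # {(0, 1): 0, (0, 2): 0, (1, 2): 0}
```

Because both measures depend only on the graph structure, they can be evaluated on an untrained architecture, which is what makes the training-free robustness screening described above possible.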
Trustworthy Medical Segmentation with Uncertainty Estimation
Deep Learning (DL) holds great promise in reshaping healthcare systems given its precision, efficiency, and objectivity. However, the brittleness of DL models to noisy and out-of-distribution inputs is ailing their deployment in the clinic. Most systems produce point estimates without further information about model uncertainty or confidence. This paper introduces a new Bayesian deep learning framework for uncertainty quantification in segmentation neural networks, specifically encoder-decoder architectures. The proposed framework uses a first-order Taylor series approximation to propagate and learn the first two moments (mean and covariance) of the distribution of the model parameters given the training data by maximizing the evidence lower bound. The output consists of two maps: the segmented image and the uncertainty map of the segmentation. The uncertainty in the segmentation decisions is captured by the covariance matrix of the predictive distribution. We evaluate the proposed framework on medical image segmentation data from Magnetic Resonance Imaging and Computed Tomography scans. Our experiments on multiple benchmark datasets demonstrate that the proposed framework is more robust to noise and adversarial attacks than state-of-the-art segmentation models. Moreover, the uncertainty map of the proposed framework associates low confidence (or equivalently high uncertainty) with patches in the test input images that are corrupted with noise, artifacts, or adversarial attacks. Thus, the model can self-assess its segmentation decisions when it makes an erroneous prediction or misses part of the segmentation structures, e.g., a tumor, by presenting higher values in the uncertainty map.
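The self-assessment behaviour described above amounts to flagging regions where the predictive variance is high. The thresholding step below is a minimal sketch of how a downstream consumer could use the uncertainty map; the map values and threshold are synthetic, not taken from the paper.

```python
import numpy as np

def flag_unreliable(uncertainty_map, threshold):
    """Mark pixels whose predictive variance exceeds a threshold, so a
    clinician can review low-confidence regions of the segmentation."""
    return uncertainty_map > threshold

# Synthetic 4x4 uncertainty map: the top-left 2x2 patch is "corrupted"
# (high variance), the rest of the image is confidently segmented.
u = np.full((4, 4), 0.01)
u[:2, :2] = 0.5
mask = flag_unreliable(u, threshold=0.1)
print(int(mask.sum()))  # 4 pixels flagged for review
```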
SUPER-Net: Trustworthy Medical Image Segmentation with Uncertainty Propagation in Encoder-Decoder Networks
Deep Learning (DL) holds great promise in reshaping the healthcare industry
owing to its precision, efficiency, and objectivity. However, the brittleness
of DL models to noisy and out-of-distribution inputs is ailing their deployment
in the clinic. Most models produce point estimates without further information
about model uncertainty or confidence. This paper introduces a new Bayesian DL
framework for uncertainty quantification in segmentation neural networks:
SUPER-Net: trustworthy medical image Segmentation with Uncertainty Propagation
in Encoder-decodeR Networks. SUPER-Net analytically propagates, using Taylor
series approximations, the first two moments (mean and covariance) of the
posterior distribution of the model parameters across the nonlinear layers. In
particular, SUPER-Net simultaneously learns the mean and covariance without
expensive post-hoc Monte Carlo sampling or model ensembling. The output
consists of two simultaneous maps: the segmented image and its pixelwise
uncertainty map, which corresponds to the covariance matrix of the predictive
distribution. We conduct an extensive evaluation of SUPER-Net on medical image
segmentation of Magnetic Resonance Imaging and Computed Tomography scans under
various noisy and adversarial conditions. Our experiments on multiple benchmark
datasets demonstrate that SUPER-Net is more robust to noise and adversarial
attacks than state-of-the-art segmentation models. Moreover, the uncertainty
map of the proposed SUPER-Net associates low confidence (or equivalently high
uncertainty) to patches in the test input images that are corrupted with noise,
artifacts, or adversarial attacks. Perhaps more importantly, the model can
self-assess its segmentation decisions, notably when it makes erroneous
predictions due to noise or adversarial examples.
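The first-order Taylor propagation of the first two moments can be sketched layer by layer: a linear layer maps the moments exactly, while a nonlinearity is linearized around the input mean so its Jacobian transports the covariance. This is a generic illustration of the technique with toy values, not SUPER-Net's actual implementation.

```python
import numpy as np

def propagate_linear(mu, sigma, W, b):
    """Exact moments of y = W x + b when x has mean mu and covariance sigma."""
    return W @ mu + b, W @ sigma @ W.T

def propagate_relu(mu, sigma):
    """First-order Taylor approximation through a ReLU:
    f(x) ~ f(mu) + J (x - mu), with Jacobian J = diag(1[mu > 0]),
    so the output covariance is J sigma J^T.
    """
    j = (mu > 0).astype(float)
    return np.maximum(mu, 0.0), sigma * np.outer(j, j)

# Toy input distribution: one active unit, one inactive unit.
mu = np.array([1.0, -1.0])
sigma = 0.1 * np.eye(2)
m, s = propagate_relu(mu, sigma)
print(m)           # [1. 0.]
print(np.diag(s))  # [0.1 0. ]
```

Chaining such steps through an encoder-decoder yields the per-pixel predictive mean (the segmentation) and covariance (the uncertainty map) in a single forward pass, which is what avoids the post-hoc Monte Carlo sampling mentioned above.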
Deep Ensemble for Rotorcraft Attitude Prediction
Historically, the rotorcraft community has experienced a higher fatal
accident rate than other aviation segments, including commercial and general
aviation. Recent advancements in artificial intelligence (AI) and the
application of these technologies in different areas of our lives are both
intriguing and encouraging. When developed appropriately for the aviation
domain, AI techniques provide an opportunity to help design systems that can
address rotorcraft safety challenges. Our recent work demonstrated that AI
algorithms could use video data from onboard cameras and correctly identify
different flight parameters from cockpit gauges, e.g., indicated airspeed.
These AI-based techniques provide a potentially cost-effective solution,
especially for small helicopter operators, to record the flight state
information and perform post-flight analyses. We also showed that carefully
designed and trained AI systems could accurately predict rotorcraft attitude
(i.e., pitch and yaw) from outside scenes (images or video data). Ordinary
off-the-shelf video cameras were installed inside the rotorcraft cockpit to
record the outside scene, including the horizon. The AI algorithm could
correctly identify rotorcraft attitude with an accuracy of approximately 80\%. In
this work, we combined five different onboard camera viewpoints to improve
attitude prediction accuracy to 94\%. These five onboard camera views
included the pilot windshield, co-pilot windshield, pilot Electronic Flight
Instrument System (EFIS) display, co-pilot EFIS display, and the attitude
indicator gauge. Using video data from each camera view, we trained various
convolutional neural networks (CNNs), which achieved prediction accuracy in the
range of 79\% to 90\%. We subsequently ensembled the learned knowledge from
all the CNNs and achieved an ensemble accuracy of 93.3\%.
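One common way to ensemble per-view classifiers like the ones above is to average their softmax outputs and take the most probable class. The abstract does not describe the exact fusion scheme, so the snippet below, with made-up probabilities over three attitude classes, is only an illustration.

```python
import numpy as np

def ensemble_predict(view_probs):
    """Average softmax probabilities from several camera-view CNNs and
    return the class with the highest mean probability."""
    mean_probs = np.mean(view_probs, axis=0)
    return int(np.argmax(mean_probs)), mean_probs

# Hypothetical: three camera views voting over three attitude classes.
probs = [[0.6, 0.3, 0.1],
         [0.2, 0.5, 0.3],
         [0.5, 0.2, 0.3]]
label, mean_probs = ensemble_predict(probs)
print(label)  # 0
```

Averaging lets a view with a clear horizon outvote views that are ambiguous in a given frame, which is one plausible reason the ensemble beats each single-view CNN.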
A deep learning framework for joint image restoration and recognition
Image restoration and recognition are important computer vision tasks that form an inherent part of autonomous systems. The two tasks are often implemented sequentially, with restoration followed by recognition. In contrast, this paper proposes a joint framework that performs both tasks simultaneously within a shared deep neural network architecture. The joint framework integrates the restoration and recognition tasks by incorporating: i) common layers, ii) restoration layers, and iii) classification layers. The total loss function combines the restoration and classification losses. The proposed joint framework, based on capsules, provides an efficient solution that can cope with challenges due to noise, image rotations, and occlusions. The framework has been validated and evaluated on a public vehicle logo dataset under various degradation conditions, including Gaussian noise, rotation, and occlusion. The results show that the joint framework improves accuracy compared with single-task networks.
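The combined objective described above can be written as a weighted sum of a restoration loss and a classification loss. The specific loss choices (MSE and cross-entropy) and the weights below are illustrative assumptions; the abstract only states that the total loss combines the two terms.

```python
import numpy as np

def joint_loss(restored, clean, class_probs, label, alpha=1.0, beta=1.0):
    """Total loss = alpha * restoration loss (MSE between the restored and
    clean images) + beta * classification loss (cross-entropy), so both
    tasks update the shared layers jointly."""
    mse = float(np.mean((restored - clean) ** 2))
    ce = float(-np.log(class_probs[label] + 1e-12))
    return alpha * mse + beta * ce, mse, ce

# Toy case: perfect restoration, classifier assigns 0.75 to the true class.
restored = np.array([0.5, 0.5])
clean = np.array([0.5, 0.5])
probs = np.array([0.25, 0.75])
total, mse, ce = joint_loss(restored, clean, probs, label=1)
print(round(total, 4))  # 0.2877 (only the cross-entropy term remains)
```

Because the gradients of both terms flow through the common layers, the shared representation is pushed to be simultaneously useful for denoising and for classification, which is the core idea of the joint framework.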