Bidirectional Representation Learning from Transformers using Multimodal Electronic Health Record Data to Predict Depression
Advancements in machine learning algorithms have had a beneficial impact on
representation learning, classification, and prediction models built using
electronic health record (EHR) data. Effort has gone both into increasing models' overall performance and into improving their interpretability, particularly with respect to the decision-making process. In this study, we present a
temporal deep learning model to perform bidirectional representation learning
on EHR sequences with a transformer architecture to predict future diagnosis of
depression. The model aggregates five heterogeneous and
high-dimensional data sources from the EHR and process them in a temporal
manner for chronic disease prediction at various prediction windows. We applied the now-standard pretraining and fine-tuning paradigm to EHR data to outperform the current state of the art in chronic disease prediction and to reveal the underlying relationships between EHR codes in a sequence. The model achieved the largest increase in precision-recall area under the curve (PRAUC), from 0.70 to 0.76, for depression prediction compared with the best baseline model. Furthermore,
the self-attention weights in each sequence quantitatively revealed the relationships among the various codes, which improved the model's interpretability. These results demonstrate the model's ability to use heterogeneous EHR data to predict depression with high accuracy and interpretability, which may facilitate the construction of clinical decision support systems for chronic disease screening and early detection.
Comment: in IEEE Journal of Biomedical and Health Informatics (2021)
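As a rough illustration of the kind of architecture described above, the sketch below builds a small bidirectional transformer encoder over sequences of EHR codes with a binary head for future-depression prediction. The vocabulary size, dimensions, and [CLS]-style pooling are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch (not the authors' implementation): a bidirectional transformer
# encoder over sequences of EHR codes with a binary classification head.
import torch
import torch.nn as nn

class EHRTransformerClassifier(nn.Module):
    def __init__(self, vocab_size=20000, d_model=128, n_heads=4, n_layers=2, max_len=512):
        super().__init__()
        self.code_emb = nn.Embedding(vocab_size, d_model, padding_idx=0)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)          # logit for future depression diagnosis

    def forward(self, codes):                      # codes: (batch, seq_len) integer EHR codes
        pos = torch.arange(codes.size(1), device=codes.device)
        x = self.code_emb(codes) + self.pos_emb(pos)
        h = self.encoder(x, src_key_padding_mask=(codes == 0))
        return self.head(h[:, 0])                  # first token used as a [CLS]-style summary

model = EHRTransformerClassifier()
logits = model(torch.randint(1, 20000, (8, 128)))  # dummy batch of code sequences
print(logits.shape)                                # torch.Size([8, 1])
```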
Transformer Lesion Tracker
Evaluating lesion progression and treatment response via longitudinal lesion
tracking plays a critical role in clinical practice. Automated approaches for this task are motivated by the prohibitive labor cost and time required for manual lesion matching. Previous methods typically lack the
integration of local and global information. In this work, we propose a
transformer-based approach, termed Transformer Lesion Tracker (TLT).
Specifically, we design a Cross Attention-based Transformer (CAT) to capture
and combine both global and local information to enhance feature extraction. We
also develop a Registration-based Anatomical Attention Module (RAAM) to
introduce anatomical information into CAT so that it can focus on relevant features. A Sparse Selection Strategy (SSS) is presented for selecting
features and reducing memory footprint in Transformer training. In addition, we
use a global regression to further improve model performance. We conduct
experiments on a public dataset to show the superiority of our method and find that our model improves the average Euclidean center error by at least 14.3% (6 mm vs. 7 mm) compared with the state-of-the-art (SOTA). Code is available at https://github.com/TangWen920812/TLT.
Comment: Accepted MICCAI 202
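The cross-attention idea at the heart of CAT can be sketched generically: tokens around the tracked lesion query a larger set of tokens from the other time point. The module below is a plain PyTorch cross-attention block under that assumption, not the paper's CAT or RAAM implementation.

```python
# Illustrative cross-attention fusion between two scans; a generic formulation,
# not the paper's CAT/RAAM modules.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim=256, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_feats, ref_feats):
        # query_feats: (B, Nq, C) tokens around the tracked lesion
        # ref_feats:   (B, Nr, C) tokens from the other time point (global context)
        fused, weights = self.attn(query_feats, ref_feats, ref_feats)
        return self.norm(query_feats + fused), weights

fusion = CrossAttentionFusion()
q = torch.randn(2, 64, 256)   # local patch tokens
r = torch.randn(2, 512, 256)  # tokens covering the whole follow-up volume
out, attn = fusion(q, r)
print(out.shape, attn.shape)  # torch.Size([2, 64, 256]) torch.Size([2, 64, 512])
```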
RPLHR-CT Dataset and Transformer Baseline for Volumetric Super-Resolution from CT Scans
In clinical practice, anisotropic volumetric medical images with low
through-plane resolution are commonly used due to short acquisition time and
lower storage cost. Nevertheless, the coarse resolution may lead to
difficulties in medical diagnosis by either physicians or computer-aided
diagnosis algorithms. Deep learning-based volumetric super-resolution (SR)
methods are feasible ways to improve resolution, with convolutional neural
networks (CNN) at their core. Despite recent progress, these methods are
limited by inherent properties of convolution operators, which ignore content
relevance and cannot effectively model long-range dependencies. In addition,
most of the existing methods use pseudo-paired volumes for training and
evaluation, where pseudo low-resolution (LR) volumes are generated by a simple
degradation of their high-resolution (HR) counterparts. However, the domain gap
between pseudo- and real-LR volumes leads to the poor performance of these
methods in practice. In this paper, we build the first public real-paired
dataset RPLHR-CT as a benchmark for volumetric SR, and provide baseline results
by re-implementing four state-of-the-art CNN-based methods. Considering the inherent shortcomings of CNNs, we also propose a transformer volumetric
super-resolution network (TVSRN) based on attention mechanisms, dispensing with
convolutions entirely. This is the first work to use a pure transformer for CT volumetric SR. The experimental results show that TVSRN significantly outperforms all baselines on both PSNR and SSIM. Moreover, TVSRN achieves a better trade-off among image quality, number of parameters, and running time. Data and code are available at https://github.com/smilenaxx/RPLHR-CT.
Comment: Accepted MICCAI 202
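Since the paper reports PSNR and SSIM against real high-resolution volumes, a minimal slice-wise evaluation routine might look like the sketch below; the normalization and aggregation choices are assumptions, not the RPLHR-CT evaluation code.

```python
# Sketch of slice-wise PSNR/SSIM evaluation of an SR volume against its real
# high-resolution counterpart; intensity range handling is an assumption.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_volume(sr: np.ndarray, hr: np.ndarray, data_range: float = 1.0):
    """sr, hr: (D, H, W) volumes normalized to [0, 1]; returns mean PSNR/SSIM over slices."""
    psnrs, ssims = [], []
    for sr_slice, hr_slice in zip(sr, hr):
        psnrs.append(peak_signal_noise_ratio(hr_slice, sr_slice, data_range=data_range))
        ssims.append(structural_similarity(hr_slice, sr_slice, data_range=data_range))
    return float(np.mean(psnrs)), float(np.mean(ssims))

hr = np.random.rand(32, 256, 256).astype(np.float32)
sr = np.clip(hr + 0.01 * np.random.randn(*hr.shape).astype(np.float32), 0, 1)
print(evaluate_volume(sr, hr))
```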
Ultrasound Image Enhancement using CycleGAN and Perceptual Loss
Purpose: The objective of this work is to introduce a framework for enhancing ultrasound images, especially those captured by portable hand-held devices, which often produce lower-quality images due to hardware constraints. Additionally, the framework is capable of effectively
handling non-registered input ultrasound image pairs, addressing a common
challenge in medical imaging. Materials and Methods: In this retrospective
study, we utilized an enhanced cycle-consistent generative adversarial network (CycleGAN) model
for ultrasound image enhancement across five organ systems. Perceptual loss,
derived from deep features of pretrained neural networks, is applied to ensure
the human-perceptual quality of the enhanced images. These images are compared
with paired images acquired from high-resolution devices to demonstrate the
model's ability to generate realistic high-quality images across organ systems.
Results: Preliminary validation of the framework reveals promising performance
metrics. The generated images achieve a Structural Similarity Index (SSI) of 0.722, a Locally Normalized Cross-Correlation (LNCC) of 0.902, and a Peak Signal-to-Noise Ratio (PSNR) of 28.802.
Conclusion: This work presents a significant advancement in medical imaging
through the development of a CycleGAN model enhanced with Perceptual Loss (PL),
effectively bridging the quality gap between ultrasound images from varied
devices. By training on paired images, the model not only improves image
quality but also ensures the preservation of vital anatomic structural content.
This approach may improve equity in access to healthcare by enhancing portable
device capabilities, although further validation and optimization are necessary for broader clinical application.
Comment: 7 pages, 3 figures
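A perceptual loss of the kind described, computed from the features of a pretrained network, can be sketched as follows; the choice of VGG16, the layer cut-off, and the L1 distance are illustrative assumptions rather than the paper's exact configuration.

```python
# Sketch of a perceptual loss over frozen VGG16 features; layer choice is an
# assumption, not the paper's configuration.
import torch
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights

class PerceptualLoss(nn.Module):
    def __init__(self, layer_index=16):  # features up to relu3_3 (assumption)
        super().__init__()
        vgg = vgg16(weights=VGG16_Weights.DEFAULT).features[:layer_index].eval()
        for p in vgg.parameters():
            p.requires_grad_(False)      # frozen feature extractor
        self.vgg = vgg
        self.l1 = nn.L1Loss()

    def forward(self, enhanced, target):
        # enhanced, target: (B, 3, H, W) images in [0, 1]; grayscale ultrasound
        # would first be repeated across the channel dimension.
        return self.l1(self.vgg(enhanced), self.vgg(target))

loss_fn = PerceptualLoss()
fake_hq = torch.rand(2, 3, 224, 224, requires_grad=True)
real_hq = torch.rand(2, 3, 224, 224)
loss = loss_fn(fake_hq, real_hq)
loss.backward()
print(float(loss))
```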
A Multi-resolution Model for Histopathology Image Classification and Localization with Multiple Instance Learning
Histopathological images provide rich information for disease diagnosis.
Large numbers of histopathological images have been digitized into high
resolution whole slide images, opening opportunities in developing
computational image analysis tools to reduce pathologists' workload and
potentially improve inter- and intra-observer agreement. Most previous work on
whole slide image analysis has focused on classification or segmentation of
small pre-selected regions of interest, which requires fine-grained annotation and is non-trivial to extend to large-scale whole slide analysis. In this paper, we propose a multi-resolution multiple instance learning model that
leverages saliency maps to detect suspicious regions for fine-grained grade
prediction. Instead of relying on expensive region- or pixel-level annotations,
our model can be trained end-to-end with only slide-level labels. The model is
developed on a large-scale prostate biopsy dataset containing 20,229 slides
from 830 patients. The model achieved 92.7% accuracy and a Cohen's kappa of 81.8% for classifying benign, low-grade (i.e., Grade Group 1), and high-grade (i.e., Grade Group >= 2) slides, as well as an area under the receiver operating characteristic curve (AUROC) of 98.2% and an average precision (AP) of 97.4% for differentiating malignant from benign slides. The model obtained an AUROC of 99.4% and an AP of 99.8% for cancer detection on an external dataset.
Comment: 9 pages, 6 figures
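To make the slide-level training idea concrete, the following is a generic attention-based multiple-instance-learning head over tile embeddings, trained with only slide-level labels; it is a simplified stand-in, not the paper's multi-resolution, saliency-guided model.

```python
# Generic attention-MIL head over tile embeddings for one slide; illustrative only.
import torch
import torch.nn as nn

class AttentionMILHead(nn.Module):
    def __init__(self, feat_dim=512, hidden_dim=128, n_classes=2):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.Tanh(), nn.Linear(hidden_dim, 1)
        )
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, tile_feats):                  # (n_tiles, feat_dim) for one slide
        scores = self.attention(tile_feats)         # (n_tiles, 1) tile importance
        weights = torch.softmax(scores, dim=0)      # normalize across tiles
        slide_feat = (weights * tile_feats).sum(0)  # attention-weighted slide embedding
        return self.classifier(slide_feat), weights

head = AttentionMILHead()
tiles = torch.randn(1000, 512)                      # embeddings of 1000 tiles from one slide
logits, tile_weights = head(tiles)
print(logits.shape, tile_weights.shape)             # torch.Size([2]) torch.Size([1000, 1])
```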
Self-tuning to the Hopf bifurcation in fluctuating systems
The problem of self-tuning a system to the Hopf bifurcation in the presence
of noise and periodic external forcing is discussed. We find that the response of the system has a non-monotonic dependence on the noise strength and is amplified, an effect that is more pronounced for weaker signals. The
observed effect is to be distinguished from stochastic resonance. For the
feedback we have studied, the unforced self-tuned Hopf oscillator in the
presence of fluctuations exhibits sharp peaks in its spectrum. The implications
of our general results are briefly discussed in the context of sound detection by the inner ear.
Comment: 37 pages, 7 figures (8 figure files)
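For readers who want to experiment, a minimal numerical sketch of a noisy, periodically forced Hopf normal-form oscillator with a slow self-tuning feedback is given below; the feedback rule and all parameter values are illustrative assumptions, not the scheme analyzed in the paper.

```python
# Euler-Maruyama sketch of a noisy, forced Hopf normal-form oscillator with a slow
# feedback that nudges the control parameter toward the bifurcation; the feedback
# rule here is an illustrative choice, not the paper's.
import numpy as np

rng = np.random.default_rng(0)
dt, n_steps = 1e-3, 200_000
omega0, omega_f = 1.0, 1.0          # natural and forcing frequencies
force, noise = 0.05, 0.02           # forcing amplitude and noise strength
tau, target_amp = 50.0, 0.1         # feedback time scale and target amplitude

z, mu = 0.1 + 0.0j, -0.5            # oscillator state and control parameter
trace = np.empty(n_steps)
for i in range(n_steps):
    t = i * dt
    eta = noise * (rng.standard_normal() + 1j * rng.standard_normal())
    dz = (mu + 1j * omega0) * z - abs(z) ** 2 * z + force * np.exp(1j * omega_f * t)
    z += dz * dt + eta * np.sqrt(dt)
    mu += dt / tau * (target_amp ** 2 - abs(z) ** 2)   # slow self-tuning feedback
    trace[i] = z.real

print("final control parameter:", round(mu, 4))
print("response amplitude:", round(float(np.abs(trace[-20000:]).max()), 4))
```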