Additional Positive Enables Better Representation Learning for Medical Images
This paper presents a new way to identify additional positive pairs for BYOL,
a state-of-the-art (SOTA) self-supervised learning framework, to improve its
representation learning ability. Unlike conventional BYOL which relies on only
one positive pair generated by two augmented views of the same image, we argue
that information from different images with the same label can bring more
diversity and variations to the target features, thus benefiting representation
learning. To identify such pairs without any label, we investigate TracIn, an
instance-based and computationally efficient influence function, for BYOL
training. Specifically, TracIn is a gradient-based method that reveals the
impact of a training sample on a test sample in supervised learning. We extend
it to the self-supervised learning setting and propose an efficient batch-wise
per-sample gradient computation method to estimate the pairwise TracIn to
represent the similarity of samples in the mini-batch during training. For each
image, we select the most similar sample from other images as the additional
positive and pull their features together with BYOL loss. Experimental results
on two public medical datasets (i.e., ISIC 2019 and ChestX-ray) demonstrate
that the proposed method can improve the classification performance compared to
other competitive baselines in both semi-supervised and transfer learning
settings.
Comment: 8 pages
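The selection step described in this abstract can be illustrated with a minimal sketch. It assumes the per-sample gradients of the BYOL loss are already available as a matrix with one row per image, and it uses the single-step TracIn form (learning rate times the dot product of two gradients). The function names, the toy gradients, and the `lr` parameter are hypothetical; the paper's efficient batch-wise gradient computation is not reproduced here.

```python
import numpy as np

def pairwise_tracin(per_sample_grads, lr=1.0):
    """Single-step TracIn score for every pair in the mini-batch.

    per_sample_grads: (batch, num_params) array, one gradient row per image
    (assumed to be given). Returns a (batch, batch) symmetric score matrix.
    """
    g = np.asarray(per_sample_grads, dtype=float)
    return lr * g @ g.T

def select_additional_positive(scores):
    """For each image, index of the most similar *other* image in the batch."""
    s = np.array(scores, dtype=float)
    np.fill_diagonal(s, -np.inf)  # an image must not be its own extra positive
    return s.argmax(axis=1)

# Toy gradients (illustrative only): images 0 and 1 point in a similar
# direction, as do images 2 and 3.
grads = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.8]])
scores = pairwise_tracin(grads)
positives = select_additional_positive(scores)
print(positives)
```

Each selected index would then be treated as an additional positive whose features are pulled toward the anchor image with the BYOL loss.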
Learning to Skip for Language Modeling
Overparameterized large-scale language models show impressive generalization
in in-context few-shot learning. However, most language models
allocate the same amount of parameters or computation to each token,
disregarding the complexity or importance of the input data. We argue that in
language model pretraining, a variable amount of computation should be assigned
to different tokens, and this can be efficiently achieved via a simple routing
mechanism. Unlike conventional early stopping techniques, in which tokens
can exit only at early layers, we propose a more general method that
dynamically skips the execution of a layer (or module) for any input token with
a binary router. In our extensive evaluation across 24 NLP tasks, we
demonstrate that the proposed method can significantly improve the 1-shot
performance compared to other competitive baselines at only a mild extra cost
for inference.
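The per-token skipping idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the router is taken to be a single linear score thresholded at zero, and `skip_layer`, `router_w`, `threshold`, and the toy layer are hypothetical names chosen for the example.

```python
import numpy as np

def skip_layer(tokens, layer_fn, router_w, threshold=0.0):
    """Apply layer_fn only to tokens the binary router selects.

    tokens:   (num_tokens, dim) array of token representations
    router_w: (dim,) weights of an assumed linear router
    Skipped tokens pass through unchanged.
    """
    tokens = np.asarray(tokens, dtype=float)
    execute = tokens @ router_w > threshold  # binary decision per token
    out = tokens.copy()
    if execute.any():
        # Only the selected tokens pay the cost of the layer.
        out[execute] = layer_fn(tokens[execute])
    return out, execute

tokens = np.array([[1.0, 2.0], [-1.0, 0.5], [3.0, -1.0]])
router_w = np.array([1.0, 0.0])
out, executed = skip_layer(tokens, lambda x: 2.0 * x, router_w)
print(executed)
print(out)
```

Because the decision is made independently at every layer, a token can skip a late layer after executing an early one or the reverse, which is what distinguishes this scheme from exiting at a fixed depth.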
A novel risk stratification model for STEMI after primary PCI: global longitudinal strain and deep neural network assisted myocardial contrast echocardiography quantitative analysis
Background: In ST-segment elevation myocardial infarction (STEMI) with restoration of TIMI 3 flow by percutaneous coronary intervention (PCI), visually defined microvascular obstruction (MVO) has been shown to predict poor prognosis, but it is not an ideal risk stratification method. We introduce deep neural network (DNN) assisted myocardial contrast echocardiography (MCE) quantitative analysis and propose a better risk stratification model.
Methods: We included 194 STEMI patients with successful primary PCI and at least 6 months of follow-up. MCE was performed within 48 h after PCI. Major adverse cardiovascular events (MACE) were defined as cardiac death, congestive heart failure, reinfarction, stroke, and recurrent angina. The perfusion parameters were derived from a DNN-based myocardial segmentation framework. Visual microvascular perfusion (MVP) qualitative analysis distinguished three patterns: normal, delay, and MVO. Clinical markers and imaging features, including global longitudinal strain (GLS), were analyzed. A risk calculator was constructed and validated with bootstrap resampling.
Results: Processing 7,403 MCE frames took 773 s. The correlation coefficients of microvascular blood flow (MBF) ranged from 0.97 to 0.99 for intra-observer and inter-observer variability. 38 patients experienced MACE during the 6-month follow-up. We proposed a risk prediction model based on MBF [HR: 0.93 (0.91–0.95)] in culprit lesion areas and GLS [HR: 0.80 (0.73–0.88)]. At the best risk threshold of 40%, the AUC was 0.95 (sensitivity: 0.84, specificity: 0.94), better than the visual MVP method (AUC: 0.70, sensitivity: 0.89, specificity: 0.40, IDI: −0.49). The Kaplan-Meier curves showed that the proposed risk prediction model allowed for better risk stratification.
Conclusion: The MBF + GLS model allowed more accurate risk stratification of STEMI after PCI than visual qualitative analysis. The DNN-assisted MCE quantitative analysis is an objective, efficient, and reproducible method to evaluate microvascular perfusion.
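To make the threshold-based figures concrete, here is a minimal sketch of how sensitivity and specificity follow from predicted risks and a cut-off such as the 40% threshold mentioned above. The data are toy values invented for illustration, not the study's patients, and treating a risk equal to the threshold as positive is an assumption.

```python
import numpy as np

def sensitivity_specificity(risk, event, threshold=0.40):
    """Sensitivity and specificity of the rule 'risk >= threshold'.

    risk:  predicted event probabilities, one per patient
    event: 1 if the patient had an event, else 0
    """
    risk = np.asarray(risk, dtype=float)
    event = np.asarray(event, dtype=bool)
    predicted = risk >= threshold  # assumed convention: ties count as positive
    tp = np.sum(predicted & event)
    fn = np.sum(~predicted & event)
    tn = np.sum(~predicted & ~event)
    fp = np.sum(predicted & ~event)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data only: six hypothetical patients, three with an event.
toy_risk = [0.10, 0.35, 0.45, 0.80, 0.60, 0.20]
toy_event = [0, 0, 0, 1, 1, 1]
sens, spec = sensitivity_specificity(toy_risk, toy_event)
print(round(sens, 3), round(spec, 3))
```

Raising the threshold generally trades sensitivity for specificity, which is why the abstract reports both values at one chosen operating point.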