Improvements to context based self-supervised learning
We develop a set of methods to improve on the results of self-supervised
learning using context. We start from a baseline of patch-based arrangement
context learning and build on it. Our methods address overt problems such as
chromatic aberration as well as other potential problems such as spatial skew
and mid-level feature neglect. To avoid compromising generalization tests on
the common self-supervised benchmarks, we use different datasets during our
development. Our methods combined yield top scores on all standard
self-supervised benchmarks, including classification and detection on PASCAL
VOC 2007, segmentation on PASCAL VOC 2012, and "linear tests" on the ImageNet
and CSAIL Places datasets. We obtain an improvement over our baseline method of
between 4.0 and 7.1 percentage points on transfer learning classification
tests. We also show results on different standard network architectures to
demonstrate generalization as well as portability. All data, models and
programs are available at: https://gdo-datasci.llnl.gov/selfsupervised/.
Comment: Accepted paper at CVPR 201
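The baseline pretext task referenced above is relative patch-position prediction from context learning. The following PyTorch sketch is a minimal, assumed instantiation (architecture, patch size, and label count are illustrative; the color-jitter note in the comments is one common guard against the chromatic-aberration shortcut the abstract mentions), not the paper's released code.

```python
import torch
import torch.nn as nn

class RelativePatchNet(nn.Module):
    """Sketch of a patch-arrangement pretext task: given a center patch and
    one of its 8 neighbors, predict the neighbor's relative position."""
    def __init__(self, num_positions=8):
        super().__init__()
        # Shared encoder applied to each patch (small illustrative CNN).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Classifier over the concatenated pair of patch embeddings.
        self.head = nn.Linear(64 * 2, num_positions)

    def forward(self, center, neighbor):
        z = torch.cat([self.encoder(center), self.encoder(neighbor)], dim=1)
        return self.head(z)  # logits over the 8 relative positions

# Toy usage with random 96x96 patches; in practice, jittering color
# channels independently helps suppress chromatic-aberration cues.
model = RelativePatchNet()
center, neighbor = torch.randn(16, 3, 96, 96), torch.randn(16, 3, 96, 96)
labels = torch.randint(0, 8, (16,))
loss = nn.CrossEntropyLoss()(model(center, neighbor), labels)
```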
A Knowledge-based Learning Framework for Self-supervised Pre-training Towards Enhanced Recognition of Medical Images
Self-supervised pre-training has become the preferred choice for establishing
reliable models for automated recognition of massive medical images, which are
routinely annotation-free, lack semantics, and carry no guarantee of quality.
This paradigm is still in its infancy, however, and is limited by closely
related open issues: 1) how to learn robust representations in an unsupervised
manner from unlabelled medical images with low sample diversity, and 2) how to
obtain the most significant representations demanded by high-quality
segmentation? Aiming at these issues, this study proposes a knowledge-based
learning framework towards enhanced recognition of medical images, which works
in three phases by synergizing contrastive learning and generative learning
models: 1) Sample Space Diversification: reconstructive proxy tasks embed a
priori knowledge, with context highlighted, to diversify the expanded sample
space; 2) Enhanced Representation Learning: an informative
noise-contrastive estimation loss regularizes the encoder to enhance
representation learning of annotation-free images; 3) Correlated Optimization:
Optimization operations in pre-training the encoder and the decoder have been
correlated via image restoration from proxy tasks, targeting the need for
semantic segmentation. Extensive experiments have been performed on various
public medical image datasets (e.g., CheXpert and DRIVE) against the
state-of-the-art counterparts (e.g., SimCLR and MoCo), and results demonstrate
that the proposed framework statistically excels on self-supervised
benchmarks, achieving improvements of 2.08, 1.23, 1.12, 0.76, and 1.38
percentage points over SimCLR in AUC/Dice, and achieves label-efficient
semi-supervised learning, e.g., reducing the annotation cost by up to 99% in
pathological classification.
Comment: 10 pages, 9 figures, 3 tables, submitted to IEEE-TM
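For reference on phase 2, a noise-contrastive estimation loss of this kind is commonly instantiated as InfoNCE over paired views of each image. The sketch below is a generic PyTorch version under that assumption (the temperature value is illustrative, not the paper's setting).

```python
import torch
import torch.nn.functional as F

def info_nce(queries, keys, temperature=0.07):
    """Generic InfoNCE sketch: queries[i] and keys[i] are embeddings of two
    views of the same image; every other key serves as a negative."""
    q = F.normalize(queries, dim=1)
    k = F.normalize(keys, dim=1)
    logits = q @ k.t() / temperature   # (N, N) cosine-similarity matrix
    targets = torch.arange(q.size(0))  # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Example: 8 pairs of 128-d embeddings produced by an encoder.
loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```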
Graph Based Semi-supervised Learning with Convolution Neural Networks to Classify Crisis Related Tweets
During time-critical situations such as natural disasters, rapid
classification of data posted on social networks by affected people is useful
for humanitarian organizations to gain situational awareness and to plan
response efforts. However, the scarcity of labeled data in the early hours of a
crisis hinders machine learning tasks and thus delays crisis response. In this
work, we propose an inductive semi-supervised technique to utilize unlabeled
data, which is often abundant at the onset of a crisis event, along with a
small amount of labeled data. Specifically, we adopt a graph-based deep learning
framework to learn an inductive semi-supervised model. We use two real-world
crisis datasets from Twitter to evaluate the proposed approach. Our results
show significant improvements using unlabeled data as compared to only using
labeled data.
Comment: 5 pages. arXiv admin note: substantial text overlap with arXiv:1805.0515
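The abstract does not spell out the model, so as a generic illustration of graph-based semi-supervised learning, the sketch below propagates tweet features over a similarity graph and computes the loss only on the few labeled nodes; the layer design, graph construction, and dimensions are all assumptions for illustration, not the paper's framework.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphConv(nn.Module):
    """One graph-convolution layer: mix each node's features with its
    neighbors' via a normalized adjacency matrix, then apply a linear map."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj_norm):
        return self.lin(adj_norm @ x)

# Toy setup: 6 tweets with 16-d features, a symmetric similarity graph,
# and labels for only the first 2 tweets (the semi-supervised regime).
x = torch.randn(6, 16)
edges = torch.rand(6, 6).round()
adj = torch.eye(6) + ((edges + edges.t()) > 0).float()  # self-loops + edges
adj_norm = torch.diag(1.0 / adj.sum(dim=1)) @ adj       # row-normalize

layer1, layer2 = GraphConv(16, 8), GraphConv(8, 2)      # 2 output classes
logits = layer2(F.relu(layer1(x, adj_norm)), adj_norm)

labels = torch.tensor([0, 1])               # only labeled nodes enter the loss,
loss = F.cross_entropy(logits[:2], labels)  # but unlabeled neighbors still
                                            # shape their representations
```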