When can unlabeled data improve the learning rate?
Göpfert C, Ben-David S, Bousquet O, Gelly S, Tolstikhin I, Urner R. When can unlabeled data improve the learning rate? In: Conference on Learning Theory (COLT). 2019.
On Linear Separation Capacity of Self-Supervised Representation Learning
Recent advances in self-supervised learning have highlighted the efficacy of
data augmentation in learning data representations from unlabeled data.
Training a linear model atop these learned representations can yield an
effective classifier. Despite the remarkable empirical performance, the
underlying mechanisms by which data augmentation unravels nonlinear data
structures into linearly separable representations remain elusive. This paper
seeks to bridge this gap by investigating under what conditions learned
representations can linearly separate manifolds when data are drawn from a
multi-manifold model. Our investigation reveals that data augmentation offers
information beyond the observed data and can thereby improve the
information-theoretic optimal rate of linear separation capacity. In
particular, we show that self-supervised learning can linearly separate
manifolds with a smaller separation distance than unsupervised learning,
highlighting the additional benefit of data augmentation. Our theoretical
analysis further shows that the performance of downstream linear classifiers
hinges primarily on the linear separability of the data representations rather
than on the size of the labeled data set, reaffirming the viability of
building efficient classifiers from limited labeled data alongside an
expansive unlabeled data set.
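To make the linear-probe protocol concrete, here is a minimal, self-contained sketch: a toy two-manifold data set (two noisy concentric circles, not linearly separable in the input space), a fixed nonlinear feature map standing in for a pretrained encoder (the `encode` function is a hypothetical placeholder, not the paper's method), and a linear classifier fit on only a small labeled subset.

```python
# Minimal sketch of the linear-probe protocol described above: a frozen
# nonlinear feature map stands in for a pretrained encoder, and only a
# linear classifier is fit on a small labeled subset. `encode` is a
# hypothetical placeholder, not the paper's method.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def sample_circle(n, radius, label):
    # Toy "manifold": a noisy circle of the given radius.
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    x = np.stack([radius * np.cos(theta), radius * np.sin(theta)], axis=1)
    return x + 0.05 * rng.standard_normal(x.shape), np.full(n, label)

# Two concentric circles: not linearly separable in the raw input space.
X0, y0 = sample_circle(500, 1.0, 0)
X1, y1 = sample_circle(500, 2.0, 1)
X, y = np.vstack([X0, X1]), np.concatenate([y0, y1])

def encode(x):
    # Stand-in for a learned representation: appending the squared norm
    # makes the two circles linearly separable in feature space.
    return np.column_stack([x, (x ** 2).sum(axis=1)])

# Linear probe: only 50 labeled points are used to fit the classifier.
idx = rng.permutation(len(X))
train, test = idx[:50], idx[50:]
probe = LogisticRegression().fit(encode(X[train]), y[train])
print("probe accuracy:", accuracy_score(y[test], probe.predict(encode(X[test]))))
```

With a representation in which the classes are linearly separable, the probe reaches near-perfect accuracy despite seeing only 50 labels, mirroring the abstract's point that separability of the representation, not labeled-set size, drives downstream performance.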
Self-supervised Contrastive Representation Learning for Semi-supervised Time-Series Classification
Learning time-series representations when only unlabeled data or few labeled
samples are available is a challenging task. Recently, contrastive
self-supervised learning has shown great promise in extracting useful
representations from unlabeled data by contrasting different augmented views
of the data. In this work, we propose a novel Time-Series representation
learning framework via Temporal and Contextual Contrasting (TS-TCC) that
learns representations from unlabeled data with contrastive learning.
Specifically, we propose time-series-specific weak and strong augmentations
and use their views to learn robust temporal relations in the proposed
temporal contrasting module, and to learn discriminative representations in
the proposed contextual contrasting module. Additionally, we conduct a
systematic study of time-series data augmentation selection, a key component
of contrastive learning. We also extend TS-TCC to the semi-supervised setting
and propose Class-Aware TS-TCC (CA-TCC), which exploits the few available
labeled samples to further improve the representations learned by TS-TCC.
Specifically, we leverage the robust pseudo labels produced by TS-TCC to
construct a class-aware contrastive loss. Extensive experiments show that
linear evaluation of the features learned by our framework performs comparably
with fully supervised training. Additionally, our framework is highly
efficient in few-labeled-data and transfer-learning scenarios. The code is publicly
available at \url{https://github.com/emadeldeen24/CA-TCC}.
Comment: Accepted in the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
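For a concrete picture of the contrastive setup this abstract describes, the sketch below trains a toy encoder on weak and strong augmented views of a time series with an NT-Xent-style loss. It is a simplified illustration under assumed choices (jitter and segment permutation as the augmentations, a small 1D-conv encoder), not the authors' TS-TCC/CA-TCC implementation; see the linked repository for that.

```python
# Simplified sketch of contrastive pretraining on weak/strong augmented
# views of a time series, with an NT-Xent-style loss. An illustrative
# toy under assumed choices, NOT the authors' TS-TCC/CA-TCC code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def weak_augment(x):
    # Weak view: small additive jitter.
    return x + 0.05 * torch.randn_like(x)

def strong_augment(x):
    # Strong view: permute time segments, then add larger jitter.
    segments = list(x.chunk(4, dim=-1))
    perm = torch.randperm(len(segments))
    x = torch.cat([segments[i] for i in perm], dim=-1)
    return x + 0.1 * torch.randn_like(x)

encoder = nn.Sequential(  # toy 1D-conv encoder, (B, 1, T) -> (B, 64)
    nn.Conv1d(1, 32, kernel_size=7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, 64),
)

def nt_xent(z1, z2, tau=0.2):
    # Each sample's two views are positives; all other samples in the
    # batch act as negatives.
    z = F.normalize(torch.cat([z1, z2]), dim=1)
    sim = z @ z.t() / tau
    n = z1.size(0)
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
x = torch.randn(16, 1, 128)  # (batch, channels, time)
opt.zero_grad()
loss = nt_xent(encoder(weak_augment(x)), encoder(strong_augment(x)))
loss.backward()
opt.step()
print("contrastive loss:", loss.item())
```

After pretraining, the encoder would be frozen and evaluated with a linear classifier on its features, the "linear evaluation" protocol the abstract refers to.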