Search CORE

1,753 research outputs found

Self-Supervised Representation Learning for Detection of ACL Tear Injury in Knee MR Videos

Author: Bhattacharya Saumik
Manna Siladittya
Pal Umapada
Publication venue
Publication date: 14/12/2020
Field of study

The success of deep learning based models for computer vision applications requires large scale human annotated data which are often expensive to generate. Self-supervised learning, a subset of unsupervised learning, handles this problem by learning meaningful features from unlabeled image or video data. In this paper, we propose a self-supervised learning approach to learn transferable features from MR video clips by enforcing the model to learn anatomical features. The pretext task models are designed to predict the correct ordering of the jumbled image patches that the MR video frames are divided into. To the best of our knowledge, none of the supervised learning models performing injury classification task from MR video provide any explanation for the decisions made by the models and hence makes our work the first of its kind on MR video data. Experiments on the pretext task show that this proposed approach enables the model to learn spatial context invariant features which help for reliable and explainable performance in downstream tasks like classification of Anterior Cruciate Ligament tear injury from knee MRI. The efficiency of the novel Convolutional Neural Network proposed in this paper is reflected in the experimental results obtained in the downstream task

arXiv.org e-Print Archive

Self-supervised Co-training for Video Representation Learning

Author: Han Tengda
Xie Weidi
Zisserman Andrew
Publication venue
Publication date: 01/01/2020
Field of study

The objective of this paper is visual-only self-supervised video representation learning. We make the following contributions: (i) we investigate the benefit of adding semantic-class positives to instance-based Info Noise Contrastive Estimation (InfoNCE) training, showing that this form of supervised contrastive learning leads to a clear improvement in performance; (ii) we propose a novel self-supervised co-training scheme to improve the popular infoNCE loss, exploiting the complementary information from different views, RGB streams and optical flow, of the same data source by using one view to obtain positive class samples for the other; (iii) we thoroughly evaluate the quality of the learnt representation on two different downstream tasks: action recognition and video retrieval. In both cases, the proposed approach demonstrates state-of-the-art or comparable performance with other self-supervised approaches, whilst being significantly more efficient to train, i.e. requiring far less training data to achieve similar performance.Comment: NeurIPS202

arXiv.org e-Print Archive

Oxford University Research Archive