Search CORE

18 research outputs found

RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning

Author: Chen Peihao
Gan Chuang
He Dongliang
Huang Deng
Long Xiang
Tan Mingkui
Wen Shilei
Zeng Runhao
Publication venue
Publication date: 15/03/2021
Field of study

We study unsupervised video representation learning that seeks to learn both motion and appearance features from unlabeled video only, which can be reused for downstream tasks such as action recognition. This task, however, is extremely challenging due to 1) the highly complex spatial-temporal information in videos; and 2) the lack of labeled data for training. Unlike the representation learning for static images, it is difficult to construct a suitable self-supervised task to well model both motion and appearance features. More recently, several attempts have been made to learn video representation through video playback speed prediction. However, it is non-trivial to obtain precise speed labels for the videos. More critically, the learnt models may tend to focus on motion pattern and thus may not learn appearance features well. In this paper, we observe that the relative playback speed is more consistent with motion pattern, and thus provide more effective and stable supervision for representation learning. Therefore, we propose a new way to perceive the playback speed and exploit the relative speed between two video clips as labels. In this way, we are able to well perceive speed and learn better motion features. Moreover, to ensure the learning of appearance features, we further propose an appearance-focused task, where we enforce the model to perceive the appearance difference between two video clips. We show that optimizing the two tasks jointly consistently improves the performance on two downstream tasks, namely action recognition and video retrieval. Remarkably, for action recognition on UCF101 dataset, we achieve 93.7% accuracy without the use of labeled data for pre-training, which outperforms the ImageNet supervised pre-trained model. Code and pre-trained models can be found at https://github.com/PeihaoChen/RSPNet.Comment: Accepted by AAAI-2021. Code and pre-trained models can be found at https://github.com/PeihaoChen/RSPNe

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

A Survey on Metric Learning for Feature Vectors and Structured Data

Author: Bellet Aurélien
Habrard Amaury
Sebban Marc
Publication venue
Publication date: 01/01/2013
Field of study

The need for appropriate ways to measure the distance or similarity between data is ubiquitous in machine learning, pattern recognition and data mining, but handcrafting such good metrics for specific problems is generally difficult. This has led to the emergence of metric learning, which aims at automatically learning a metric from data and has attracted a lot of interest in machine learning and related fields for the past ten years. This survey paper proposes a systematic review of the metric learning literature, highlighting the pros and cons of each approach. We pay particular attention to Mahalanobis distance metric learning, a well-studied and successful framework, but additionally present a wide range of methods that have recently emerged as powerful alternatives, including nonlinear metric learning, similarity learning and local metric learning. Recent trends and extensions, such as semi-supervised metric learning, metric learning for histogram data and the derivation of generalization guarantees, are also covered. Finally, this survey addresses metric learning for structured data, in particular edit distance learning, and attempts to give an overview of the remaining challenges in metric learning for the years to come.Comment: Technical report, 59 pages. Changes in v2: fixed typos and improved presentation. Changes in v3: fixed typos. Changes in v4: fixed typos and new method

arXiv.org e-Print Archive

HAL-UJM

Recommended from our members

The digital music lab: A big data infrastructure for digital musicology

Author: Abdallah S.
Al-Shakarchi E.
Auer S.
Barthet M.
Bello J. P.
Daniel Wolff
Duval E.
Emmanouil Benetos
Gartner R.
Grosche P.
Mauch M.
Moretti F.
Nicolas Gold
Noland K.
Porter A.
Porter A.
Raimond Y.
Samer Abdallah
Schöch C.
Schörkhuber C.
Steven Hargreaves
Tillman Weyde
Tzanetakis G.
Volk A.
Zaharia M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 25/08/2016
Field of study

In musicology and music research generally, the increasing availability of digital music, storage capacities, and computing power enable and require new and intelligent systems. In the transition from traditional to digital musicology, many techniques and tools have been developed for the analysis of individual pieces of music, but large-scale music data that are increasingly becoming available require research methods and systems that work on the collection-level and at scale. Although many relevant algorithms have been developed during the past 15 years of research in Music Information Retrieval, an integrated system that supports large-scale digital musicology research has so far been lacking. In the Digital Music Lab (DML) project, a collaboration among music librarians, musicologists, computer scientists, and human-computer interface specialists, the DML software system has been developed that fills this gap by providing intelligent large-scale music analysis with a user-friendly interactive interface supporting musicologists in their exploration and enquiry. The DML system empowers musicologists by addressing several challenges: distributed processing of audio and other music data, management of the data analysis process and results, remote analysis of data under copyright, logical inference on the extracted information and metadata, and visual web-based interfaces for exploring and querying the music collections. The DML system is scalable and based on SemanticWeb technology and integrates into Linked Data with the vision of a distributed system that enables music research across archives, libraries, and other providers of music data. A first DML system prototype has been set up in collaboration with the British Library and I Like Music Ltd. This system has been used to analyse a diverse corpus of currently 250,000 music tracks. In this article, we describe the DML system requirements, design, architecture, components, and available data sources, explaining their interaction. We report use cases and applications with initial evaluations of the proposed system

City Research Online

Crossref

UCL Discovery

Queen Mary Research Online