MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations
Contrastive self-supervised learning has gained attention for its ability to
create high-quality representations from large unlabelled data sets. A key
reason that these powerful features enable data-efficient learning of
downstream tasks is that they provide augmentation invariance, which is often a
useful inductive bias. However, the amount and type of invariances preferred is
not known a priori, and varies across different downstream tasks. We therefore
propose a multi-task self-supervised framework (MT-SLVR) that learns both
variant and invariant features in a parameter-efficient manner. Our multi-task
representation provides a strong and flexible feature that benefits diverse
downstream tasks. We evaluate our approach on few-shot classification tasks
drawn from a variety of audio domains and demonstrate improved classification
performance on all of them.
Comment: Last author version accepted to InterSpeech23. 5 pages
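The abstract's core idea, learning one head that is invariant to augmentations alongside one that stays variant (i.e. predicts the applied transformation) over a shared backbone, can be illustrated with a minimal numpy sketch. All function names, shapes, and the toy data below are illustrative assumptions, not the paper's MT-SLVR implementation: the invariance term is a simplified contrastive loss over positive pairs, and the variance term is a linear augmentation-classification head.

```python
import numpy as np

rng = np.random.default_rng(0)

def contrastive_invariance_loss(z1, z2, temp=0.5):
    """Simplified contrastive loss over positive pairs (illustrative,
    not the paper's exact objective). z1, z2: (N, D) embeddings of two
    augmented views; matching rows are positives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temp                        # (N, N) similarity matrix
    # Cross-entropy with the diagonal (true pair) as the label.
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return float(np.mean(logsumexp - np.diag(sim)))

def aug_prediction_loss(z, aug_labels, W):
    """Transformation-prediction (variant) head: a linear classifier
    must recover which augmentation was applied, so the features are
    pushed to retain augmentation information."""
    logits = z @ W                                # (N, n_augs)
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(z)), aug_labels]))

# Fake shared-backbone features for 8 clips, two augmented views each.
N, D, n_augs = 8, 16, 4
z_view1 = rng.normal(size=(N, D))
z_view2 = z_view1 + 0.1 * rng.normal(size=(N, D))  # views mostly agree
aug_labels = rng.integers(0, n_augs, size=N)
W = rng.normal(size=(D, n_augs))                   # hypothetical variant head

total = (contrastive_invariance_loss(z_view1, z_view2)
         + aug_prediction_loss(z_view1, aug_labels, W))
print(total)
```

Minimising the sum of the two terms is what makes the representation "multi-task": a downstream task can then draw on whichever mixture of invariant and variant features suits it.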
MetaAudio: A Few-Shot Audio Classification Benchmark
Currently available benchmarks for few-shot learning (machine learning with
few training examples) are limited in the domains they cover, primarily
focusing on image classification. This work aims to alleviate this reliance on
image-based benchmarks by offering the first comprehensive, public and fully
reproducible audio based alternative, covering a variety of sound domains and
experimental settings. We compare the few-shot classification performance of a
variety of techniques on seven audio datasets (spanning environmental sounds to
human-speech). Extending this, we carry out in-depth analyses of joint training
(where all datasets are used during training) and cross-dataset adaptation
protocols, establishing the possibility of a generalised audio few-shot
classification algorithm. Our experimentation shows gradient-based
meta-learning methods such as MAML and Meta-Curvature consistently outperform
both metric and baseline methods. We also demonstrate that the joint training
routine helps overall generalisation for the environmental sound databases
included, as well as being a somewhat-effective method of tackling the
cross-dataset/domain setting.
Comment: 9 pages with 1 figure and 2 main results tables. V1 Preprint
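The gradient-based meta-learning the abstract highlights (MAML-style) can be sketched on a toy problem. Everything below is an illustrative assumption rather than the benchmark's actual pipeline: a linear model on synthetic regression tasks, one inner adaptation step per task, and a first-order outer update averaged over tasks.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_and_grad(w, X, y):
    """Squared-error loss and its gradient for a linear model X @ w."""
    err = X @ w - y
    return float(np.mean(err ** 2)), 2.0 * X.T @ err / len(y)

def maml_step(w, tasks, inner_lr=0.05, outer_lr=0.01):
    """One first-order MAML outer update: adapt per task with a single
    inner gradient step on the support set, then average the
    post-adaptation query-set gradients."""
    outer_grad = np.zeros_like(w)
    for X_s, y_s, X_q, y_q in tasks:
        _, g = loss_and_grad(w, X_s, y_s)          # inner step (support)
        w_adapt = w - inner_lr * g
        _, g_q = loss_and_grad(w_adapt, X_q, y_q)  # evaluate (query)
        outer_grad += g_q                          # first-order approximation
    return w - outer_lr * outer_grad / len(tasks)

def post_adapt_loss(w, tasks, inner_lr=0.05):
    """Mean query loss after one inner adaptation step per task."""
    losses = []
    for X_s, y_s, X_q, y_q in tasks:
        _, g = loss_and_grad(w, X_s, y_s)
        losses.append(loss_and_grad(w - inner_lr * g, X_q, y_q)[0])
    return float(np.mean(losses))

def make_task(d=5, n=20):
    """Toy task family: each task draws its own true weight vector."""
    w_true = rng.normal(size=d)
    X = rng.normal(size=(2 * n, d))
    y = X @ w_true + 0.01 * rng.normal(size=2 * n)
    return X[:n], y[:n], X[n:], y[n:]  # support / query split

w = np.zeros(5)
tasks = [make_task() for _ in range(8)]
before = post_adapt_loss(w, tasks)
for _ in range(200):
    w = maml_step(w, tasks)
after = post_adapt_loss(w, tasks)
print(before, after)
```

The meta-objective is the post-adaptation loss, so after meta-training the model should adapt to a new-task support set in a single step far better than the initial weights did; this is the property that makes such methods strong few-shot baselines.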