5 research outputs found
COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers
We present COMEDIAN, a novel pipeline to initialize spatio-temporal
transformers for action spotting, which involves self-supervised learning and
knowledge distillation. Action spotting is a timestamp-level temporal action
detection task. Our pipeline consists of three steps, with two initialization
stages. First, we perform self-supervised initialization of a spatial
transformer using short videos as input. Additionally, we initialize a temporal
transformer that enhances the spatial transformer's outputs with global context
through knowledge distillation from a pre-computed feature bank aligned with
each short video segment. In the final step, we fine-tune the transformers to
the action spotting task. The experiments, conducted on the SoccerNet-v2
dataset, demonstrate state-of-the-art performance and validate the
effectiveness of COMEDIAN's pretraining paradigm. Our results highlight several
advantages of our pretraining pipeline, including improved performance and
faster convergence compared to non-pretrained models.
Comment: Source code is available here: https://github.com/juliendenize/eztorc
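To make the three-step pipeline above concrete, here is a minimal PyTorch sketch of a COMEDIAN-style setup: a spatial transformer initialized with a self-supervised objective, a temporal transformer distilled towards a pre-computed feature bank, and a final action-spotting head for fine-tuning. Module sizes, the specific self-supervised loss, and the class count are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal sketch of a COMEDIAN-style three-step pipeline (illustrative assumptions only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialTransformer(nn.Module):
    """Encodes each frame's patch tokens into one embedding per frame (assumed layout)."""

    def __init__(self, dim=256, depth=2, heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, patch_tokens):                    # (batch, frames, patches, dim)
        b, t, p, d = patch_tokens.shape
        x = self.encoder(patch_tokens.flatten(0, 1))    # (batch*frames, patches, dim)
        return x.mean(dim=1).view(b, t, -1)             # one embedding per frame


class TemporalTransformer(nn.Module):
    """Adds temporal context over the per-frame embeddings."""

    def __init__(self, dim=256, depth=2, heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, frame_tokens):                    # (batch, frames, dim)
        return self.encoder(frame_tokens)


spatial, temporal = SpatialTransformer(), TemporalTransformer()
clip = torch.randn(2, 8, 16, 256)                       # fake short clip: 8 frames x 16 patch tokens

# Step 1 (assumed objective): self-supervised initialization of the spatial
# transformer, e.g. by making two augmented views of the same clip agree.
view_a, view_b = spatial(clip), spatial(clip + 0.1 * torch.randn_like(clip))
ssl_loss = 1 - F.cosine_similarity(view_a, view_b, dim=-1).mean()

# Step 2: knowledge distillation of the temporal transformer towards a
# pre-computed feature bank aligned with the same clip (random stand-in here).
feature_bank = torch.randn(2, 8, 256)
distill_loss = F.mse_loss(temporal(spatial(clip)), feature_bank)

# Step 3: fine-tune both transformers with an action-spotting head that
# predicts a class per timestamp (the class count of 18 is a placeholder).
head = nn.Linear(256, 18)
logits = head(temporal(spatial(clip)))                  # (batch, frames, classes)
spotting_loss = F.cross_entropy(logits.flatten(0, 1), torch.randint(0, 18, (2 * 8,)))
print(ssl_loss.item(), distill_loss.item(), spotting_loss.item())
```

In practice each step would drive its own optimization loop; the sketch only shows how the spatial and temporal transformers, the feature bank, and the spotting head fit together.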
SoccerNet 2023 Challenges Results
Peer reviewed
The SoccerNet 2023 challenges were the third annual video understanding
challenges organized by the SoccerNet team. For this third edition, the
challenges were composed of seven vision-based tasks split into three main
themes. The first theme, broadcast video understanding, is composed of three
high-level tasks related to describing events occurring in the video
broadcasts: (1) action spotting, focusing on retrieving all timestamps related
to global actions in soccer, (2) ball action spotting, focusing on retrieving
all timestamps related to the soccer ball change of state, and (3) dense video
captioning, focusing on describing the broadcast with natural language and
anchored timestamps. The second theme, field understanding, relates to the
single task of (4) camera calibration, focusing on retrieving the intrinsic and
extrinsic camera parameters from images. The third and last theme, player
understanding, is composed of three low-level tasks related to extracting
information about the players: (5) re-identification, focusing on retrieving
the same players across multiple views, (6) multiple object tracking, focusing
on tracking players and the ball through unedited video streams, and (7) jersey
number recognition, focusing on recognizing the jersey number of players from
tracklets. Compared to the previous editions of the SoccerNet challenges, tasks
(2-3-7) are novel, including new annotations and data, task (4) was enhanced
with more data and annotations, and task (6) now focuses on end-to-end
approaches. More information on the tasks, challenges, and leaderboards is
available at https://www.soccer-net.org. Baselines and development kits can be
found at https://github.com/SoccerNet.
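As a brief illustration of task (4), "intrinsic and extrinsic camera parameters" refer to the pinhole camera model: the extrinsics map a world point into camera coordinates and the intrinsic matrix K maps it to pixels. The sketch below uses made-up values and is not tied to the SoccerNet calibration format.

```python
# Illustrative pinhole projection: the quantities recovered by camera calibration.
# All numbers are made up; they are not SoccerNet ground truth.
import numpy as np

K = np.array([[1000.0, 0.0, 960.0],       # intrinsics: focal lengths and principal point
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                              # extrinsic rotation (world -> camera)
t = np.array([0.0, 0.0, 50.0])             # extrinsic translation in metres

point_world = np.array([10.0, 5.0, 0.0])   # a point on the pitch plane
point_cam = R @ point_world + t            # extrinsics: world -> camera coordinates
u, v, w = K @ point_cam                    # intrinsics: camera coordinates -> pixels
print(u / w, v / w)                        # pixel coordinates of the projected point
```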
COMEDIAN: Self-supervised learning and knowledge distillation for action spotting using transformers
International audience
We present COMEDIAN, a novel pipeline to initialize spatiotemporal transformers for action spotting, which involves self-supervised learning and knowledge distillation. Action spotting is a timestamp-level temporal action detection task. Our pipeline consists of three steps, with two initialization stages. First, we perform self-supervised initialization of a spatial transformer using short videos as input. Additionally, we initialize a temporal transformer that enhances the spatial transformer's outputs with global context through knowledge distillation from a pre-computed feature bank aligned with each short video segment. In the final step, we fine-tune the transformers to the action spotting task. The experiments, conducted on the SoccerNet-v2 dataset, demonstrate state-of-the-art performance and validate the effectiveness of COMEDIAN's pretraining paradigm. Our results highlight several advantages of our pretraining pipeline, including improved performance and faster convergence compared to non-pretrained models.
Self-Supervised Representation Learning using Visual Field Expansion on Digital Pathology
International audience
The examination of histopathology images is considered to be the gold standard for the diagnosis and stratification of cancer patients. A key challenge in the analysis of such images is their size, which can run into the gigapixels and can require tedious screening by clinicians. With the recent advances in computational medicine, automatic tools have been proposed to assist clinicians in their everyday practice. Such tools typically process these large images by slicing them into tiles that can then be encoded and utilized for different clinical models. In this study, we propose a novel generative framework that can learn powerful representations for such tiles by learning to plausibly expand their visual field. In particular, we developed a progressively grown generative model with the objective of visual field expansion. Thus trained, our model learns to generate different tissue types with fine details, while simultaneously learning powerful representations that can be used for different clinical endpoints, all in a self-supervised way. To evaluate the performance of our model, we conducted classification experiments on CAMELYON17 and CRC benchmark datasets, comparing favorably to other self-supervised and pre-trained strategies that are commonly used in digital pathology. Our code is available at https://github.com/jcboyd/cdpath21-gan.
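The sketch below illustrates the visual-field-expansion idea in PyTorch: an encoder compresses a small centre tile, a decoder tries to generate a larger field of view around it, and the encoder output doubles as the tile representation for downstream tasks. The architecture sizes and the plain reconstruction loss are illustrative assumptions; the paper itself uses a progressively grown generative (GAN-style) model.

```python
# Minimal sketch of visual field expansion as a self-supervised pretext task
# (illustrative architecture and loss, not the paper's exact GAN setup).
import torch
import torch.nn as nn
import torch.nn.functional as F


class TileEncoder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim),
        )

    def forward(self, x):
        return self.net(x)                       # (batch, dim) tile representation


class FieldExpander(nn.Module):
    """Decodes the tile representation into a larger (expanded) field of view."""

    def __init__(self, dim=128):
        super().__init__()
        self.fc = nn.Linear(dim, 64 * 8 * 8)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # 8 -> 16
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 32
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 64
            nn.ConvTranspose2d(3, 3, 4, stride=2, padding=1),               # 64 -> 128
        )

    def forward(self, z):
        return self.net(self.fc(z).view(-1, 64, 8, 8))    # (batch, 3, 128, 128)


encoder, expander = TileEncoder(), FieldExpander()
wide_tile = torch.rand(4, 3, 128, 128)           # fake 128x128 histopathology tiles
centre = wide_tile[:, :, 32:96, 32:96]           # 64x64 centre crop seen by the encoder

# Self-supervised objective: reconstruct the full field of view from the centre crop.
z = encoder(centre)
loss = F.mse_loss(expander(z), wide_tile)
loss.backward()

# Downstream use: freeze the encoder and feed its embeddings to a clinical classifier.
tile_features = encoder(centre).detach()         # (4, dim) ready for a linear probe
print(tile_features.shape, loss.item())
```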
SoccerNet 2023 challenges results
The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, focusing on retrieving all timestamps related to global actions in soccer, (2) ball action spotting, focusing on retrieving all timestamps related to the soccer ball change of state, and (3) dense video captioning, focusing on describing the broadcast with natural language and anchored timestamps. The second theme, field understanding, relates to the single task of (4) camera calibration, focusing on retrieving the intrinsic and extrinsic camera parameters from images. The third and last theme, player understanding, is composed of three low-level tasks related to extracting information about the players: (5) re-identification, focusing on retrieving the same players across multiple views, (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams, and (7) jersey number recognition, focusing on recognizing the jersey number of players from tracklets. Compared to the previous editions of the SoccerNet challenges, tasks (2-3-7) are novel, including new annotations and data, task (4) was enhanced with more data and annotations, and task (6) now focuses on end-to-end approaches. More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org. Baselines and development kits can be found at https://github.com/SoccerNet.