13 research outputs found

Unified Embedding and Metric Learning for Zero-Exemplar Event Detection

Authors: Gavves Efstratios; Hussein Noureldien; Smeulders Arnold W. M.
Publication date: 01/01/2017

Event detection in unconstrained videos is conceived as a content-based video retrieval task with two modalities: textual and visual. Given a text describing a novel event, the goal is to rank related videos accordingly. This task is zero-exemplar: no video examples are given for the novel event. Related works train a bank of concept detectors on external data sources. These detectors predict confidence scores for test videos, which are ranked and retrieved accordingly. In contrast, we learn a joint space in which the visual and textual representations are embedded. The space casts a novel event as a probability of pre-defined events. Also, it learns to measure the distance between an event and its related videos. Our model is trained end-to-end on the publicly available EventNet. When applied to the TRECVID Multimedia Event Detection dataset, it outperforms the state-of-the-art by a considerable margin.

Comment: IEEE CVPR 201
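The retrieval step described in this abstract (embed the text query and the videos in one space, then rank videos by closeness to the query) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the two-dimensional embeddings are made up, the names `cosine_similarity` and `rank_videos` are hypothetical, and plain cosine similarity stands in for the learned distance the abstract mentions. It is not the authors' model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_videos(query_embedding, video_embeddings):
    """Return video ids sorted from most to least similar to the query.

    Assumption: query and videos already live in the same joint space.
    """
    scores = {
        video_id: cosine_similarity(query_embedding, embedding)
        for video_id, embedding in video_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy, hand-made embeddings (hypothetical values, not real features).
query = [1.0, 0.0]
videos = {
    "video_a": [0.9, 0.1],
    "video_b": [0.0, 1.0],
    "video_c": [0.5, 0.5],
}
ranking = rank_videos(query, videos)
print(ranking)
```

In a zero-exemplar setting the query embedding would come from the event's text description alone, so no video of the novel event is needed to produce the ranking.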

arXiv.org e-Print Archive

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Video2vec Embeddings Recognize Events when Examples are Scarce

Authors: Habibian A.; Mensink T.; Snoek C.G.M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2017

International Migration, Integration and Social Cohesion online publications

Aspects of time for recognizing human activities

Author: Hussein N.M.E.
Publication date: 01/01/2021

International Migration, Integration and Social Cohesion online publications

Objects for spatio-temporal activity recognition in videos

Author: Mettes P.S.M.
Publication date: 01/01/2017

International Migration, Integration and Social Cohesion online publications

Are All Combinations Equal? Combining Textual and Visual Features with Multiple Space Learning for Text-Based Video Retrieval

Authors: Galanopoulos Damianos; Mezaris Vasileios
Publication date: 21/11/2022

In this paper we tackle the cross-modal video retrieval problem and, more specifically, we focus on text-to-video retrieval. We investigate how to optimally combine multiple diverse textual and visual features into feature pairs that lead to generating multiple joint feature spaces, which encode text-video pairs into comparable representations. To learn these representations our proposed network architecture is trained by following a multiple space learning procedure. Moreover, at the retrieval stage, we introduce additional softmax operations for revising the inferred query-video similarities. Extensive experiments in several setups based on three large-scale datasets (IACC.3, V3C1, and MSR-VTT) lead to conclusions on how to best combine text-visual features and document the performance of the proposed network. Source code is made publicly available at: https://github.com/bmezaris/TextToVideoRetrieval-TtimesV

Comment: Accepted for publication; to be included in Proc. ECCV Workshops 2022. The version posted here is the "submitted manuscript" versio
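The abstract does not spell out how the softmax operations revise the query-video similarities. As a hedged illustration only, the sketch below shows one form such a revision could take: each similarity is reweighted by a softmax taken over all queries for the same video, so a video that scores highly for many queries contributes less to any single one. The function names, the `temperature` parameter, and the toy matrix are assumptions, not the authors' code.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def revise_similarities(sim, temperature=1.0):
    """Reweight sim[q][v] by a softmax over the query axis.

    sim is a list of rows, one row per query, one column per video.
    Assumption: this column-wise reweighting is only one plausible
    reading of "softmax operations for revising similarities".
    """
    n_queries = len(sim)
    n_videos = len(sim[0])
    revised = [[0.0] * n_videos for _ in range(n_queries)]
    for v in range(n_videos):
        column = [sim[q][v] for q in range(n_queries)]
        weights = softmax(column, temperature)
        for q in range(n_queries):
            revised[q][v] = sim[q][v] * weights[q]
    return revised


# Toy similarity matrix: 2 queries (rows) x 2 videos (columns).
sim = [
    [0.9, 0.8],
    [0.1, 0.7],
]
revised = revise_similarities(sim)
for row in revised:
    print([round(value, 3) for value in row])
```

In this toy case the second video scores well for both queries, so its revised scores are damped more evenly, while the first video stays strongly tied to the first query.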

arXiv.org e-Print Archive