Unified Embedding and Metric Learning for Zero-Exemplar Event Detection
Event detection in unconstrained videos is conceived as content-based video
retrieval with two modalities: textual and visual. Given a text describing a
novel event, the goal is to rank related videos accordingly. The task is
zero-exemplar: no video examples of the novel event are given.
Related works train a bank of concept detectors on external data sources.
These detectors predict confidence scores for test videos, which are ranked and
retrieved accordingly. In contrast, we learn a joint space in which the visual
and textual representations are embedded. The space casts a novel event as a
probability distribution over pre-defined events, and it learns to measure the
distance between an event and its related videos.
Our model is trained end-to-end on the publicly available EventNet dataset.
When applied to the TRECVID Multimedia Event Detection dataset, it outperforms
the state of the art by a considerable margin.
Comment: IEEE CVPR 201
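A minimal NumPy sketch of the retrieval step described above: both modalities are projected into a joint space and videos are ranked by cosine similarity to the event text. The projection matrices, dimensions, and function names are illustrative placeholders for the learned embedding networks, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: textual features of the event description and
# visual features of each video. W_text and W_vis stand in for the learned
# embedding networks (random here, trained in the real model).
d_text, d_vis, d_joint = 300, 512, 128
W_text = rng.standard_normal((d_text, d_joint)) * 0.01
W_vis = rng.standard_normal((d_vis, d_joint)) * 0.01

def embed(x, W):
    """Project features into the joint space and L2-normalize."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def rank_videos(event_text, video_feats):
    """Rank videos by cosine similarity to the event description."""
    t = embed(event_text, W_text)   # (d_joint,)
    v = embed(video_feats, W_vis)   # (n_videos, d_joint)
    scores = v @ t                  # cosine similarity per video
    return np.argsort(-scores), scores

event = rng.standard_normal(d_text)
videos = rng.standard_normal((5, d_vis))
order, scores = rank_videos(event, videos)
```

In a trained system the two projections would be optimized with a metric-learning loss so that a novel event's text lands close to its related videos.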
VideoGraph: Recognizing Minutes-Long Human Activities in Videos
Many human activities take minutes to unfold. To represent them, related
works opt for statistical pooling, which neglects the temporal structure.
Others opt for convolutional methods, such as CNNs and Non-Local networks.
While successful in learning temporal concepts, these fall short of modeling
minutes-long temporal dependencies. We propose VideoGraph, a method that
achieves the best of both worlds: it represents minutes-long human activities
and learns their underlying
temporal structure. VideoGraph learns a graph-based representation for human
activities. The graph, its nodes, and its edges are learned entirely from video
datasets, making VideoGraph applicable to problems without node-level
annotation. The result is improvements over related works on the Epic-Kitchens
and Breakfast benchmarks. In addition, we demonstrate that VideoGraph is able
to learn the temporal structure of human activities in minutes-long videos.
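The core idea of a graph-based representation learned without node-level annotation can be illustrated as a soft assignment of per-timestep features to a set of latent nodes. This is only a hedged sketch: the node count, dimensions, and attention form below are assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical latent graph nodes, learned jointly with the rest of the
# network in the real model; random vectors here for illustration.
n_nodes, d = 8, 64
nodes = rng.standard_normal((n_nodes, d))

def node_activations(frame_feats):
    """Soft-assign each timestep's features to the latent nodes.

    frame_feats: (T, d) per-timestep video features.
    Returns (T, n_nodes) attention and (n_nodes, d) node summaries.
    """
    att = softmax(frame_feats @ nodes.T, axis=-1)  # which node each frame expresses
    summary = att.T @ frame_feats                  # per-node aggregation over time
    return att, summary

frames = rng.standard_normal((30, d))
att, summary = node_activations(frames)
```

Tracking how the attention shifts across nodes over time is one way such a model could expose the temporal structure of a long activity.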
Coarse Temporal Attention Network (CTA-Net) for Driver’s Activity Recognition
There has been significant progress in recognizing traditional human activities
from videos, focusing on highly distinctive actions that involve discriminative
body movements, body-object and/or human-human interactions. Driver activities
are different, since they are executed by the same subject with similar
body-part movements, resulting in only subtle changes. To address this, we
propose a novel framework that exploits spatiotemporal attention to model
these subtle changes. Our model is named Coarse Temporal Attention Network
(CTA-Net), in which coarse temporal branches are introduced in a trainable
glimpse network. The goal is to allow the glimpse to capture high-level
temporal relationships, such as 'during', 'before' and 'after' by focusing on a
specific part of a video. These branches also respect the topology of the
temporal dynamics in the video, ensuring that different branches learn
meaningful spatial and temporal changes. The model then uses an innovative
attention mechanism to generate high-level, action-specific contextual
information for activity recognition by exploring the hidden states of an LSTM.
The attention mechanism learns the importance of each hidden state for the
recognition task, weighting the states when constructing the representation of
the video. Our approach is evaluated on four publicly accessible datasets and
outperforms the state of the art by a considerable margin with only RGB video
as input.
Comment: Extended version of the accepted WACV 202
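The described weighting of LSTM hidden states corresponds to a standard soft-attention readout, sketched below in NumPy. The scoring vector and dimensions are illustrative stand-ins for the learned parameters, not CTA-Net's actual design.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Illustrative setup: T hidden states of size d from a recurrent encoder
# (random here, produced by an LSTM in the real model), and one learned
# scoring vector w that assigns an importance to each hidden state.
T, d = 16, 32
hidden = rng.standard_normal((T, d))  # stands in for LSTM hidden states
w = rng.standard_normal(d)            # learned scoring vector (random here)

alpha = softmax(hidden @ w)   # importance of each hidden state, sums to 1
video_repr = alpha @ hidden   # (d,) attention-weighted video representation
```

The weights alpha make explicit which timesteps the recognizer relies on, which is useful when the discriminative evidence is a subtle change in a short part of the video.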
Timeception for Complex Action Recognition
This paper focuses on the temporal aspect of recognizing human activities in
videos, an important visual cue that has long been undervalued. We revisit the
conventional definition of activity and restrict it to Complex Action: a set of
one-actions with a weak temporal pattern that serves a specific purpose.
Related works use spatiotemporal 3D convolutions with a fixed kernel size,
which is too rigid to capture the variety in the temporal extents of complex
actions and too short for long-range temporal modeling. In contrast, we use
multi-scale temporal convolutions, and we reduce the complexity of 3D
convolutions. The
outcome is Timeception convolution layers, which reason about minute-long
temporal patterns, a factor of 8 longer than the best related works. As a result,
Timeception achieves impressive accuracy in recognizing the human activities of
Charades, Breakfast Actions, and MultiTHUMOS. Further, we demonstrate that
Timeception learns long-range temporal dependencies and tolerates the temporal
extents of complex actions.
Comment: IEEE CVPR 2019 (Oral)
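The multi-scale temporal convolution idea can be sketched as parallel depthwise 1-D convolutions over time with different kernel sizes, concatenated along the channel axis. The kernel sizes, shapes, and helper below are illustrative assumptions, not the paper's exact Timeception configuration.

```python
import numpy as np

rng = np.random.default_rng(3)

def temporal_conv(x, kernel):
    """Depthwise 1-D convolution over time with 'same' padding.

    x: (T, C) per-timestep features; kernel: (k, C), one temporal filter
    per channel (depthwise keeps the cost far below a full 3D convolution).
    """
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([(xp[t:t + k] * kernel).sum(axis=0)
                     for t in range(x.shape[0])])

# Multi-scale branch: apply several kernel sizes in parallel and
# concatenate, so one layer covers both short and long temporal extents.
T, C = 20, 16
x = rng.standard_normal((T, C))
branches = [temporal_conv(x, rng.standard_normal((k, C)) * 0.1)
            for k in (3, 5, 7)]
out = np.concatenate(branches, axis=1)  # (T, 3 * C)
```

Stacking such layers grows the temporal receptive field multiplicatively, which is how a model of this kind can reach minute-long patterns without a single rigid kernel size.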