Describing Videos by Exploiting Temporal Structure
Recent progress in using recurrent neural networks (RNNs) for image
description has motivated the exploration of their application for video
description. However, while images are static, working with videos requires
modeling their dynamic temporal structure and then properly integrating that
information into a natural language description. In this context, we propose an
approach that successfully takes into account both the local and global
temporal structure of videos to produce descriptions. First, our approach
incorporates a spatio-temporal 3-D convolutional neural network (3-D CNN)
representation of the short temporal dynamics. The 3-D CNN representation is
trained on video action recognition tasks, so as to produce a representation
that is tuned to human motion and behavior. Second, we propose a temporal
attention mechanism that goes beyond local temporal modeling and learns to
automatically select the most relevant temporal segments given the
text-generating RNN. Our approach exceeds the current state of the art on both
BLEU and METEOR metrics on the Youtube2Text dataset. We also present results on
a new, larger and more challenging dataset of paired video and natural language
descriptions.
Comment: Accepted to ICCV15. This version comes with code release and supplementary material.
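The temporal attention idea above can be sketched as a softmax-weighted sum over per-segment video features, conditioned on the caption decoder's state. This is a minimal illustrative sketch: the scoring matrix `W` and the single-matrix parameterization are simplifications assumed here, not the paper's exact formulation.

```python
import numpy as np

def temporal_attention(features, decoder_state, W):
    """Soft temporal attention: weight per-segment video features by their
    relevance to the current state of the text-generating RNN.

    features: (T, D) array of per-segment features (e.g. 3-D CNN outputs)
    decoder_state: (H,) hidden state of the caption decoder
    W: (D, H) scoring matrix -- a hypothetical, simplified parameterization
    """
    scores = features @ W @ decoder_state          # (T,) relevance scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                       # softmax over time steps
    context = weights @ features                   # (D,) attended summary
    return context, weights
```

At each decoding step the context vector replaces a fixed average over frames, letting the model focus on different temporal segments for different words.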
Dynamical Properties of Interaction Data
Network dynamics are typically presented as a time series of network
properties captured at each period. The current approach examines the dynamical
properties of transmission via novel measures on an integrated, temporally
extended network representation of interaction data across time. Because it
encodes time and interactions as network connections, static network measures
can be applied to this "temporal web" to reveal features of the dynamics
themselves. Here we provide the technical details and apply it to agent-based
implementations of the well-known SEIR and SEIS epidemiological models.
Comment: 29 pages, 15 figures
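The "temporal web" construction described above can be sketched as follows: time-stamped contacts are unrolled into a static graph whose nodes are (agent, time) pairs, so ordinary network measures apply to the dynamics. The specific edge convention here (persistence edges plus symmetric cross-time contact edges) is one illustrative encoding assumed for the sketch, not necessarily the paper's exact one.

```python
def temporal_web(interactions, T):
    """Build the edge set of a time-extended ("temporal web") graph from
    time-stamped contacts. Node (agent, t) persists to (agent, t+1); a
    contact (t, a, b) adds cross-edges (a, t)->(b, t+1) and (b, t)->(a, t+1).

    interactions: iterable of (t, a, b) contact events, with 0 <= t < T
    """
    agents = set()
    for _, a, b in interactions:
        agents.update((a, b))
    edges = set()
    for a in agents:                       # persistence edges through time
        for t in range(T - 1):
            edges.add(((a, t), (a, t + 1)))
    for t, a, b in interactions:           # transmission-capable edges
        if t < T - 1:
            edges.add(((a, t), (b, t + 1)))
            edges.add(((b, t), (a, t + 1)))
    return edges
```

Because the result is an ordinary edge set, static measures such as reachability or path counts on this graph directly reflect which infections are temporally possible.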
SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition
Due to the availability of large-scale skeleton datasets, 3D human action recognition has recently attracted the attention of the computer vision community. Many works have focused on encoding skeleton data as skeleton image representations based on the spatial structure of the skeleton joints, in which the temporal dynamics of the sequence are encoded as variations in columns and the spatial structure of each frame is represented as rows of a matrix. To further improve such representations, we introduce a novel skeleton image representation to be used as input to Convolutional Neural Networks (CNNs), named SkeleMotion. The proposed approach encodes the temporal dynamics by explicitly computing the magnitude and orientation values of the skeleton joints. Different temporal scales are employed to compute motion values, aggregating more temporal dynamics into the representation so that it can capture long-range joint interactions involved in actions while filtering noisy motion values. Experimental results demonstrate the effectiveness of the proposed representation on 3D action recognition, outperforming the state of the art on the NTU RGB+D 120 dataset.
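The magnitude-and-orientation encoding described above can be sketched as computing joint displacements over several temporal scales. This is an illustrative simplification assumed for the sketch: it uses 2-D coordinates and a plain displacement, whereas the actual SkeleMotion pipeline works on 3-D skeletons and adds further processing.

```python
import numpy as np

def skelemotion_maps(joints, scales=(1, 2, 4)):
    """Encode skeleton motion as magnitude/orientation images, one pair
    per temporal scale.

    joints: (T, J, 2) array of 2-D joint coordinates over T frames
    Returns {scale: (magnitude, orientation)} with maps shaped (J, T - scale):
    joints as rows, time as columns, matching skeleton-image conventions.
    """
    maps = {}
    for s in scales:
        disp = joints[s:] - joints[:-s]                  # motion over s frames
        mag = np.linalg.norm(disp, axis=-1).T            # (J, T-s)
        ori = np.arctan2(disp[..., 1], disp[..., 0]).T   # (J, T-s), radians
        maps[s] = (mag, ori)
    return maps
```

Stacking the per-scale maps as image channels yields a fixed-layout input that a standard 2-D CNN can consume.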
Tracking uncertainty in a spatially explicit susceptible-infected epidemic model
In this paper we conceive an interval-valued continuous cellular automaton for describing the spatio-temporal dynamics of an epidemic, in which the magnitude of the initial outbreak and/or the epidemic properties are only imprecisely known. In contrast to well-established approaches that rely on probability distributions for keeping track of the uncertainty in spatio-temporal models, we resort to an interval representation of uncertainty. Such an approach lowers the amount of computing power needed to run model simulations, and reduces the need for the data that are indispensable for constructing the probability distributions upon which other paradigms are based.
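The interval representation of uncertainty can be sketched with a single susceptible-infected update in which the state and the transmission rate are intervals rather than point values. This is a minimal sketch of interval propagation under assumed dynamics, not the paper's full cellular automaton.

```python
def interval_si_step(S, I, beta, dt=1.0):
    """One forward-Euler step of a susceptible-infected update where
    S, I and the transmission rate beta are intervals (lo, hi) rather
    than point values.
    """
    # An interval product's bounds come from the endpoint combinations.
    prods = [b * s * i for b in beta for s in S for i in I]
    dI = (min(prods) * dt, max(prods) * dt)    # new infections this step
    S_new = (S[0] - dI[1], S[1] - dI[0])       # subtraction swaps the bounds
    I_new = (I[0] + dI[0], I[1] + dI[1])
    return S_new, I_new
```

Each step costs a handful of multiplications and comparisons, which is the computational advantage over propagating full probability distributions; the price is that interval widths can only grow or stay constant over time.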
Discrimination of moderate and acute drowsiness based on spontaneous facial expressions
It is important for drowsiness detection systems to identify different levels of drowsiness and respond appropriately at each level. This study explores how to discriminate moderate from acute drowsiness by applying computer vision techniques to the human face. In our previous study, spontaneous facial expressions measured through computer vision techniques were used as an indicator to discriminate alert from acutely drowsy episodes. In this study we explore which facial muscle movements are predictive of moderate and acute drowsiness. The effect of the temporal dynamics of action units on prediction performance is explored by capturing those dynamics with an overcomplete representation of temporal Gabor filters. In the final system we perform feature selection to build a classifier that can discriminate moderately drowsy from acutely drowsy episodes. The system achieves a classification rate of 0.96 A' in discriminating moderately drowsy versus acutely drowsy episodes. Moreover, the study reveals new information about facial behavior occurring during different stages of drowsiness.
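The overcomplete temporal Gabor representation mentioned above can be sketched as filtering an action-unit intensity time series with a bank of quadrature Gabor pairs and taking the response energy. The specific frequencies, bandwidths, and kernel length below are hypothetical values chosen for illustration.

```python
import numpy as np

def gabor_kernel_1d(freq, sigma, length=31):
    """Even/odd 1-D Gabor pair: a Gaussian envelope times cos/sin carriers."""
    t = np.arange(length) - length // 2
    env = np.exp(-t**2 / (2.0 * sigma**2))
    return env * np.cos(2 * np.pi * freq * t), env * np.sin(2 * np.pi * freq * t)

def gabor_energy(signal, freqs=(0.05, 0.1, 0.2), sigmas=(2.0, 4.0, 8.0)):
    """Overcomplete temporal Gabor representation of an action-unit
    intensity time series: response energy at every (frequency, bandwidth)
    pair. Filter parameters are hypothetical, for illustration only.
    """
    feats = []
    for f in freqs:
        for s in sigmas:
            even, odd = gabor_kernel_1d(f, s)
            re = np.convolve(signal, even, mode="same")
            im = np.convolve(signal, odd, mode="same")
            feats.append(np.sqrt(re**2 + im**2))   # quadrature energy
    return np.stack(feats)                          # (freqs * sigmas, T)
```

Because every (frequency, bandwidth) pair is kept, the representation is deliberately redundant; a downstream feature-selection step then picks out the channels most predictive of the drowsiness level.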