Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification
Recently, substantial research effort has focused on how to apply CNNs or
RNNs to better extract temporal patterns from videos, so as to improve the
accuracy of video classification. In this paper, however, we show that temporal
information, especially longer-term patterns, may not be necessary to achieve
competitive results on common video classification datasets. We investigate the
potential of a purely attention based local feature integration. Accounting for
the characteristics of such features in video classification, we propose a
local feature integration framework based on attention clusters, and introduce
a shifting operation to capture more diverse signals. We carefully analyze and
compare the effect of different attention mechanisms, cluster sizes, and the
use of the shifting operation, and also investigate the combination of
attention clusters for multimodal integration. We demonstrate the effectiveness
of our framework on three real-world video classification datasets. Our model
achieves competitive results across all of these. In particular, on the
large-scale Kinetics dataset, our framework obtains an excellent single-model
top-1 accuracy of 79.4% and top-5 accuracy of 94.0% on the validation set. The
attention clusters are the backbone of our winning solution at the ActivityNet
Kinetics Challenge 2017. Code and models will be released soon.
Comment: The backbone of the winning solution at ActivityNet Kinetics Challenge 2017
Exploring Cognitive States: Methods for Detecting Physiological Temporal Fingerprints
Cognitive state detection and its relationship to observable physiological telemetry have been utilized for many human-machine and human-cybernetic applications. This paper aims to determine whether there are unique psychophysiological patterns over time, a physiological temporal fingerprint, that are associated with specific cognitive states. This preliminary work involves commercial airline pilots completing experimental benchmark task inductions of three cognitive states: 1) Channelized Attention (CA); 2) High Workload (HW); and 3) Low Workload (LW). We approach this objective by modeling these "fingerprints" with Hidden Markov Models and entropy analysis to evaluate whether the transitions over time are complex or rhythmic/predictable by nature. Our results indicate that cognitive states do have unique complexity of physiological sequences that is statistically different from that of other cognitive states. More specifically, CA has a significantly higher temporal psychophysiological complexity than HW and LW in EEG and ECG telemetry signals. With regard to respiration telemetry, CA has a lower temporal psychophysiological complexity than HW and LW. This preliminary work can inform whether these underlying dynamics can be used to understand how humans transition between cognitive states and to improve the detection of cognitive states.
Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements
Emotion evoked by an advertisement plays a key role in influencing brand
recall and eventual consumer choices. Automatic ad affect recognition has
several useful applications. However, the use of content-based feature
representations does not give insights into how affect is modulated by aspects
such as the ad scene setting, salient object attributes and their interactions.
Neither do such approaches inform us on how humans prioritize visual
information for ad understanding. Our work addresses these lacunae by
decomposing video content into detected objects, coarse scene structure, object
statistics and actively attended objects identified via eye-gaze. We measure
the importance of each of these information channels by systematically
incorporating related information into ad affect prediction models. Contrary to
the popular notion that ad affect hinges on the narrative and the clever use of
linguistic and social cues, we find that actively attended objects and the
coarse scene structure better encode affective information as compared to
individual scene objects or conspicuous background elements.
Comment: Accepted for publication in the Proceedings of 20th ACM International
Conference on Multimodal Interaction, Boulder, CO, US
Optimal set of EEG features for emotional state classification and trajectory visualization in Parkinson's disease
In addition to classic motor signs and symptoms, individuals with Parkinson's disease (PD) are characterized by emotional deficits. Ongoing brain activity can be recorded by electroencephalograph (EEG) to discover the links between emotional states and brain activity. This study utilized machine-learning algorithms to categorize emotional states in PD patients compared with healthy controls (HC) using EEG. Twenty non-demented PD patients and 20 healthy age-, gender-, and education level-matched controls viewed happiness, sadness, fear, anger, surprise, and disgust emotional stimuli while fourteen-channel EEG was being recorded. Multimodal stimuli (a combination of audio and visual) were used to evoke the emotions. To classify the EEG-based emotional states and visualize the changes of emotional states over time, this paper compares four kinds of EEG features for emotional state classification and proposes an approach to track the trajectory of emotion changes with manifold learning. From the experimental results using our EEG data set, we found that (a) the bispectrum feature is superior to the other three kinds of features, namely power spectrum, wavelet packet, and nonlinear dynamical analysis; (b) higher frequency bands (alpha, beta, and gamma) play a more important role in emotion activities than lower frequency bands (delta and theta) in both groups; and (c) the trajectory of emotion changes can be visualized by reducing subject-independent features with manifold learning. This provides a promising way of visualizing a patient's emotional state in real time and leads to a practical system for noninvasive assessment of the emotional impairments associated with neurological disorders.