Unveiling the multimedia unconscious: implicit cognitive processes and multimedia content analysis
One of the main findings of cognitive science is that automatic processes of which we are unaware shape, to a significant extent, our perception of the environment. The phenomenon applies not only to the real world, but also to the multimedia data we consume every day. Whenever we look at pictures, watch a video or listen to audio recordings, our conscious attention focuses on the observable content, but our cognition spontaneously perceives intentions, beliefs, values, attitudes and other constructs that, while outside our conscious awareness, still shape our reactions and behavior. So far, multimedia technologies have largely neglected this phenomenon. This paper argues that taking such cognitive effects into account is possible and can improve multimedia approaches. As a supporting proof of concept, the paper shows not only that there are visual patterns correlated with the personality traits of 300 Flickr users to a statistically significant extent, but also that the personality traits of those users (both self-assessed and attributed by others) can be inferred from the images they mark as "favourite".
Predictive Coding for Dynamic Visual Processing: Development of Functional Hierarchy in a Multiple Spatio-Temporal Scales RNN Model
The current paper proposes a novel predictive coding type neural network
model, the predictive multiple spatio-temporal scales recurrent neural network
(P-MSTRNN). The P-MSTRNN learns to predict visually perceived human whole-body
cyclic movement patterns by exploiting multiscale spatio-temporal constraints
imposed on network dynamics by using differently sized receptive fields as well
as different time constant values for each layer. After learning, the network
becomes able to proactively imitate target movement patterns by inferring or
recognizing corresponding intentions by means of the regression of prediction
error. Results show that the network can develop a functional hierarchy by
developing a different type of dynamic structure at each layer. The paper
examines how model performance during pattern generation as well as predictive
imitation varies depending on the stage of learning. The number of limit cycle
attractors corresponding to target movement patterns increases as learning
proceeds. Moreover, transient dynamics developing early in the learning process
successfully perform pattern generation and predictive imitation tasks. The
paper concludes that exploitation of transient dynamics facilitates successful
task performance during early learning periods.
Comment: Accepted in Neural Computation (MIT Press)
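The role of per-layer time constants mentioned in this abstract can be illustrated with a minimal sketch. It assumes a plain leaky-integrator update, in which a unit's internal state moves toward its input at a rate of 1/tau. It is not the authors' P-MSTRNN (which also uses receptive fields of different sizes and learned weights), and the function names, the square-wave input and the two tau values are all hypothetical choices made for illustration.

```python
import math

def leaky_step(u_prev, drive, tau):
    """One leaky-integrator update: a larger tau makes the state change more slowly."""
    return [(1.0 - 1.0 / tau) * u + (1.0 / tau) * d for u, d in zip(u_prev, drive)]

def run_layer(inputs, tau):
    """Feed a scalar sequence through a single unit with time constant tau."""
    u = [0.0]
    outputs = []
    for x in inputs:
        u = leaky_step(u, [x], tau)
        outputs.append(math.tanh(u[0]))
    return outputs

def swing(ys):
    """Peak-to-peak range of an output sequence."""
    return max(ys) - min(ys)

# A square wave that flips sign every 5 steps stands in for a cyclic movement.
signal = [1.0 if (t // 5) % 2 == 0 else -1.0 for t in range(40)]

fast = run_layer(signal, tau=2.0)   # hypothetical small time constant (fast layer)
slow = run_layer(signal, tau=50.0)  # hypothetical large time constant (slow layer)

print("fast-layer swing:", swing(fast))
print("slow-layer swing:", swing(slow))
```

With the same input, the unit with the small time constant follows each flip of the signal, while the unit with the large time constant barely moves. This is the sense in which different time constants can push layers toward different temporal roles.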
Anticipating Daily Intention using On-Wrist Motion Triggered Sensing
Anticipating human intention by observing one's actions has many
applications. For instance, picking up a cellphone, then a charger (actions)
implies that one wants to charge the cellphone (intention). By anticipating the
intention, an intelligent system can guide the user to the closest power
outlet. We propose an on-wrist motion triggered sensing system for anticipating
daily intentions, where the on-wrist sensors help us to persistently observe
one's actions. The core of the system is a novel Recurrent Neural Network (RNN)
and Policy Network (PN), where the RNN encodes visual and motion observation to
anticipate intention, and the PN parsimoniously triggers the process of visual
observation to reduce the computational requirement. We jointly train the whole
network using policy gradient and cross-entropy loss. To evaluate, we collect
the first daily "intention" dataset consisting of 2379 videos with 34
intentions and 164 unique action sequences. Our method achieves 92.68%, 90.85%,
97.56% accuracy on three users while processing only 29% of the visual
observation on average.
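The idea of a policy that triggers visual processing only when needed can be sketched as a toy inference loop. This is an illustration under stated assumptions, not the authors' RNN and Policy Network: the gate here is a fixed, deterministic function of motion magnitude, the recurrent update is a single scalar, and no training (policy gradient or cross-entropy) is shown. All names, weights and input values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def run_episode(motion, visual, w_policy=4.0, b_policy=-2.0, threshold=0.5):
    """Process a sequence, reading the visual input only when the gate fires.

    Returns the final hidden state and the fraction of steps that used vision.
    """
    h = 0.0
    used = 0
    for m, v in zip(motion, visual):
        # Hypothetical gate: strong wrist motion makes a visual read more likely.
        p_trigger = sigmoid(w_policy * abs(m) + b_policy)
        if p_trigger > threshold:
            obs = m + v  # motion and visual evidence combined
            used += 1
        else:
            obs = m      # motion only; the costly visual step is skipped
        h = math.tanh(0.5 * h + 0.5 * obs)  # toy recurrent update
    return h, used / len(motion)

motion = [0.1, 0.9, 0.2, 0.8, 0.0]   # made-up motion magnitudes
visual = [1.0, 1.0, 1.0, 1.0, 1.0]   # made-up visual features
state, visual_fraction = run_episode(motion, visual)
print("fraction of steps using vision:", visual_fraction)
```

In this toy run, vision is read on two of the five steps. In a learned system the gate would be trained together with the recurrent model, so that skipping visual input costs as little accuracy as possible.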