Long-term learning behavior in a recurrent neural network for sound recognition
In this paper, the long-term learning properties of an artificial neural network model, designed for sound recognition and computational auditory scene analysis in general, are investigated. The model is designed to run for long periods of time (weeks to months) on low-cost hardware, used in a noise monitoring network, and builds upon previous work by the same authors. It consists of three neural layers, connected to each other by feedforward and feedback excitatory connections. It is shown that the different mechanisms that drive auditory attention emerge naturally from the way in which neural activation and intra-layer inhibitory connections are implemented in the model. Training of the artificial neural network follows the Hebb principle, which dictates that "cells that fire together, wire together", with some important modifications compared to standard Hebbian learning. Because the model is designed to be online for extended periods of time, the learning mechanisms also need to be adapted to this. The learning needs to be strongly attention- and saliency-driven, in order not to waste available memory space on sounds that are of no interest to the human listener. The model also implements plasticity, in order to deal with new or changing input over time without catastrophically forgetting what it has already learned. On top of that, it is shown that the implementation of short-term memory also plays an important role in the long-term learning properties of the model. The above properties are investigated and demonstrated by training on real urban sound recordings.
Computing driver tiredness and fatigue in automobile via eye tracking and body movements
The aim of this paper is to classify driver tiredness and fatigue in automobiles via eye tracking and body movements using a deep-learning-based Convolutional Neural Network (CNN) algorithm. Vehicle driver face localization is widely used in real-world applications such as toll control, traffic accident scene analysis, and suspected vehicle tracking. The research proposes a CNN classifier that simultaneously localizes the region of the human face and the position of the eyes. Rather than bounding rectangles, the classifier outputs bounding quadrilaterals, which give a more precise indication of the driver's face location. The adjusted regions are preprocessed to remove noise and passed to the CNN classifier for real-time processing. The preprocessing of the face features extracts connected components, filters them by size, and groups them into facial expressions. The employed CNN is a well-known technology for human face recognition. Once the facial landmarks have been extracted from the frames, we leverage classification models and deep-learning-based convolutional neural networks that predict the state of the driver as 'Alert' or 'Drowsy' for each extracted frame. The CNN model could predict the output state labels (Alert/Drowsy) for each frame, but we also wanted to account for sequences of image frames, as this is extremely important when predicting the state of an individual. The process completes when all regions have a sufficiently high score or a fixed number of retries is exhausted. The output consists of the detected human face type and the list of regions, including the extracted mouth and eyes, with recognition reliability; the CNN reaches an accuracy of 98.57% with 100 epochs of training and testing.
Attention Allocation Aid for Visual Search
This paper outlines the development and testing of a novel, feedback-enabled
attention allocation aid (AAAD), which uses real-time physiological data to
improve human performance in a realistic sequential visual search task. Indeed,
by optimizing over search duration, the aid improves efficiency, while
preserving decision accuracy, as the operator identifies and classifies targets
within simulated aerial imagery. Specifically, using experimental eye-tracking
data and measurements about target detectability across the human visual field,
we develop functional models of detection accuracy as a function of search
time, number of eye movements, scan path, and image clutter. These models are
then used by the AAAD in conjunction with real-time eye position data to make
probabilistic estimations of attained search accuracy and to recommend that the
observer either move on to the next image or continue exploring the present
image. An experimental evaluation in a scenario motivated from human
supervisory control in surveillance missions confirms the benefits of the AAAD.
Comment: To be presented at the ACM CHI conference in Denver, Colorado in May
201
Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements
Emotion evoked by an advertisement plays a key role in influencing brand
recall and eventual consumer choices. Automatic ad affect recognition has
several useful applications. However, the use of content-based feature
representations does not give insights into how affect is modulated by aspects
such as the ad scene setting, salient object attributes and their interactions.
Neither do such approaches inform us on how humans prioritize visual
information for ad understanding. Our work addresses these lacunae by
decomposing video content into detected objects, coarse scene structure, object
statistics and actively attended objects identified via eye-gaze. We measure
the importance of each of these information channels by systematically
incorporating related information into ad affect prediction models. Contrary to
the popular notion that ad affect hinges on the narrative and the clever use of
linguistic and social cues, we find that actively attended objects and the
coarse scene structure better encode affective information as compared to
individual scene objects or conspicuous background elements.
Comment: Accepted for publication in the Proceedings of 20th ACM International
Conference on Multimodal Interaction, Boulder, CO, US
Machine Analysis of Facial Expressions
No abstract