
    Efficient Human Activity Recognition in Large Image and Video Databases

    Vision-based human action recognition has attracted considerable interest in recent research for its applications to video surveillance, content-based search, healthcare, and interactive games. Most existing research deals with building informative feature descriptors, designing efficient and robust algorithms, proposing versatile and challenging datasets, and fusing multiple modalities. Often, these approaches build on certain conventions such as the use of motion cues to determine video descriptors, the application of off-the-shelf classifiers, and single-factor classification of videos. In this thesis, we deal with important but overlooked issues such as efficiency, simplicity, and scalability of human activity recognition in different application scenarios: controlled video environments (e.g., indoor surveillance), unconstrained videos (e.g., YouTube), depth or skeletal data (e.g., captured by Kinect), and person images (e.g., Flickr). In particular, we are interested in answering questions such as: (a) is it possible to efficiently recognize human actions in controlled videos without temporal cues? (b) given that large-scale unconstrained video data are often of a high-dimension, low-sample-size (HDLSS) nature, how can human actions be recognized efficiently in such data? (c) considering the rich 3D motion information available from depth or motion-capture sensors, is it possible to recognize both the actions and the actors using only the motion dynamics of the underlying activities? and (d) can motion information from monocular videos be used to automatically determine saliency regions for recognizing actions in still images?

    Apparent sharpness of 3D video when one eye's view is more blurry.

    When the images presented to each eye differ in sharpness, the fused percept remains relatively sharp. Here, we measure this effect by showing stereoscopic videos that have been blurred for one eye, or for both eyes, and psychophysically determining when they appear equally sharp. For a range of blur magnitudes, the fused percept always appeared significantly sharper than the blurrier view. From these data, we investigate to what extent discarding high spatial frequencies from just one eye's view reduces the bandwidth necessary to transmit perceptually sharp 3D content. We conclude that relatively high-resolution video transmission stands to benefit most from this method.
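
    A minimal sketch of this manipulation, assuming OpenCV and hypothetical file names (not the authors' code): one view of a stereo pair is Gaussian-blurred, a low-pass operation that discards high spatial frequencies, while the other view is left intact.

        # Illustrative sketch: simulate asymmetric blur on a stereo pair.
        import cv2

        def blur_one_view(left_path, right_path, sigma=2.0):
            """Return a stereo pair in which only the right view is
            Gaussian-blurred, mimicking the removal of high spatial
            frequencies from one eye's view."""
            left = cv2.imread(left_path)
            right = cv2.imread(right_path)
            # Kernel size (0, 0) lets OpenCV derive the kernel from sigma.
            right_blurred = cv2.GaussianBlur(right, (0, 0), sigma)
            return left, right_blurred

        if __name__ == "__main__":
            left, right = blur_one_view("left.png", "right.png", sigma=2.0)
            cv2.imwrite("right_blurred.png", right)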

    Neural correlates of sexual cue reactivity in individuals with and without compulsive sexual behaviours

    Although compulsive sexual behaviour (CSB) has been conceptualized as a "behavioural" addiction, and common or overlapping neural circuits may govern the processing of natural and drug rewards, little is known about the responses to sexually explicit materials in individuals with and without CSB. Here, the processing of cues of varying sexual content was assessed in individuals with and without CSB, focusing on neural regions identified in prior studies of drug-cue reactivity. Nineteen CSB subjects and 19 healthy volunteers were assessed using functional MRI comparing sexually explicit videos with non-sexual exciting videos. Ratings of sexual desire and liking were obtained. Relative to healthy volunteers, CSB subjects had greater desire but similar liking scores in response to the sexually explicit videos. Exposure to sexually explicit cues in CSB compared to non-CSB subjects was associated with activation of the dorsal anterior cingulate, ventral striatum, and amygdala. Functional connectivity of the dorsal anterior cingulate-ventral striatum-amygdala network was associated with subjective sexual desire (but not liking) to a greater degree in CSB relative to non-CSB subjects. The dissociation between desire or wanting and liking is consistent with theories of incentive motivation underlying CSB, as in drug addictions. Neural differences in the processing of sexual-cue reactivity were identified in CSB subjects in regions previously implicated in drug-cue reactivity studies. The greater engagement of corticostriatal limbic circuitry in CSB following exposure to sexual cues suggests neural mechanisms underlying CSB and potential biological targets for interventions.

    Spontaneous Subtle Expression Detection and Recognition based on Facial Strain

    Optical strain is an extension of optical flow that is capable of quantifying subtle changes on faces and representing minute facial motion intensities at the pixel level. This is computationally essential for the relatively new field of spontaneous micro-expressions, where subtle expressions can be technically challenging to pinpoint. In this paper, we present a novel method for detecting and recognizing micro-expressions by utilizing facial optical strain magnitudes to construct optical strain features and optical strain weighted features. The two sets of features are then concatenated to form the resultant feature histogram. Experiments were performed on the CASME II and SMIC databases. We demonstrate on both databases the usefulness of optical strain information and, more importantly, that our best approaches outperform the original baseline results on both the detection and recognition tasks. A comparison of the proposed method with other existing spatio-temporal feature extraction approaches is also presented.
    Comment: 21 pages (including references), single-column format, accepted to Signal Processing: Image Communication journal
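
    As a rough illustration of the underlying quantity (a generic sketch based on the standard optical strain definition, not the authors' implementation), the per-pixel strain magnitude can be computed from the spatial gradients of a dense optical flow field:

        # Generic sketch of optical strain magnitude: strain is the
        # symmetric gradient of the optical flow field (u, v).
        import cv2
        import numpy as np

        def optical_strain_magnitude(prev_gray, next_gray):
            """Dense optical flow -> per-pixel optical strain magnitude."""
            flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            u, v = flow[..., 0], flow[..., 1]
            du_dy, du_dx = np.gradient(u)   # np.gradient returns (d/dy, d/dx)
            dv_dy, dv_dx = np.gradient(v)
            exx = du_dx                     # normal strain components
            eyy = dv_dy
            exy = 0.5 * (du_dy + dv_dx)     # symmetric shear component
            # Magnitude: root of the sum of squared tensor entries
            # (the shear term appears twice in the 2x2 strain tensor).
            return np.sqrt(exx**2 + eyy**2 + 2.0 * exy**2)

    Pooling such magnitude maps over facial regions is one plausible route to the strain-based feature histograms described above.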

    Digging Deeper into Egocentric Gaze Prediction

    This paper digs deeper into the factors that influence egocentric gaze. Instead of training deep models for this purpose in a blind manner, we propose to inspect the factors that contribute to gaze guidance during daily tasks. Bottom-up saliency and optical flow are assessed against strong spatial-prior baselines. Task-specific cues such as the vanishing point, manipulation point, and hand regions are analyzed as representatives of top-down information. We also look into the contribution of these factors by investigating a simple recurrent neural model for egocentric gaze prediction. First, deep features are extracted for all input video frames. Then, a gated recurrent unit is employed to integrate information over time and to predict the next fixation. We also propose an integrated model that combines the recurrent model with several top-down and bottom-up cues. Extensive experiments over multiple datasets reveal that (1) spatial biases are strong in egocentric videos, (2) bottom-up saliency models predict gaze poorly and underperform the spatial biases, (3) deep features perform better than traditional features, (4) in contrast to hand regions, the manipulation point is a strongly influential cue for gaze prediction, (5) combining the proposed recurrent model with bottom-up cues, vanishing points, and in particular the manipulation point yields the best gaze-prediction accuracy over egocentric videos, (6) knowledge transfer works best for cases where the tasks or sequences are similar, and (7) task and activity recognition can benefit from gaze prediction. Our findings suggest that (1) there should be more emphasis on hand-object interaction and (2) the egocentric-vision community should consider larger datasets including diverse stimuli and more subjects.
    Comment: presented at WACV 2019
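
    A hedged PyTorch sketch of the recurrent idea described above (per-frame deep features integrated by a GRU to predict the next fixation); the feature and hidden dimensions are hypothetical, not the authors' configuration:

        # Illustrative sketch: deep frame features -> GRU -> next fixation.
        import torch
        import torch.nn as nn

        class GazeGRU(nn.Module):
            def __init__(self, feat_dim=512, hidden_dim=256):
                super().__init__()
                self.gru = nn.GRU(feat_dim, hidden_dim, batch_first=True)
                self.head = nn.Linear(hidden_dim, 2)  # (x, y) fixation

            def forward(self, frame_feats):
                # frame_feats: (batch, time, feat_dim) per-frame features
                hidden_seq, _ = self.gru(frame_feats)
                # Predict the next fixation from the final temporal state.
                return self.head(hidden_seq[:, -1])

        # Usage: features for 16 frames of 8 clips -> 8 predicted fixations.
        feats = torch.randn(8, 16, 512)
        print(GazeGRU()(feats).shape)  # torch.Size([8, 2])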

    Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements

    Emotion evoked by an advertisement plays a key role in influencing brand recall and eventual consumer choices, so automatic ad affect recognition has several useful applications. However, content-based feature representations do not give insight into how affect is modulated by aspects such as the ad's scene setting, salient object attributes, and their interactions. Nor do such approaches inform us how humans prioritize visual information for ad understanding. Our work addresses these lacunae by decomposing video content into detected objects, coarse scene structure, object statistics, and actively attended objects identified via eye gaze. We measure the importance of each of these information channels by systematically incorporating the related information into ad affect prediction models. Contrary to the popular notion that ad affect hinges on the narrative and the clever use of linguistic and social cues, we find that actively attended objects and the coarse scene structure encode affective information better than individual scene objects or conspicuous background elements.
    Comment: Accepted for publication in the Proceedings of the 20th ACM International Conference on Multimodal Interaction, Boulder, CO, USA
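
    The channel-importance methodology lends itself to a simple ablation loop. Below is a toy sketch with scikit-learn, using randomly generated stand-ins for hypothetical feature channels (not the authors' data or pipeline): each channel's value is gauged by how well a classifier trained on it predicts affect labels.

        # Toy sketch: compare information channels by per-channel CV accuracy.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        n_ads = 200
        # Hypothetical per-ad feature channels and binary valence labels.
        channels = {
            "scene_structure": rng.normal(size=(n_ads, 32)),
            "object_stats": rng.normal(size=(n_ads, 16)),
            "attended_objects": rng.normal(size=(n_ads, 32)),
        }
        labels = rng.integers(0, 2, size=n_ads)

        for name, feats in channels.items():
            score = cross_val_score(LogisticRegression(max_iter=1000),
                                    feats, labels, cv=5).mean()
            print(f"{name}: CV accuracy = {score:.3f}")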