2,327 research outputs found
Cortical Dynamics of 3-D Surface Perception: Binocular and Half-Occluded Scenic Images
Previous models of stereopsis have concentrated on the task of uniquely matching left and right eye primitives binocularly. A disparity smoothness constraint is often invoked to limit the number of possible matches. These approaches neglect the fact that surface discontinuities are abundant in everyday natural scenes and provide a useful cue for scene segmentation. Da Vinci stereopsis refers to the more general problem of dealing with surface discontinuities and their associated unmatched monocular regions within binocular scenes. This study develops a mathematical realization of a neural network theory of biological vision, called FACADE Theory, that shows how early cortical stereopsis processes are related to later cortical processes of 3-D surface representation. The mathematical model demonstrates through computer simulation how the visual cortex may generate 3-D boundary segmentations and use them to control filling-in of 3-D surface properties in response to visual scenes. Model mechanisms correctly match disparate binocular regions while filling in monocular regions at the correct depth within a binocularly viewed scene. This achievement required the introduction of a new multiscale binocular filter for stereo matching, which clarifies how cortical complex cells match image contours of like contrast polarity while pooling signals from opposite contrast polarities. Competitive interactions among filter cells suggest how false binocular matches and unmatched monocular cues, which contain eye-of-origin information, are automatically handled across multiple spatial scales. This network also helps to explain data concerning context-sensitive binocular matching. Pooling of signals from even-symmetric and odd-symmetric simple cells at complex cells helps to eliminate spurious activity peaks in matchable signals.
Later stages of cortical processing by the blob and interblob streams, including refined concepts of cooperative boundary grouping and reciprocal stream interactions between boundary and surface representations, are modeled to provide a complete simulation of the da Vinci stereopsis percept.
Office of Naval Research (N00014-95-I-0409, N00014-85-1-0657, N00014-92-J-4015, N00014-91-J-4100); Air Force Office of Scientific Research (90-0175); National Science Foundation (IRI-90-00530); The James S. McDonnell Foundation (94-40
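The like-polarity matching rule described in the abstract can be illustrated with a toy computation. The sketch below is a hypothetical simplification, not the FACADE model's actual circuitry: it compares two 1-D signed-contrast signals only where their contrast polarities agree, pools the rectified products across both polarities (as the complex-cell description suggests), and selects the disparity with the largest pooled response.

```python
import numpy as np

def complex_cell_match(left, right, disparities):
    """Toy like-polarity stereo matcher (illustrative, not FACADE itself).
    left, right: 1-D arrays of signed contrast; returns pooled response
    per candidate disparity."""
    responses = {}
    for d in disparities:
        shifted = np.roll(right, d)
        # Match only contours of like contrast polarity...
        same_polarity = np.sign(left) == np.sign(shifted)
        # ...then pool rectified magnitude from both polarities.
        responses[d] = float(np.sum(np.abs(left * shifted) * same_polarity))
    return responses

# Hypothetical stimulus: two opposite-polarity edges, right-eye
# image shifted by 2 samples relative to the left-eye image.
left = np.zeros(32)
left[10], left[20] = 1.0, -1.0
right = np.roll(left, 2)
responses = complex_cell_match(left, right, range(-4, 5))
best = max(responses, key=responses.get)  # recovers the true shift
```

The winner-take-all step over `responses` stands in for the competitive interactions among filter cells that the abstract credits with suppressing false matches.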
Naturalistic depth perception
Making inferences about the 3-dimensional spatial structure of natural scenes is a critical visual function. While spatial discrimination both in depth and on the image plane has been well characterized for simple stimuli, little is known about our ability to discriminate depth in natural scenes, particularly at far distances. To begin filling this gap we: (i) developed a database of 80 stereoscopic images paired with the corresponding measured distance information, (ii) used these scenes as psychophysical stimuli and measured near-far discrimination acuity in 4 observers as a function of distance and the visual angle separating the targets, (iii) made additional measurements under patched-eye (monocular) viewing conditions to evaluate the importance of binocular vision in depth discrimination as a function of viewing geometry. We find that binocular thresholds are roughly a constant Weber fraction of the distance for absolute distances ranging from 4 to 28 meters. Further, measured thresholds were around 1% for small separations, and increased to 4% for stimuli separated by 10 deg. Thus, the ability to discriminate depth in natural scenes is very good out to considerable distances. To investigate the basis of this discrimination ability, monocular thresholds were measured. We found that monocular thresholds were elevated for distances less than 15 meters, but were comparable to binocular thresholds for greater distances. Accurate depth perception depends on combining (fusing) multiple sources of sensory information. Thus binocular thresholds probably involve fusing separate monocular and disparity-derived estimates. Under the assumption of Gaussian-distributed independent estimates, Bayes' rule provides a simple reliability-weighted summation model of cue combination. Using disparity threshold measurements by Blakemore (1970) and the current monocular thresholds, parameter-free predictions were generated for the current binocular thresholds.
These predictions were in broad agreement with the data, suggesting that the disparity and monocular cues are separable and combined optimally in natural scenes.
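The reliability-weighted summation model invoked above has a simple closed form: for independent Gaussian estimates, Bayes' rule weights each cue by its inverse variance, and the fused estimate is always at least as reliable as the best single cue. A minimal sketch with hypothetical numbers (the distances and standard deviations below are illustrative, not the paper's data):

```python
import numpy as np

def fuse_estimates(mu_a, sigma_a, mu_b, sigma_b):
    """Inverse-variance (reliability-weighted) fusion of two independent
    Gaussian estimates, per Bayes' rule."""
    w_a = 1.0 / sigma_a**2
    w_b = 1.0 / sigma_b**2
    mu = (w_a * mu_a + w_b * mu_b) / (w_a + w_b)
    sigma = np.sqrt(1.0 / (w_a + w_b))  # fused sd <= min(sigma_a, sigma_b)
    return mu, sigma

# Hypothetical cues: monocular estimate 10.0 m (sd 0.4 m),
# disparity-derived estimate 10.2 m (sd 0.2 m).
mu, sigma = fuse_estimates(10.0, 0.4, 10.2, 0.2)
```

The fused mean lands closer to the more reliable (disparity) cue, which mirrors the paper's finding that binocular thresholds track the better of the two cues at each distance.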
Vision, Action, and Make-Perceive
In this paper, I critically assess the enactive account of visual perception recently defended by Alva Noë (2004). I argue inter alia that the enactive account falsely identifies an object’s apparent shape with its 2D perspectival shape; that it mistakenly assimilates visual shape perception and volumetric object recognition; and that it seriously misrepresents the constitutive role of bodily action in visual awareness. I argue further that noticing an object’s perspectival shape involves a hybrid experience combining both perceptual and imaginative elements – an act of what I call ‘make-perceive’.
Model-Based Environmental Visual Perception for Humanoid Robots
The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching, and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling.
First impressions: A survey on vision-based apparent personality trait analysis
© 2019 IEEE.
Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. Over the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have by far been the most considered cues of information for analyzing personality. Recently, however, there has been increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures, and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that such methods could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field, are reviewed.
Lucid Data Dreaming for Video Object Segmentation
Convolutional networks reach top quality in pixel-level video object segmentation but require a large amount of training data (1k~100k) to deliver such results. We propose a new training strategy which achieves state-of-the-art results across three evaluation datasets while using 20x~1000x less annotated data than competing methods. Our approach is suitable for both single and multiple object segmentation. Instead of using large training sets hoping to generalize across domains, we generate in-domain training data using the provided annotation on the first frame of each video to synthesize ("lucid dream") plausible future video frames. In-domain per-video training data allows us to train high-quality appearance- and motion-based models, as well as tune the post-processing stage. This approach allows us to reach competitive results even when training from only a single annotated frame, without ImageNet pre-training. Our results indicate that using a larger training set is not automatically better, and that for the video object segmentation task a smaller training set that is closer to the target domain is more effective. This changes the mindset regarding how many training samples and how much general "objectness" knowledge are required for the video object segmentation task.
Comment: Accepted in International Journal of Computer Vision (IJCV)