
4,567 research outputs found

Finding Temporally Consistent Occlusion Boundaries in Videos using Geometric Context

Author: Anderson David
Essa Irfan
Grundmann Matthias
Humayun Ahmad
Raza S. Hussain
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/10/2015

We present an algorithm for finding temporally consistent occlusion boundaries in videos to support segmentation of dynamic scenes. We learn occlusion boundaries in a pairwise Markov random field (MRF) framework. We first estimate the probability of a spatio-temporal edge being an occlusion boundary by using appearance, flow, and geometric features. Next, we enforce occlusion boundary continuity in an MRF model by learning pairwise occlusion probabilities using a random forest. Then, we temporally smooth boundaries to remove temporal inconsistencies in occlusion boundary estimation. Our proposed framework provides an efficient approach for finding temporally consistent occlusion boundaries in video by utilizing causality, redundancy in videos, and semantic layout of the scene. We have developed a dataset with fully annotated ground-truth occlusion boundaries of over 30 videos (~5000 frames). This dataset is used to evaluate temporal occlusion boundaries and provides a much-needed baseline for future studies. We perform experiments to demonstrate the role of scene layout and temporal information for occlusion reasoning in dynamic scenes. Comment: Applications of Computer Vision (WACV), 2015 IEEE Winter Conference o
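
The temporal smoothing step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which smoother the authors use; the causal exponential filter below, the function name `smooth_causal`, and the parameter `alpha` are all assumptions chosen only to show how per-frame boundary probabilities might be stabilised using past frames alone.

```python
def smooth_causal(probs, alpha=0.5):
    """Exponentially smooth per-frame probabilities using past frames only.

    probs: occlusion-boundary probabilities for one edge, one per frame.
    alpha: weight of the current frame (hypothetical parameter, not from the paper).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for p in probs:
        if smoothed:
            # Blend the current estimate with the previous smoothed value.
            p = alpha * p + (1.0 - alpha) * smoothed[-1]
        smoothed.append(p)
    return smoothed
```

A single-frame dropout such as `[1.0, 0.0, 1.0]` is damped rather than passed through, which is the kind of temporal inconsistency the abstract describes removing.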

arXiv.org e-Print Archive

Crossref

Depth map compression via 3D region-based representation

Author: Maceira Marc
Morros Rubió Josep Ramon
Ruiz Hidalgo Javier
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

In 3D video, view synthesis is used to create new virtual views between encoded camera views. Errors in the coding of the depth maps introduce geometry inconsistencies in synthesized views. In this paper, a new 3D plane representation of the scene is presented which improves the performance of current standard video codecs in the view synthesis domain. Two image segmentation algorithms are proposed for generating a color and depth segmentation. Using both partitions, depth maps are segmented into regions free of sharp discontinuities, without having to explicitly signal all depth edges. The resulting regions are represented using a planar model in the 3D world scene. This 3D representation allows an efficient encoding while preserving the 3D characteristics of the scene. The 3D planes open up the possibility to code multiview images with a unique representation. Postprint (author's final draft)
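
As a rough illustration of representing a region with a planar model, the sketch below fits z = a*x + b*y + c to the samples of one region by ordinary least squares. The abstract does not give the authors' plane parameterization or fitting method, so the function names `det3` and `fit_plane`, and the choice of least squares solved by Cramer's rule, are assumptions.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) samples.

    Returns the coefficients (a, b, c).
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    # Normal equations of the least-squares problem.
    lhs = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    rhs = [sxz, syz, sz]
    d = det3(lhs)
    if abs(d) < 1e-12:
        raise ValueError("degenerate point configuration (e.g. collinear samples)")
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side.
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)
```

Under this sketch a region is described by three numbers instead of one depth value per pixel, which is the intuition behind the compact encoding the abstract mentions.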

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Multi-Scale 3D Scene Flow from Binocular Stereo Sequences

Author: Li Rui
Sclaroff Stan
Publication venue: Boston University Computer Science Department
Publication date: 01/01/2007

Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization – two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach. National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108)
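
To make the relation between disparity, optical flow, and 3D scene flow concrete, here is a minimal geometric sketch for a single pixel. It assumes a rectified stereo pair with known focal length `f` (pixels), baseline, and principal point `(cx, cy)`, and it ignores the probabilistic fusion and regularization the abstract describes; the function names are hypothetical.

```python
def backproject(u, v, disparity, f, baseline, cx=0.0, cy=0.0):
    """Back-project pixel (u, v) with the given disparity to a 3D point (X, Y, Z)."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = f * baseline / disparity
    return ((u - cx) * z / f, (v - cy) * z / f, z)


def scene_flow(u, v, d0, flow, d1, f, baseline, cx=0.0, cy=0.0):
    """3D motion of one point between two frames.

    d0: disparity at (u, v) in the first frame.
    flow: optical flow (du, dv) of the pixel in the reference camera.
    d1: disparity at the displaced pixel in the second frame.
    """
    du, dv = flow
    p0 = backproject(u, v, d0, f, baseline, cx, cy)
    p1 = backproject(u + du, v + dv, d1, f, baseline, cx, cy)
    # Scene flow is the difference of the two reconstructed 3D positions.
    return tuple(b - a for a, b in zip(p0, p1))
```

For example, a pixel that moves 10 pixels sideways with unchanged disparity yields a purely lateral 3D displacement, while a change in disparity alone yields motion along the depth axis.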

CiteSeerX

Boston University Institutional Repository (OpenBU)

Texture Segregation By Visual Cortex: Perceptual Grouping, Attention, and Learning

Author: Bhatt Rushi
Carpenter Gail A.
Grossberg Stephen
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/01/2006

A neural model is proposed of how laminar interactions in the visual cortex may learn and recognize object texture and form boundaries. The model brings together five interacting processes: region-based texture classification, contour-based boundary grouping, surface filling-in, spatial attention, and object attention. The model shows how form boundaries can determine regions in which surface filling-in occurs; how surface filling-in interacts with spatial attention to generate a form-fitting distribution of spatial attention, or attentional shroud; how the strongest shroud can inhibit weaker shrouds; and how the winning shroud regulates learning of texture categories, and thus the allocation of object attention. The model can discriminate abutted textures with blurred boundaries and is sensitive to texture boundary attributes like discontinuities in orientation and texture flow curvature as well as to relative orientations of texture elements. The model quantitatively fits a large set of human psychophysical data on orientation-based textures. Object boundary output of the model is compared to computer vision algorithms using a set of human-segmented photographic images. The model classifies textures and suppresses noise using a multiple-scale oriented filterbank and a distributed Adaptive Resonance Theory (dART) classifier. The matched signal between the bottom-up texture inputs and top-down learned texture categories is utilized by oriented competitive and cooperative grouping processes to generate texture boundaries that control surface filling-in and spatial attention. Top-down modulatory attentional feedback from boundary and surface representations to early filtering stages results in enhanced texture boundaries and more efficient learning of texture within attended surface regions. Surface-based attention also provides a self-supervising training signal for learning new textures.
The importance of surface-based attentional feedback in texture learning and classification is tested using a set of textured images from the Brodatz micro-texture album. Benchmark results range from 95.1% to 98.6% with attention, and from 90.6% to 93.2% without attention. Air Force Office of Scientific Research (F49620-01-1-0397, F49620-01-1-0423); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)

CiteSeerX

Elsevier - Publisher Connector

Crossref

Boston University Institutional Repository (OpenBU)

A survey of visual preprocessing and shape representation techniques

Author: Olshausen Bruno A.

Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).

NASA Technical Reports Server

A spatially distributed model for foreground segmentation

Author: Appiah Kofi
Dickinson Patrick
Hunter Andrew
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

Foreground segmentation is a fundamental first processing stage for vision systems which monitor real-world activity. In this paper we consider the problem of achieving robust segmentation in scenes where the appearance of the background varies unpredictably over time. Variations may be caused by processes such as moving water, or foliage moved by wind, and typically degrade the performance of standard per-pixel background models. Our proposed approach addresses this problem by modeling homogeneous regions of scene pixels as an adaptive mixture of Gaussians in color and space. Model components are used to represent both the scene background and moving foreground objects. Newly observed pixel values are probabilistically classified, such that the spatial variance of the model components supports correct classification even when the background appearance is significantly distorted. We evaluate our method over several challenging video sequences, and compare our results with both per-pixel and Markov Random Field based models. Our results show the effectiveness of our approach in reducing incorrect classifications.
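
The probabilistic classification described in this abstract can be sketched as follows. Each model component is a Gaussian over a joint (x, y, r, g, b) vector and carries a background or foreground label; a new pixel takes the label of the component with the highest weighted likelihood. The diagonal covariance, the dictionary layout, and the function names are assumptions made for brevity, and the adaptive update of the components is omitted.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian (an assumed simplification)."""
    return sum(
        -0.5 * ((xi - mi) ** 2 / vi + math.log(2.0 * math.pi * vi))
        for xi, mi, vi in zip(x, mean, var)
    )


def classify(pixel, components):
    """Return the label of the component that best explains the pixel.

    pixel: (x, y, r, g, b) observation.
    components: list of dicts with keys "weight", "mean", "var", "label".
    """
    best = max(
        components,
        key=lambda c: math.log(c["weight"]) + log_gaussian(pixel, c["mean"], c["var"]),
    )
    return best["label"]
```

Because position is part of the feature vector, a component with a wide spatial variance can still claim a pixel whose background texture has shifted by a few pixels, which is the effect the abstract attributes to the spatial variance of the components.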

University of Lincoln Institutional Repository

Crossref

Nottingham Trent Institutional Repository (IRep)

Sheffield Hallam University Research Archive