Segmenting Foreground Objects from a Dynamic Textured Background via a Robust Kalman Filter
The algorithm presented in this paper aims to segment the foreground objects in video (e.g., people) given time-varying, textured backgrounds. Examples of time-varying backgrounds include waves on water, moving clouds, trees waving in the wind, automobile traffic, moving crowds, escalators, etc. We have developed a novel foreground-background segmentation algorithm that explicitly accounts for the non-stationary nature and clutter-like appearance of many dynamic textures. The dynamic texture is modeled by an autoregressive moving average (ARMA) model. A robust Kalman filter algorithm iteratively estimates the intrinsic appearance of the dynamic texture, as well as the regions of the foreground objects. Preliminary experiments with this method have demonstrated promising results.
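The robust-Kalman idea above can be illustrated with a minimal sketch: each pixel carries a Kalman state for the background appearance, and pixels whose innovation is too large are flagged as foreground and excluded from the update. This is a simplified stand-in with random-walk dynamics, not the paper's ARMA dynamic-texture model; all parameter names and thresholds here are illustrative assumptions.

```python
import numpy as np

def robust_kalman_background(frames, q=1e-3, r=0.05, k=2.5):
    """Per-pixel robust Kalman filter: pixels whose innovation exceeds
    k predicted standard deviations are treated as foreground and
    down-weighted (gain set to zero). Simplified sketch; the paper
    models the background with an ARMA dynamic-texture model instead
    of this random-walk state. q, r, k are assumed tuning parameters."""
    frames = np.asarray(frames, dtype=float)
    x = frames[0].copy()           # state estimate: background appearance
    p = np.full_like(x, 1.0)       # per-pixel estimate variance
    masks = []                     # foreground masks, one per later frame
    for z in frames[1:]:
        p_pred = p + q                        # predict (random-walk dynamics)
        innov = z - x                         # innovation
        sigma = np.sqrt(p_pred + r)           # predicted innovation std
        outlier = np.abs(innov) > k * sigma   # foreground hypothesis
        gain = p_pred / (p_pred + r)          # standard Kalman gain
        gain = np.where(outlier, 0.0, gain)   # robust step: ignore outliers
        x = x + gain * innov                  # update background estimate
        p = (1.0 - gain) * p_pred             # update variance
        masks.append(outlier)
    return x, masks
```

The robustification is the single `np.where` line: without it a bright foreground object would be absorbed into the background estimate within a few frames.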
Multi-Scale 3D Scene Flow from Binocular Stereo Sequences
Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization, two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach.
National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108)
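The geometry behind fusing stereo and optical flow can be sketched in closed form for a single pixel: disparity gives depth, the flow vector and the disparity change give the point's position in the next frame, and the difference is the 3D motion. This is a deterministic toy version under an assumed rectified pinhole stereo rig; the paper instead propagates full probability distributions over flow and disparity through a regularized solver.

```python
def scene_flow_point(x, y, d, u, v, dd, f, B):
    """Recover the 3D position and 3D motion of one pixel from stereo
    disparity d, optical flow (u, v), and disparity change dd.
    f is the focal length in pixels, B the stereo baseline in meters;
    image coordinates are taken relative to the principal point.
    Hypothetical closed-form sketch, not the paper's probabilistic method."""
    Z = f * B / d                   # depth from disparity (rectified stereo)
    X = x * Z / f                   # back-project pixel to 3D
    Y = y * Z / f
    Z2 = f * B / (d + dd)           # depth in the next frame
    X2 = (x + u) * Z2 / f           # next-frame 3D position via optical flow
    Y2 = (y + v) * Z2 / f
    return (X, Y, Z), (X2 - X, Y2 - Y, Z2 - Z)
```

The fragility of this per-pixel formula (noise in d and dd is amplified by the division) is exactly why the paper's uncertainty-aware, regularized formulation pays off.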
Joint Material and Illumination Estimation from Photo Sets in the Wild
Faithful manipulation of shape, material, and illumination in 2D Internet images would greatly benefit from a reliable factorization of appearance into material (i.e., diffuse and specular) and illumination (i.e., environment maps). On the one hand, current methods that produce very high-fidelity results typically require controlled settings, expensive devices, or significant manual effort. On the other hand, methods that are automatic and work on 'in the wild' Internet images often extract only low-frequency lighting or diffuse materials. In this work, we propose to make use of a set of photographs in order to jointly estimate the non-diffuse materials and sharp lighting in an uncontrolled setting. Our key observation is that seeing multiple instances of the same material under different illumination (i.e., environments), and different materials under the same illumination, provides valuable constraints that can be exploited to yield a high-quality solution (i.e., specular materials and environment illumination) for all the observed materials and environments. Similar constraints also arise when observing multiple materials in a single environment, or a single material across multiple environments. The core of this approach is an optimization procedure that uses two neural networks, trained on synthetic images, to predict good gradients in parametric space given observations of reflected light. We evaluate our method on a range of synthetic and real examples to generate high-quality estimates, qualitatively compare our results against state-of-the-art alternatives via a user study, and demonstrate photo-consistent image manipulation that is otherwise very challenging to achieve.
Real-Time Human Motion Capture with Multiple Depth Cameras
Commonly used human motion capture systems require intrusive attachment of
markers that are visually tracked with multiple cameras. In this work we
present an efficient and inexpensive solution to markerless motion capture
using only a few Kinect sensors. Unlike previous work on 3D pose estimation
using a single depth camera, we relax constraints on the camera location and do
not assume a co-operative user. We apply recent image segmentation techniques
to depth images and use curriculum learning to train our system on purely
synthetic data. Our method accurately localizes body parts without requiring an
explicit shape model. The body joint locations are then recovered by combining
evidence from multiple views in real-time. We also introduce a dataset of ~6
million synthetic depth frames for pose estimation from multiple cameras and
exceed state-of-the-art results on the Berkeley MHAD dataset.
Comment: Accepted to Computer Robot Vision 201
A survey on 2d object tracking in digital video
This paper presents object tracking methods in video. Different algorithms based on rigid, non-rigid, and articulated object tracking are studied. The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. It is often the case that tracking objects in consecutive frames is supported by a prediction scheme. Based on information extracted from previous frames and any high-level information that can be obtained, the state (location) of the object is predicted. An excellent framework for prediction is the Kalman filter, which additionally estimates the prediction error. In complex scenes, instead of a single hypothesis, multiple hypotheses using a particle filter can be used. Different techniques are given for different types of constraints in video.
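The prediction scheme described above can be sketched with a constant-velocity Kalman filter: the state holds position and velocity, the predict step extrapolates the object's location before the next frame arrives, and the update step corrects it with the new detection. This is an illustrative textbook filter under an assumed constant-velocity motion model, not code from any tracker surveyed here; the noise parameters are arbitrary assumptions.

```python
import numpy as np

def kalman_track(measurements, dt=1.0, q=1e-2, r=1.0):
    """Constant-velocity Kalman filter for 2D object tracking.
    State is [x, y, vx, vy]; measurements are detected (x, y) locations.
    Returns the final state and the per-frame predicted locations
    (the predictions a tracker would use before seeing each detection).
    q, r are assumed process/measurement noise variances."""
    F = np.array([[1, 0, dt, 0],           # state transition: x += vx * dt
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], float)
    H = np.array([[1, 0, 0, 0],            # we observe position only
                  [0, 1, 0, 0]], float)
    Q, R = q * np.eye(4), r * np.eye(2)
    x = np.zeros(4)
    x[:2] = measurements[0]                # initialize at first detection
    P = 10.0 * np.eye(4)                   # large initial uncertainty
    preds = []
    for z in measurements[1:]:
        x = F @ x                          # predict next state
        P = F @ P @ F.T + Q
        preds.append(x[:2].copy())         # predicted object location
        S = H @ P @ H.T + R                # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
        x = x + K @ (np.asarray(z, float) - H @ x)   # correct with detection
        P = (np.eye(4) - K @ H) @ P        # (P also gives prediction error)
    return x, preds
```

The innovation covariance `S` is the filter's own estimate of its prediction error, which is the property the survey highlights; a particle filter replaces this single Gaussian hypothesis with a weighted set of samples.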
Analysis of Retinal Image Data to Support Glaucoma Diagnosis
A fundus camera is a widely available imaging device enabling fast and cheap examination of the posterior segment of the eye, the retina. Hence, many researchers focus on the development of automatic methods for the assessment of various retinal diseases via fundus images. This dissertation summarizes the recent state-of-the-art in the field of glaucoma diagnosis using a fundus camera and proposes a novel methodology for assessment of the retinal nerve fiber layer (RNFL) via texture analysis. Along with it, a method for retinal blood vessel segmentation is introduced as an additional valuable contribution to the recent state-of-the-art in retinal image processing. Segmentation of the blood vessels also serves as a necessary step preceding evaluation of the RNFL via the proposed methodology. In addition, a new publicly available high-resolution retinal image database with gold-standard data is introduced as a novel opportunity for other researchers to evaluate their segmentation algorithms.
- …