
    Segmenting Foreground Objects from a Dynamic Textured Background via a Robust Kalman Filter

    The algorithm presented in this paper aims to segment foreground objects (e.g., people) in video given time-varying, textured backgrounds. Examples of time-varying backgrounds include waves on water, moving clouds, trees waving in the wind, automobile traffic, moving crowds, escalators, etc. We have developed a novel foreground-background segmentation algorithm that explicitly accounts for the non-stationary nature and clutter-like appearance of many dynamic textures. The dynamic texture is modeled by an autoregressive moving average (ARMA) model. A robust Kalman filter algorithm iteratively estimates the intrinsic appearance of the dynamic texture, as well as the regions of the foreground objects. Preliminary experiments with this method have demonstrated promising results.
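    The robust-Kalman idea above can be sketched per pixel: a scalar filter tracks the background intensity, and pixels whose innovation exceeds a gate are flagged as foreground rather than absorbed into the state. This is a minimal illustration, not the paper's ARMA-based formulation; all parameter values below are assumed.

```python
import numpy as np

def robust_kalman_background(frames, q=0.01, r=0.5, k=2.5):
    """Per-pixel robust Kalman filter: estimate a slowly varying background
    intensity and flag outlier pixels as foreground.
    q: process noise, r: measurement noise, k: gate width in std devs.
    (Illustrative parameters, not from the paper.)"""
    bg = frames[0].astype(float)          # state estimate (background)
    p = np.ones_like(bg)                  # state variance
    masks = []
    for frame in frames[1:]:
        z = frame.astype(float)
        p_pred = p + q                    # predict
        innov = z - bg                    # innovation (residual)
        s = p_pred + r                    # innovation variance
        outlier = np.abs(innov) > k * np.sqrt(s)   # robust gating
        gain = np.where(outlier, 0.0, p_pred / s)  # ignore foreground pixels
        bg = bg + gain * innov            # update background estimate
        p = (1.0 - gain) * p_pred
        masks.append(outlier)             # foreground mask for this frame
    return bg, masks
```

    Gating the update (zero gain on outliers) is what makes the filter "robust": foreground pixels do not corrupt the background model.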

    Multi-Scale 3D Scene Flow from Binocular Stereo Sequences

    Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization, two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach. National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108)
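    As a minimal illustration of fusing stereo and motion (ignoring the probabilistic, multi-scale machinery of the paper), back-projecting matched points at two time steps through a rectified stereo model yields 3D scene flow vectors. The focal length, baseline, and principal point below are placeholder values.

```python
import numpy as np

def scene_flow_from_stereo(pts, flow, disp_t0, disp_t1,
                           f=700.0, B=0.12, cx=320.0, cy=240.0):
    """Recover per-point 3D scene flow from 2D optical flow plus disparity
    at two time steps (rectified stereo, left camera as reference).
    f: focal length [px], B: baseline [m] -- illustrative values."""
    def backproject(xy, d):
        z = f * B / d                      # depth from disparity
        x = (xy[:, 0] - cx) * z / f
        y = (xy[:, 1] - cy) * z / f
        return np.stack([x, y, z], axis=1)

    p0 = backproject(pts, disp_t0)             # 3D points at time t
    p1 = backproject(pts + flow, disp_t1)      # 3D points at time t+1
    return p1 - p0                             # 3D scene flow vectors
```

    The paper's contribution is precisely in estimating `flow` and the disparities reliably (with uncertainty); the geometric step shown here is the easy part.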

    Joint Material and Illumination Estimation from Photo Sets in the Wild

    Faithful manipulation of shape, material, and illumination in 2D Internet images would greatly benefit from a reliable factorization of appearance into material (i.e., diffuse and specular) and illumination (i.e., environment maps). On the one hand, current methods that produce very high fidelity results typically require controlled settings, expensive devices, or significant manual effort. On the other hand, methods that are automatic and work on 'in the wild' Internet images often extract only low-frequency lighting or diffuse materials. In this work, we propose to make use of a set of photographs in order to jointly estimate the non-diffuse materials and sharp lighting in an uncontrolled setting. Our key observation is that seeing multiple instances of the same material under different illumination (i.e., environment), and different materials under the same illumination, provides valuable constraints that can be exploited to yield a high-quality solution (i.e., specular materials and environment illumination) for all the observed materials and environments. Similar constraints also arise when observing multiple materials in a single environment, or a single material across multiple environments. The core of this approach is an optimization procedure that uses two neural networks that are trained on synthetic images to predict good gradients in parametric space given observation of reflected light. We evaluate our method on a range of synthetic and real examples to generate high-quality estimates, qualitatively compare our results against state-of-the-art alternatives via a user study, and demonstrate photo-consistent image manipulation that is otherwise very challenging to achieve.
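    The shared-material/shared-illumination constraint can be illustrated with a toy alternating least-squares factorization: if, very crudely (scalar, Lambertian-only), each observation of material i under illumination j is `albedo[i] * light[j]`, then observing the grid of combinations pins down both factors up to a global scale. This sketch stands in for, and greatly simplifies, the network-guided optimization described in the abstract.

```python
import numpy as np

def factor_appearance(obs, n_iter=200):
    """Toy alternating least squares for the joint constraint: model the
    observation of material i under illumination j as albedo[i] * light[j]
    (a drastic Lambertian simplification, for illustration only).
    obs: (n_materials, n_lights) matrix of observed intensities."""
    n_m, n_l = obs.shape
    albedo = np.ones(n_m)
    light = np.ones(n_l)
    for _ in range(n_iter):
        # fix lights, least-squares update of each material's albedo
        albedo = (obs @ light) / (light @ light)
        # fix albedos, least-squares update of each illumination
        light = (obs.T @ albedo) / (albedo @ albedo)
    s = light[0]                 # fix the albedo*light scale ambiguity
    return albedo * s, light / s
```

    The same scale (gauge) ambiguity exists in the real problem, which is one reason multiple photos sharing materials and environments are so valuable.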

    Real-Time Human Motion Capture with Multiple Depth Cameras

    Commonly used human motion capture systems require intrusive attachment of markers that are visually tracked with multiple cameras. In this work we present an efficient and inexpensive solution to markerless motion capture using only a few Kinect sensors. Unlike previous work on 3D pose estimation using a single depth camera, we relax constraints on the camera location and do not assume a co-operative user. We apply recent image segmentation techniques to depth images and use curriculum learning to train our system on purely synthetic data. Our method accurately localizes body parts without requiring an explicit shape model. The body joint locations are then recovered by combining evidence from multiple views in real-time. We also introduce a dataset of ~6 million synthetic depth frames for pose estimation from multiple cameras and exceed state-of-the-art results on the Berkeley MHAD dataset. Comment: Accepted to computer robot vision 201
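    Combining joint evidence from multiple calibrated views reduces, at its core, to triangulation: each 2D joint detection contributes two linear constraints on the 3D joint position. A standard DLT least-squares sketch is shown below (this is not the authors' pipeline; the projection matrices are assumed to be given by calibration).

```python
import numpy as np

def triangulate_joint(projs, pts2d):
    """Recover a 3D joint location from its 2D detections in several
    calibrated views via linear (DLT) triangulation.
    projs: list of 3x4 camera projection matrices; pts2d: list of (x, y)."""
    rows = []
    for P, (x, y) in zip(projs, pts2d):
        rows.append(x * P[2] - P[0])      # each view contributes two
        rows.append(y * P[2] - P[1])      # linear constraints on X
    A = np.stack(rows)
    _, _, vt = np.linalg.svd(A)           # least-squares null vector
    X = vt[-1]                            # homogeneous 3D point
    return X[:3] / X[3]
```

    With noisy per-view detections, the SVD solution naturally averages the evidence; views with no confident detection can simply be left out.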

    A survey on 2D object tracking in digital video

    This paper presents object tracking methods in video. Different algorithms based on rigid, non-rigid, and articulated object tracking are studied. The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. It is often the case that tracking objects in consecutive frames is supported by a prediction scheme. Based on information extracted from previous frames and any high-level information that can be obtained, the state (location) of the object is predicted. An excellent framework for prediction is the Kalman filter, which additionally estimates the prediction error. In complex scenes, instead of a single hypothesis, multiple hypotheses can be maintained using a particle filter. Different techniques are given for different types of constraints in video.
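    The multiple-hypothesis idea can be sketched as a minimal bootstrap particle filter for a 1D position: each particle is one hypothesis of the object's state, importance weights come from the measurement likelihood, and resampling concentrates particles on the likely hypotheses. All noise parameters below are illustrative.

```python
import numpy as np

def particle_filter_track(measurements, n=500, motion_std=1.0,
                          meas_std=2.0, seed=0):
    """Minimal bootstrap particle filter for a 1D position.
    Each particle is one hypothesis; resampling keeps likely ones alive."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(measurements[0], meas_std, size=n)
    estimates = []
    for z in measurements:
        particles += rng.normal(0.0, motion_std, size=n)      # predict
        w = np.exp(-0.5 * ((z - particles) / meas_std) ** 2)  # likelihood
        w /= w.sum()
        idx = rng.choice(n, size=n, p=w)                      # resample
        particles = particles[idx]
        estimates.append(particles.mean())                    # point estimate
    return estimates
```

    Unlike the Kalman filter, the particle set can represent multi-modal beliefs, which is exactly what is needed when clutter produces several plausible object locations.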

    Analysis of Retinal Image Data to Support Glaucoma Diagnosis

    Fundus camera is a widely available imaging device enabling fast and cheap examination of the human retina. Hence, many researchers focus on the development of automatic methods for the assessment of various retinal diseases via fundus images. This dissertation summarizes the recent state-of-the-art in the field of glaucoma diagnosis using a fundus camera and proposes a novel methodology for assessment of the retinal nerve fiber layer (RNFL) via texture analysis. Along with it, a method for retinal blood vessel segmentation is introduced as an additional valuable contribution to the recent state-of-the-art in the field of retinal image processing. Segmentation of the blood vessels also serves as a necessary step preceding evaluation of the RNFL via the proposed methodology. In addition, a new publicly available high-resolution retinal image database with gold-standard data is introduced as a novel opportunity for other researchers to evaluate their segmentation algorithms.
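    As a toy stand-in for vessel segmentation (the dissertation's actual method is not detailed in the abstract), one can exploit the fact that vessels appear darker than their surroundings in the green channel of a fundus image and threshold against a local mean. The window size and threshold below are assumptions for illustration.

```python
import numpy as np

def segment_vessels(img, win=7, k=1.0):
    """Crude vessel-segmentation sketch: flag pixels that fall more than
    k local std devs below a local box-filter mean (vessels are dark).
    Not the dissertation's method -- an illustrative baseline only."""
    pad = win // 2
    padded = np.pad(img.astype(float), pad, mode='reflect')
    # local mean/std via sliding windows (numpy >= 1.20)
    windows = np.lib.stride_tricks.sliding_window_view(padded, (win, win))
    local_mean = windows.mean(axis=(2, 3))
    local_std = windows.std(axis=(2, 3))
    return img < local_mean - k * local_std      # boolean vessel mask
```

    A gold-standard database such as the one introduced here is what allows baselines like this to be compared quantitatively against serious segmentation methods.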