2,582 research outputs found

    Multiple cue integration for robust tracking in dynamic environments: application to video relighting

    Motion analysis and object tracking have been a principal focus of attention within the computer vision community over the past two decades. Interest in this research area lies in its wide range of applications, extending from autonomous vehicle and robot navigation to entertainment and virtual reality. Even though impressive results have been obtained for specific problems, object tracking remains an open problem, since available methods tend to be sensitive to several artifacts and non-stationary environmental conditions, such as unpredictable target movements, gradual or abrupt changes of illumination, proximity of similar objects, or cluttered backgrounds. Multiple cue integration has been shown to enhance the robustness of tracking algorithms against such disturbances. In recent years, owing to increasing computing power, there has been significant interest in building complex tracking systems that simultaneously consider multiple cues. However, most of these algorithms are based on heuristics and ad hoc rules formulated for specific applications, making it impossible to extrapolate them to new environmental conditions.
    In this dissertation we propose a general probabilistic framework to integrate as many object features as necessary, permitting them to interact mutually in order to obtain a precise estimate of the object's state and thus of the target position. This framework is used to design a tracking algorithm, which is validated on several video sequences involving abrupt position and illumination changes, target camouflaging and non-rigid deformations. Among the features used to represent the target, it is important to point out the robust parameterization of the target color in an object-dependent colorspace, which distinguishes the object from the background more clearly than other colorspaces commonly used in the literature.
    In the last part of the dissertation, we design an approach for relighting static and moving scenes with unknown geometry. The relighting is performed through an 'image-based' methodology, in which rendering under new lighting conditions is achieved by linear combinations of a set of pre-acquired reference images of the scene illuminated by known light patterns. Since the placement and brightness of the light sources composing such light patterns can be controlled, it is natural to ask: what is the optimal way to illuminate the scene so as to reduce the number of reference images that are needed? We show that the best way to light the scene (i.e., the way that minimizes the number of reference images) is not to use a sequence of single, compact light sources, as is most commonly done, but rather to use a sequence of lighting patterns given by an object-dependent lighting basis. It is important to note that when relighting video sequences, consecutive images need to be aligned with respect to a common coordinate frame. However, since each frame is generated by a different light pattern illuminating the scene, abrupt illumination changes occur between consecutive reference images. Under these circumstances, the tracking framework designed in this dissertation plays a central role. Finally, we present several relighting results on real video sequences of moving objects, moving faces, and scenes containing both. In each case, although a single video clip was captured, we are able to relight it again and again, controlling the lighting direction, extent, and color.
    Postprint (published version)
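As a rough illustration of the image-based relighting idea in this abstract, the sketch below forms a novel image as a linear combination of reference images. It is a minimal example under stated assumptions, not the thesis implementation: the function name `relight`, the tiny 2x2 grayscale images, and the weights are all hypothetical, and the object-dependent lighting basis itself is not computed here.

```python
def relight(reference_images, weights):
    """Return the weighted sum of reference images (lists of rows of floats)."""
    if len(reference_images) != len(weights):
        raise ValueError("one weight per reference image is required")
    height = len(reference_images[0])
    width = len(reference_images[0][0])
    out = [[0.0] * width for _ in range(height)]
    for image, w in zip(reference_images, weights):
        for y in range(height):
            for x in range(width):
                out[y][x] += w * image[y][x]
    return out


# Two hypothetical 2x2 grayscale reference images, each captured under one
# known light pattern (values are made up for illustration).
lit_left = [[1.0, 0.0], [0.5, 0.0]]
lit_right = [[0.0, 1.0], [0.0, 0.5]]

# A new lighting condition expressed as weights on the known patterns.
novel = relight([lit_left, lit_right], [0.5, 2.0])
```

The weights play the role of the new lighting condition expressed in terms of the known patterns; a smaller set of reference images would simply mean a shorter weight vector.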

    Dependent multiple cue integration for robust tracking

    We propose a new technique for fusing multiple cues to robustly segment an object from its background in video sequences that suffer from abrupt changes of both illumination and position of the target. Robustness is achieved by the integration of appearance and geometric object features and by their estimation using Bayesian filters, such as Kalman or particle filters. In particular, each filter estimates the state of a specific object feature, conditionally dependent on another feature estimated by a distinct filter. This dependence provides improved target representations, permitting us to segment the target from the background even in nonstationary sequences. Considering that the procedure of the Bayesian filters may be described by a "hypotheses generation-hypotheses correction" strategy, the major novelty of our methodology compared to previous approaches is that the mutual dependence between filters is considered during the feature observation, that is, in the "hypotheses-correction" stage, instead of when generating the hypotheses. This proves to be much more effective in terms of accuracy and reliability. The proposed method is analytically justified and applied to develop a robust tracking system that simultaneously adapts, online, the color space in which the image points are represented, the color distributions, the contour of the object, and its bounding box. Results with synthetic data and real video sequences demonstrate the robustness and versatility of our method. Peer Reviewed
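To make the "hypotheses generation-hypotheses correction" description concrete, here is a minimal one-dimensional Kalman filter sketch. It is an illustration only: the constant-position motion model, the function names `predict` and `correct`, and all numeric values are assumptions, and the dependence between several filters that the paper introduces in the correction stage is not modelled.

```python
def predict(mean, var, process_var):
    """Hypothesis generation: a constant-position model keeps the mean
    and inflates the uncertainty by the process noise."""
    return mean, var + process_var


def correct(mean, var, measurement, measurement_var):
    """Hypothesis correction: blend the prediction with the observation."""
    gain = var / (var + measurement_var)
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


# Hypothetical run: prior N(0, 1), three identical observations at 1.0.
mean, var = 0.0, 1.0
for z in [1.0, 1.0, 1.0]:
    mean, var = predict(mean, var, process_var=0.0)
    mean, var = correct(mean, var, z, measurement_var=1.0)
```

In a multi-cue setting one could imagine the `correct` step of one filter receiving information from another filter's estimate; how that is done in the paper is not shown here.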

    A target dependent colorspace for robust tracking

    Presented at the 18th International Conference on Pattern Recognition (ICPR), held in 2006 in Hong Kong (China). The selection of the appropriate colorspace for tracking applications has not previously been considered in the literature. Many color representations have been suggested, based on invariance to illumination changes. Nevertheless, none of them is invariant enough to deal with general and unconstrained environments. In tracking tasks, we might prefer to represent image pixels in a colorspace where the distance between the target and background color points is maximized, simplifying the task of the tracker. Based on this criterion, we propose an 'object dependent' colorspace, which is computed by a simple calibration procedure before tracking. Furthermore, this colorspace may be easily adapted at each frame. Synthetic and real experiments show that this colorspace allows better discrimination of the foreground and background, and permits tracking in circumstances where the same tracking algorithm relying on other colorspaces would fail. This work was supported by the project 'Integration of robust perception, learning, and navigation systems in mobile robotics' (J-0929). This work was supported by CICYT project DPI2004-05414 from the Spanish Ministry of Science and Technology. Peer Reviewed
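One way to picture a colorspace that maximizes the separation between target and background colors is a Fisher linear discriminant axis in RGB. The sketch below is an assumption made for illustration: the abstract does not state which criterion or solver the authors use, and the function names, the `ridge` regularizer, and the sample pixels are all hypothetical.

```python
def mean_vector(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(3)]


def scatter_matrix(points, mean):
    s = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                s[i][j] += d[i] * d[j]
    return s


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def fisher_axis(target, background, ridge=1e-6):
    """Unit vector w proportional to Sw^-1 (mean_target - mean_background)."""
    mt, mb = mean_vector(target), mean_vector(background)
    st, sb = scatter_matrix(target, mt), scatter_matrix(background, mb)
    # Within-class scatter, with a small ridge so the system stays solvable.
    sw = [[st[i][j] + sb[i][j] + (ridge if i == j else 0.0) for j in range(3)]
          for i in range(3)]
    w = solve3(sw, [mt[i] - mb[i] for i in range(3)])
    norm = sum(v * v for v in w) ** 0.5
    return [v / norm for v in w]


def project(pixel, axis):
    return sum(p * a for p, a in zip(pixel, axis))


# Hypothetical calibration samples (RGB in [0, 1]): reddish target, greenish background.
target = [(0.9, 0.2, 0.1), (0.8, 0.3, 0.5), (0.85, 0.25, 0.9)]
background = [(0.2, 0.8, 0.1), (0.3, 0.9, 0.5), (0.25, 0.85, 0.9)]
axis = fisher_axis(target, background)
target_proj = [project(p, axis) for p in target]
background_proj = [project(p, axis) for p in background]
```

Recomputing `axis` from fresh samples at each frame would be one simple way to mimic the per-frame adaptation the abstract mentions.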

    Factorized Topic Models

    In this paper we present a modification to a latent topic model, which makes the model exploit supervision to produce a factorized representation of the observed data. Through the introduction of a new prior over the topic space, the structured parameterization separates variance that is shared between classes from variance that is private to each class. The approach allows for more efficient inference and provides an intuitive interpretation of the data in terms of an informative signal together with structured noise. The factorized representation is shown to enhance inference performance for image, text, and video classification. Comment: ICLR 201

    Change detection in optical aerial images by a multilayer conditional mixed Markov model

    In this paper we propose a probabilistic model for detecting relevant changes in registered aerial image pairs taken several years apart and in different seasonal conditions. The introduced approach, called the Conditional Mixed Markov model (CXM), is a combination of a mixed Markov model and a conditionally independent random field of signals. The model integrates global intensity statistics with local correlation and contrast features. A global energy optimization process simultaneously ensures optimal local feature selection and smooth, observation-consistent segmentation. Validation is given on real aerial image sets provided by the Hungarian Institute of Geodesy, Cartography and Remote Sensing and Google Earth.

    Learning object boundary detection from motion data

    A significant barrier to applying the techniques of machine learning to the domain of object boundary detection is the need to obtain a large database of correctly labeled examples. Inspired by developmental psychology, this paper proposes that boundary detection can be learned from the output of a motion tracking algorithm that separates moving objects from their static surroundings. Motion segmentation solves the database problem by providing cheap, unlimited, labeled training data. A probabilistic model of the textural and shape properties of object boundaries can be trained from this data and then used to efficiently detect boundaries in novel images via loopy belief propagation. Singapore-MIT Alliance (SMA)

    Active appearance pyramids for object parametrisation and fitting

    Object class representation is one of the key problems in various medical image analysis tasks. We propose a part-based parametric appearance model we refer to as an Active Appearance Pyramid (AAP). The parts are delineated by multi-scale Local Feature Pyramids (LFPs) for superior spatial specificity and distinctiveness. An AAP models the variability within a population with local translations of multi-scale parts and linear appearance variations of the assembly of the parts. It can fit and represent new instances by adjusting the shape and appearance parameters. The fitting process uses a two-step iterative strategy: local landmark searching followed by shape regularisation. We present a simultaneous local feature searching and appearance fitting algorithm based on the weighted Lucas and Kanade method. A shape regulariser is derived to calculate the maximum likelihood shape with respect to the prior and multiple landmark candidates from multi-scale LFPs, with a compact closed-form solution. We apply the 2D AAP to the modelling of variability in patients with lumbar spinal stenosis (LSS) and validate its performance on 200 studies consisting of routine axial and sagittal MRI scans. Intervertebral sagittal and parasagittal cross-sections are typically used for the diagnosis of LSS; we therefore build three AAPs on L3/4, L4/5 and L5/S1 axial cross-sections and three on parasagittal slices. Experiments show significant improvement in convergence range, robustness to local minima and segmentation precision compared with Constrained Local Models (CLMs), Active Shape Models (ASMs) and Active Appearance Models (AAMs), as well as superior performance in appearance reconstruction compared with AAMs. We also validate the performance on 3D CT volumes of hip joints from 38 studies. Compared to AAMs, AAPs achieve higher segmentation and reconstruction precision. Moreover, AAPs are significantly more efficient, consuming about half the memory, less than 10% of the training time and 15% of the testing time.

    Avtonomna segmentacija slik z Markovim slučajnim poljem (Autonomous image segmentation with a Markov random field)

    Image segmentation is a widely researched topic with many algorithms available. Our goal is to segment an image, in an unsupervised way, into several coherent parts with the help of superpixels. To achieve that, we propose an iterative segmentation algorithm. The algorithm models the image as a Markov random field (MRF) whose nodes are the superpixels, each node having both color and texture features. The superpixels are assigned labels according to their features with the help of support vector machines (SVMs) and the aforementioned MRF, and the number of segments is iteratively reduced. The uncertain segmentation is refined after each iteration, and the result is a segmentation of the image into several semantically meaningful regions without requiring any user input. The segmentation algorithm was tested on a standard evaluation database and performs on par with state-of-the-art segmentation algorithms in terms of F-measure. In terms of oversegmentation, our approach significantly outperforms the state of the art by greatly reducing the number of segments that make up the object of interest.
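The superpixel MRF described in this abstract can be pictured with a small labeling sketch. The example below uses iterated conditional modes (ICM) with a Potts smoothness term, which is a technique chosen here for illustration and not necessarily the authors' optimizer. The unary costs stand in for classifier outputs (for instance from an SVM), and the function name `icm`, the graph, and all numbers are hypothetical.

```python
def icm(unary, edges, smoothness=1.0, max_sweeps=10):
    """Label graph nodes by iterated conditional modes.

    unary[i][k] is the cost of giving node i label k; each edge (a, b)
    adds `smoothness` whenever the two endpoints take different labels.
    """
    n = len(unary)
    neighbours = [[] for _ in range(n)]
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    # Start from the label with the lowest unary cost at each node.
    labels = [min(range(len(u)), key=u.__getitem__) for u in unary]
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            def cost(label):
                disagreements = sum(1 for j in neighbours[i] if labels[j] != label)
                return unary[i][label] + smoothness * disagreements
            best = min(range(len(unary[i])), key=cost)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# A chain of five hypothetical superpixels; node 2 weakly prefers label 1.
unary = [[0.0, 2.0], [0.0, 2.0], [0.6, 0.4], [0.0, 2.0], [0.0, 2.0]]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
smoothed = icm(unary, edges, smoothness=1.0)
unsmoothed = icm(unary, edges, smoothness=0.0)
```

With the smoothness term switched on, the weakly supported label at node 2 is absorbed by its neighbours, which loosely mirrors how the number of segments shrinks over iterations.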