10 research outputs found

    A general method for the point of regard estimation in 3D space

    Implementing a multi-model estimation method

    This work is part of a general attempt to understand parametric adaptation in visual perception. The key idea is to analyze how multi-model parametric estimation may be used as a first step towards categorization. More generally, the goal is to formalize how the notion of "objects" or "events" in an application may be reduced to a choice in a hierarchy of parametric models used to estimate the underlying data categorization. These mechanisms are to be linked with what occurs in the cerebral cortex, where object recognition corresponds to a parametric neuronal estimation (see for instance Page 2000 for a discussion and Freedman et al. 2001 for an example regarding the primate visual cortex). We thus hope to contribute an algorithmic element related to the "grandmother" neuron model. We revisit the problem of parameter estimation in computer vision, presented here as a simple optimization problem, considering (i) non-linear implicit measurement equations and parameter constraints, (ii) robust estimation in the presence of outliers, and (iii) multi-model comparisons. Here, (1) a projection algorithm based on generalizations of square-root decompositions allows an efficient and numerically stable local resolution of a set of non-linear equations. In addition, (2) a robust estimation module for a hierarchy of non-linear models has been designed and validated. Going one step further, the software architecture of the estimation module is discussed, with the goal of integrating it into reactive software environments or into applications with time constraints.
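    The combination of robust estimation and a choice within a model hierarchy can be illustrated with a minimal sketch. It is not the method of the paper: it assumes a toy two-level hierarchy (constant, then line), swaps in median-based fitting (a Theil-Sen line) as the robust estimator, and uses a hypothetical tolerance `tol` on the median absolute residual as the selection rule. All function names are illustrative.

```python
from statistics import median


def fit_constant(xs, ys):
    """Robust constant model: the median of the observations."""
    c = median(ys)
    return lambda x: c


def fit_line(xs, ys):
    """Robust line model (Theil-Sen): median of pairwise slopes."""
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i in range(len(xs))
        for j in range(i + 1, len(xs))
        if xs[j] != xs[i]
    ]
    m = median(slopes)
    b = median(y - m * x for x, y in zip(xs, ys))
    return lambda x: m * x + b


def select_model(xs, ys, tol):
    """Return the simplest model whose median absolute residual is <= tol.

    Assumption: models are tried from simplest to most general, and the
    most general one is returned when none meets the tolerance.
    """
    hierarchy = (("constant", fit_constant), ("line", fit_line))
    chosen = None
    for name, fit in hierarchy:
        model = fit(xs, ys)
        chosen = (name, model)
        spread = median(abs(y - model(x)) for x, y in zip(xs, ys))
        if spread <= tol:
            break
    return chosen
```

    Because both the fit and the selection criterion rely on medians, a small fraction of gross outliers does not push the selection towards a needlessly complex model.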

    Methods for Recognizing Pose and Action of Articulated Objects with Collection of Planes in Motion

    The invention comprises an improved system, method, and computer-readable instructions for recognizing the pose and action of articulated objects with a collection of planes in motion. The method starts with a video sequence and a database of reference sequences corresponding to different known actions. From the reference sequences, the method identifies the one in which the subject performs the action closest to that observed. The method compares actions by comparing pose transitions. The cross-homography invariant may be used for view-invariant recognition of human body pose transitions and actions.

    A three-parameter affine approximation of focus and zoom variations

    This work aims to develop a dynamic approximation to the variation in a lens's intrinsic parameters when the zoom and focus parameters are modified. This approximation is built by tracking two generic points in a monocular image sequence. Our preliminary analysis demonstrates that a particular three-parameter affine model is sufficient to describe these modifications. The general affine model is not justified on mathematical or physical grounds: the mathematical transformation to be used has only five parameters, instead of six, and an analysis of the physics reveals that three parameters are sufficient. Experimentally, this approximation is entirely valid, with the precision being better than 1.5 pixels in almost every case. Using a least-squares method, we obtain very simple equations in which the precision of the estimate increases with the number of available correspondences.
    French abstract (translated): In this work, we dynamically approximate the change in focal length and focus by tracking any two points between two images taken by the same camera. More precisely, we study the variation of the intrinsic parameters during a change of focus and zoom. This analysis shows that a three-parameter affine transformation model is entirely sufficient, and that a general affine transformation model is not justified, because the transformation to be used has, mathematically, only 5 parameters rather than 6, while the physical analysis of the system shows that 3 parameters suffice. Experimentally, the model is justified, because the precision is better than 1.5 pixels for variations of focus and in all cases better than with the general model. A least-squares method yields very simple equations that give a precision depending on the number of tracked points.
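    A least-squares fit of a three-parameter model from tracked point pairs might look like the sketch below. The abstract does not spell out the parameterization, so the sketch assumes one isotropic scale `s` and a 2D translation `(tx, ty)`, that is, `dst ≈ s * src + (tx, ty)`. The function name and the sample coordinates are illustrative only.

```python
def fit_scale_translation(src, dst):
    """Least-squares fit of dst ≈ s * src + (tx, ty).

    Assumption: the three parameters are one isotropic scale and a 2D
    translation. Each point pair gives two equations, so two tracked
    points already over-determine the three unknowns.
    """
    n = len(src)
    if n < 2 or n != len(dst):
        raise ValueError("need at least two matched point pairs")

    # Centroids of the source and destination point sets.
    cx = sum(p[0] for p in src) / n
    cy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n

    # Closed-form solution of the normal equations on centred points.
    num = sum((p[0] - cx) * (q[0] - dx) + (p[1] - cy) * (q[1] - dy)
              for p, q in zip(src, dst))
    den = sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in src)
    if den == 0:
        raise ValueError("source points coincide")

    s = num / den
    return s, dx - s * cx, dy - s * cy


# Usage with two tracked points (pixel coordinates, made up for the example).
before = [(100.0, 120.0), (400.0, 300.0)]
after = [(1.2 * x - 15.0, 1.2 * y + 8.0) for x, y in before]
s, tx, ty = fit_scale_translation(before, after)
```

    Adding more correspondences only enlarges the two sums, which is consistent with the abstract's remark that precision improves with the number of tracked points.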

    Study Of Human Activity In Video Data With An Emphasis On View-invariance

    The perception and understanding of human motion and action is an important area of research in computer vision that plays a crucial role in various applications such as surveillance, HCI, and ergonomics. In this thesis, we focus on the recognition of actions in the case of varying viewpoints and different and unknown camera intrinsic parameters. The challenges to be addressed include perspective distortions, differences in viewpoints, anthropometric variations, and the large degrees of freedom of articulated bodies. In addition, we are interested in methods that require little or no training. The current solutions to action recognition usually assume that a huge dataset of actions is available so that a classifier can be trained. However, this means that in order to define a new action, the user has to record a number of videos from different viewpoints with varying camera intrinsic parameters and then retrain the classifier, which is not very practical from a development point of view. We propose algorithms that overcome these challenges and require just a few instances of the action from any viewpoint with any intrinsic camera parameters. Our first algorithm is based on the rank constraint on the family of planar homographies associated with triplets of body points. We represent an action as a sequence of poses, and decompose each pose into triplets. The pose transition is therefore broken down into a set of movements of body point planes. In this way, we transform the non-rigid motion of the body points into a rigid motion of body point planes. We use the fact that the family of homographies associated with two identical poses would have rank 4 to gauge the similarity of the pose between two subjects, observed by different perspective cameras and from different viewpoints. This method requires only one instance of the action. We then show that it is possible to extend the concept of triplets to line segments. In particular, we establish that if we look at the movement of line segments instead of triplets, we have more redundancy in the data, thus leading to better results. We demonstrate this concept on “fundamental ratios”. We decompose a human body pose into line segments instead of triplets and look at the set of movements of line segments. This method needs only three instances of the action. If a larger dataset is available, we can also apply weighting on line segments for better accuracy. The last method is based on the concept of “projective depth”. Given a plane, we can find the depth of a point relative to that plane. We propose three different ways of using “projective depth”: (i) Triplets: the three points of a triplet along with the epipole define the plane, and the movement of points relative to these body planes can be used to recognize actions; (ii) Ground plane: if we are able to extract the ground plane, we can find the “projective depth” of the body points with respect to it, so the problem of action recognition translates to curve matching; and (iii) Mirror person: we can use the mirror view of the person to extract mirror-symmetric planes. This method also needs only one instance of the action. Extensive experiments are reported on testing view invariance, robustness to noisy localization and occlusions of body points, and action recognition. The experimental results are very promising and demonstrate the efficiency of our proposed invariants.
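    The remark that action recognition reduces to curve matching can be illustrated with a small sketch. The abstract does not say which curve-matching technique is used, so the sketch swaps in dynamic time warping (DTW) as a stand-in. It assumes each action is summarized by a 1D curve (for example, a depth value per frame), and the reference names and values are made up for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1D curves."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def closest_action(query, references):
    """Name of the reference curve with the smallest DTW distance."""
    return min(references, key=lambda name: dtw_distance(query, references[name]))


# Hypothetical reference curves, one instance per action.
references = {
    "sit": [0, 1, 2, 3, 2, 1],
    "jump": [0, 3, 0, 3, 0, 3],
}
# The same "sit" motion performed more slowly at the start and the peak.
query = [0, 0, 1, 2, 3, 3, 2, 1]
label = closest_action(query, references)
```

    DTW tolerates differences in execution speed, which suits a setting where only one or a few instances of each action are available.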

    Accurate 3D shape and displacement measurement using a scanning electron microscope

    With the current development of nanotechnology, there is an increasing demand for three-dimensional shape and deformation measurements at this reduced length scale in the field of materials research. Images acquired by Scanning Electron Microscope (SEM) systems coupled with analysis by Digital Image Correlation (DIC) form an interesting combination for developing a high-magnification measurement system. However, an SEM is designed for visualization, not for metrological studies, and applying DIC at the micro- or nano-scale with such a system faces the challenges of calibrating the imaging system and correcting the spatially varying and time-varying distortions in order to obtain accurate measurements. Moreover, the SEM provides only a single sensor, so recovering 3D information is not possible with the classical stereo-vision approach. However, because the specimen is mounted on the mobile SEM stage, images can be acquired from multiple viewpoints, and 3D reconstruction is possible using the principle of videogrammetry to recover the unknown rigid-body motions undergone by the specimen. The dissertation emphasizes the new calibration methodology that has been developed, because it is a major contribution to the accuracy of 3D shape and deformation measurements at reduced length scales. It proves that, unlike in previous works, image drift and distortion must be taken into account if accurate measurements are to be made with such a system. The necessary background and theoretical knowledge for 3D shape measurement using videogrammetry and for in-plane and out-of-plane deformation measurement are presented in detail as well. To validate our work and demonstrate in particular the measurement accuracy obtained, experimental results from different applications are presented throughout the chapters. Finally, a software package gathering different computer vision applications has been developed.
    French abstract (translated): With the current development of nanotechnologies, the demand for studying the behavior of materials at micro- or nanoscopic scales keeps increasing. For measuring three-dimensional shape or deformation at these scales, acquiring images with a Scanning Electron Microscope (SEM) coupled with analysis by digital image correlation has proved to be an interesting technique. However, an SEM is a tool designed essentially for visualization, and using it for accurate three-dimensional measurements raises a number of difficulties, such as calibrating the system and correcting the strong distortions (spatial and temporal) present in the images. Moreover, the SEM has only a single sensor, and the desired three-dimensional information cannot be obtained by a classical stereo-vision approach. However, because the sample to be analyzed is mounted on a tiltable holder, images can be acquired from different viewpoints, which allows three-dimensional reconstruction using the principle of videogrammetry to recover, from the images alone, the unknown motions of the sample holder. The thesis emphasizes the new calibration and distortion-correction technique that was developed, because it is a major contribution to the accuracy of 3D shape and deformation measurement at the scales studied. It proves that, unlike in previous works, accounting for temporal drift and spatial image distortions is essential to obtain sufficient measurement accuracy. The principles enabling shape measurement by videogrammetry and the computation of 2D and 3D deformations are also presented in detail. Finally, in order to validate our work and demonstrate in particular the measurement accuracy obtained, experimental results from different applications are presented.

    View synthesis from images taken by uncalibrated stereoscopic cameras (Synthèse de vues à partir d'images prises par des caméras stéréoscopiques non calibrées)

    Geometric relations between cameras -- Projective geometry in vision -- Camera model -- Stereoscopic cameras -- Computation of the fundamental matrix -- Proposed approach: exploiting the planarity constraint -- Projective invariants -- Introduction to projective invariants -- Invariant based on a reference point and two homographies -- Invariant based on a reference line and a single homography -- View synthesis -- State of the art in view synthesis -- Proposed approach -- Techniques for transferring primitives of interest -- Two-dimensional texturing -- View synthesis algorithm -- Epipolar geometry -- Projective invariants