10 research outputs found

    Monocular object pose computation with the foveal-peripheral camera of the humanoid robot Armar-III

    Get PDF
    Active contour modelling is useful for fitting non-textured objects, and algorithms have been developed to recover the motion of an object together with its uncertainty. Here we show that these algorithms can also be used with point features matched on textured objects, and that active contours and point matches complement each other in a natural way. In the same manner, we show that depth-from-zoom algorithms, developed for zooming cameras, can also be exploited in the foveal-peripheral eye configuration of the Armar-III humanoid robot. Peer Reviewed
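
    The depth-from-zoom step can be illustrated with a toy model. Below is a minimal sketch, assuming a pinhole camera whose optical centre shifts by a known baseline along the optical axis between the two focal settings (or between the peripheral and foveal views); the function name, the model, and all parameters are illustrative, not the paper's algorithm:

        def depth_from_zoom(scale_ratio, f1, f2, baseline):
            """Recover object depth from the image-scale change between two
            focal settings, under the axial-baseline pinhole model assumed
            above (not the paper's formulation).

            scale_ratio : measured ratio of object image sizes, s2 / s1
            f1, f2      : focal lengths at the two settings (same units)
            baseline    : axial shift of the optical centre (units of Z)
            """
            # Normalise out the focal-length change: r = (s2/s1) * (f1/f2).
            r = scale_ratio * f1 / f2
            if abs(r - 1.0) < 1e-9:
                raise ValueError("no parallax: scale change is zoom alone")
            # s1 = f1*X/Z and s2 = f2*X/(Z - baseline)
            # give r = Z/(Z - baseline), hence Z = baseline * r / (r - 1).
            return baseline * r / (r - 1.0)

    Normalising by the focal-length ratio first isolates the depth-dependent part of the scale change, which is the part that carries metric information.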

    Layered motion segmentation and depth ordering by tracking edges

    Full text link

    Segmentation, recognition, and alignment of collaborative group motion

    Get PDF
    Modeling and recognition of human motion in videos has broad applications in behavioral biometrics, content-based visual data analysis, security and surveillance, as well as the design of interactive environments. Significant progress has been made in the past two decades by way of new models, methods, and implementations. In this dissertation, we focus our attention on a relatively less investigated sub-area called collaborative group motion analysis. Collaborative group motions are those that typically involve multiple objects, wherein the motion patterns of individual objects may vary significantly in both space and time, but the collective motion pattern of the ensemble allows characterization in terms of geometry and statistics. The motions or activities of an individual object therefore constitute local information; a framework that synthesizes all local information into a holistic view and explicitly characterizes interactions among objects involves large-scale global reasoning, and is of significant complexity. In this dissertation, we first review relevant previous contributions on human motion/activity modeling and recognition, and then propose several approaches to answer a sequence of traditional vision questions: 1) which motion elements among all are relevant to a group motion pattern of interest (segmentation); 2) what the underlying motion pattern is (recognition); and 3) how similar two motion ensembles are, and how one can be 'optimally' transformed to match the other (alignment). Our primary practical scenario is American football play, where the corresponding problems are 1) who the offensive players are; 2) what offensive strategy they are using; and 3) whether two plays use the same strategy, and how the spatio-temporal misalignment between them due to internal or external factors can be removed. The proposed approaches depart from traditional modeling paradigms and instead explore concise descriptors, hierarchies, stochastic mechanisms, and compact generative models to achieve both effectiveness and efficiency. In particular, the intrinsic geometry of the spaces of the involved features/descriptors/quantities is exploited, and statistical tools are established on these nonlinear manifolds. These initial attempts have identified new challenging problems in complex motion analysis, as well as in more general tasks in video dynamics. The insights gained from nonlinear geometric modeling and analysis in this dissertation may hopefully be useful toward a broader class of computer vision applications.
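
    In its simplest, flat-space form, the alignment question above (how to 'optimally' transform one motion ensemble to match another) is a Procrustes problem. A minimal numpy sketch under that simplification follows; the dissertation works on nonlinear shape manifolds, so this Euclidean version is only an analogue, and the function name is illustrative:

        import numpy as np

        def procrustes_align(A, B):
            """Least-squares alignment of ensemble B to ensemble A under
            rotation, isotropic scale and translation (Kabsch/Procrustes).
            A, B : (n, d) arrays of corresponding points or trajectory
            samples. Returns (s, R, t) with aligned B = s * B @ R.T + t."""
            muA, muB = A.mean(axis=0), B.mean(axis=0)
            A0, B0 = A - muA, B - muB
            U, S, Vt = np.linalg.svd(B0.T @ A0)      # H = B0^T A0
            D = np.eye(A.shape[1])
            D[-1, -1] = np.sign(np.linalg.det(Vt.T @ U.T))  # proper rotation
            R = Vt.T @ D @ U.T
            s = (S * np.diag(D)).sum() / (B0 ** 2).sum()
            t = muA - s * muB @ R.T
            return s, R, t

    Temporal misalignment, the other half of the problem, would need an additional time-warping step (e.g. dynamic time warping) before this spatial alignment.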

    Simultaneous 3D reconstruction, deblurring, and super-resolution using a single moving camera

    Get PDF
    Doctoral dissertation, Seoul National University, Department of Electrical and Computer Engineering, August 2013. Advisor: Kyoung Mu Lee.
    Vision-based 3D reconstruction is one of the fundamental problems in computer vision, and it has been researched intensively in recent decades. In particular, 3D reconstruction using a single camera, which has a wide range of applications such as autonomous robot navigation and augmented reality, shows great possibilities in its reconstruction accuracy, scale of reconstruction coverage, and computational efficiency. However, until recently, the performance of most algorithms has been tested only with carefully recorded, high-quality input sequences. In practical situations, input images for 3D reconstruction can be severely distorted by factors such as pixel noise and motion blur, and the resolution of the images may not be high enough to achieve accurate camera localization and scene reconstruction. Although various high-performance image enhancement methods have been proposed, their high computational cost prevents their use in 3D reconstruction systems where real-time capability is an important requirement. In this dissertation, novel single-camera 3D reconstruction methods combined with image enhancement are studied to improve the accuracy and reliability of 3D reconstruction. To this end, two critical image degradations, motion blur and low image resolution, are addressed for both sparse and dense 3D reconstruction systems, and novel integrated enhancement methods for those degradations are presented. Using the relationship between the observed images and the 3D geometry of the camera and scene, the image formation process, including image degradations, is modeled by the camera and scene geometry. By taking the image degradation factors into consideration, accurate 3D reconstruction is then achieved. Furthermore, the information required for image enhancement, such as blur kernels for deblurring and pixel correspondences for super-resolution, is obtained simultaneously while reconstructing the 3D scene, which makes the image enhancement much simpler and faster. The proposed methods have the advantage that the results of 3D reconstruction and image enhancement improve each other when the two problems are solved simultaneously. Experimental evaluations demonstrate the effectiveness of the proposed 3D reconstruction and image enhancement methods.
    Contents: 1. Introduction; 2. Sparse 3D Reconstruction and Image Deblurring; 3. Sparse 3D Reconstruction and Image Super-Resolution; 4. Dense 3D Reconstruction and Image Deblurring; 5. Dense 3D Reconstruction and Image Super-Resolution; 6. Dense 3D Reconstruction, Image Deblurring, and Super-Resolution; 7. Conclusion.
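
    One concrete way to read the coupling described above is that, once camera poses over the exposure time are known from reconstruction, the motion-blur kernel comes almost for free. A minimal sketch under a pinhole model with a single representative scene point; the function name and discretisation are illustrative, not the dissertation's method:

        import numpy as np

        def blur_kernel_from_motion(K, poses, X, ksize=15):
            """Accumulate the image trail of a 3D point X, projected through
            the camera poses sampled across the exposure, into a blur kernel.
            K     : (3, 3) intrinsic matrix
            poses : list of (R, t) world-to-camera poses over the exposure
            X     : (3,) scene point in world coordinates"""
            uv = []
            for R, t in poses:
                x = K @ (R @ X + t)              # pinhole projection
                uv.append(x[:2] / x[2])
            uv = np.array(uv)
            uv -= uv.mean(axis=0)                # centre the trail
            kernel = np.zeros((ksize, ksize))
            c = ksize // 2
            for u, v in uv:                      # nearest-pixel rasterisation
                iu, iv = int(round(u)) + c, int(round(v)) + c
                if 0 <= iu < ksize and 0 <= iv < ksize:
                    kernel[iv, iu] += 1.0
            return kernel / max(kernel.sum(), 1e-12)  # normalise to unit mass

    The nearest-pixel rasterisation is the crudest choice; a practical system would integrate the trail with sub-pixel weights.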

    Robot motion estimation using active contours

    Get PDF
    This thesis deals with the estimation of the motion of a mobile robot from the changes in the images acquired by a camera mounted on the robot itself. The motion is deduced with an algorithm previously proposed in the framework of qualitative navigation. In order to employ this algorithm in real situations, a study of its accuracy has been performed; moreover, its relationship with the active vision paradigm has been analyzed, leading to an increase in its applicability. When perspective effects are not significant, two views of a scene are related by an affine transformation (or affinity), which is usually computed from point correspondences. In this thesis we explore an alternative and, at the same time, complementary approach, using the contour of an object modeled by means of an active contour. The framework is the following: as the robot moves, the projection of the object in the image changes and the active contour deforms to adapt to it; from the deformations of this contour, expressed in shape space, the robot egomotion can be extracted up to a scale factor. Active contours are characterized by the speed of their extraction and their robustness to partial occlusions. Moreover, a contour is easy to find even in poorly textured scenes, where it is often difficult to find point features and their correspondences.
    The goal of the first part of this work is to characterize the accuracy and the uncertainty of the motion estimation. To evaluate the accuracy, some practical experiments are first carried out, showing the potential of the algorithm in real environments and with different robots. By studying the epipolar geometry relating two views of a planar object, we prove that the affine epipolar direction can be recovered from a shape vector when the camera motion is free of cyclorotation. With a battery of simulated as well as real experiments, the epipolar direction allows us to analyze the global accuracy of the affinity in a variety of situations: different contour shapes, extreme visualization conditions, and the presence of noise. Regarding uncertainty, since the implementation is based on a Kalman filter, each motion estimate comes with its covariance matrix expressed in shape space. In order to propagate the uncertainty from shape space to 3D motion space, two different approaches have been followed: an analytical one and a statistical one. This study has allowed us to determine which degrees of freedom are recovered with more accuracy, and what correlations exist between the different motion components. Finally, an algorithm to propagate the motion uncertainty at video rate has been proposed.
    One of the most important limitations of this methodology is that the object must project onto the image under weak-perspective visualization conditions all along the sequence. In the second part of this work, active contour tracking is studied within the framework of active vision to overcome this limitation. The two relate naturally, as active contour tracking can be seen as a focus-of-attention strategy. First, the properties of zooming cameras are studied and a new algorithm is proposed to estimate the depth of the camera with respect to an object. The algorithm includes a simple geometric calibration that does not require any knowledge of the camera's internal parameters. Finally, in order to orient the camera so as to suitably compensate for robot motion when possible, a new algorithm has been proposed for the control of the zoom, pan, and tilt mechanisms, and the motion estimation algorithm has been extended to incorporate the known pan and tilt rotations.
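
    The two uncertainty-propagation routes described above reduce, in their simplest forms, to a first-order Jacobian push-through (analytical) and Monte Carlo sampling (statistical). A hedged sketch follows, with `f` standing in for the thesis's shape-to-motion map; the function names are illustrative:

        import numpy as np

        def propagate_cov_analytic(J, cov_shape):
            """First-order propagation: Sigma_motion = J Sigma_shape J^T,
            with J the Jacobian of the shape-to-motion map at the estimate."""
            return J @ cov_shape @ J.T

        def propagate_cov_statistical(f, mean_shape, cov_shape, n=5000, rng=None):
            """Monte Carlo propagation: sample shape vectors from the Kalman
            filter's shape-space Gaussian, push each through f, and take the
            sample covariance of the resulting motion vectors."""
            rng = rng or np.random.default_rng(0)
            samples = rng.multivariate_normal(mean_shape, cov_shape, size=n)
            motions = np.array([f(s) for s in samples])
            return np.cov(motions, rowvar=False)

    Comparing the two on the same estimate shows where the first-order approximation breaks down, which is one way to identify the well- and poorly-recovered degrees of freedom.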

    Template reduction of feature point models for rigid objects and application to tracking in microscope images.

    Get PDF
    This thesis addresses the problem of tracking rigid objects in video sequences. A novel approach to reducing the template size of shapes is presented. The reduced shape template can be used to enhance the performance of tracking, detection, and recognition algorithms. The main idea consists of pre-calculating all possible positions and orientations that a shape can undergo for a given state space. From these states, it is possible to extract a set of points that uniquely and robustly characterises the shape for the considered state space. An algorithm based on the Hough transform has been developed to achieve this for discrete shapes, i.e. sets of points, projected into an image when the state space is bounded. An extended discussion on particle filters, which serves as an introduction to the topic, is presented, along with some generic improvements. These improvements allow the data to be better sampled by incorporating additional measurements and knowledge about the velocity of the tracked object. A partial re-initialisation scheme is also presented that enables faster recovery of the system when the object is temporarily occluded. A stencil estimator is introduced to identify the position of an object in an image, and some of its properties are discussed and demonstrated. The estimator can be efficiently evaluated using the bounded Hough transform algorithm. The performance of the stencilled Hough transform can be further enhanced with a methodology that decimates the stencils while maintaining the robustness of the tracker. Performance evaluations have demonstrated the relevance of the approach. Although the methods presented in this thesis could be adapted to full 3-D object motion, motions that maintain the same view of the object in front of the camera are more specifically studied.
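
    The pre-calculation idea above can be made concrete with a small sketch: enumerate every pose the point set can take in a bounded state space, then score each stencil against a binary feature image, Hough-style. The names, the 2D rigid state space, and the scoring are illustrative simplifications, not the thesis's bounded Hough transform:

        import numpy as np

        def precompute_stencils(shape_pts, angles, translations):
            """Pre-calculate the point set for every (angle, translation)
            state in a bounded 2D state space. shape_pts: (n, 2) array;
            angles: iterable of radians; translations: iterable of (tx, ty)."""
            stencils = {}
            for a in angles:
                R = np.array([[np.cos(a), -np.sin(a)],
                              [np.sin(a),  np.cos(a)]])
                rotated = shape_pts @ R.T
                for t in translations:
                    stencils[(a, tuple(t))] = np.round(rotated + t).astype(int)
            return stencils

        def best_state(stencils, feature_img):
            """Vote each stencil against a binary feature image and return
            the state with the highest support."""
            h, w = feature_img.shape
            def score(pts):
                ok = ((pts[:, 0] >= 0) & (pts[:, 0] < w)
                      & (pts[:, 1] >= 0) & (pts[:, 1] < h))
                p = pts[ok]
                return feature_img[p[:, 1], p[:, 0]].sum()
            return max(stencils.items(), key=lambda kv: score(kv[1]))[0]

    Template reduction then amounts to keeping only the subset of points whose votes still discriminate between states, which is what shrinks the per-frame cost.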

    Vision-based control for vertical take-off and landing UAVs

    Get PDF
    The miniaturization of computers has paved the way for unmanned aerial vehicles (UAVs): flying vehicles with embedded computers that make them partially or fully autonomous, for missions such as exploring cluttered environments or replacing humanly piloted vehicles in hazardous or tedious tasks. A key challenge in the design of such vehicles is the information they need in order to move, and thus the sensors to be used to obtain that information. A number of such sensors have drawbacks (in particular, the risk of being jammed or masked). In this context, the use of a video camera offers an interesting perspective. The goal of this PhD work was to study the use of such a camera in a minimal-sensor setting: essentially the use of visual and inertial data. The work focused on the development of control laws giving the closed-loop system stability and robustness properties. In particular, one of the major difficulties faced comes from the very limited knowledge of the environment in which the UAV operates. The thesis first studied the stabilization of the UAV under a small-displacements assumption (linearity assumption); a control law was defined that takes performance criteria into account. Second, it showed how the small-displacements assumption can be relaxed through nonlinear control design. The case of trajectory following was then considered, based on a generic model of the position error measured with respect to an unknown reference point. Finally, the experimental validation of these results was begun during the thesis and allowed validating a number of the steps and challenges associated with their implementation in real conditions. The thesis concludes with perspectives for future work.
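
    The progression described above, from a linear design valid for small displacements to a nonlinear law that remains valid for large ones, can be caricatured in a few lines. This is a hedged sketch only: a saturated PD law on the vision-based position error (measured relative to the fixed but unknown reference point) with inertial velocity damping; the gains, the saturation level, and the function names are illustrative, not the thesis's control laws:

        import numpy as np

        def sat(x, m):
            """Component-wise saturation, a classical device for extending
            a linear design beyond the small-displacement regime."""
            return np.clip(x, -m, m)

        def control(err_pos, err_vel, kp=1.2, kd=0.8, a_max=2.0):
            """PD feedback on the visual position error with inertial
            velocity damping, saturated so the command stays bounded."""
            u = -kp * np.asarray(err_pos, float) - kd * np.asarray(err_vel, float)
            return sat(u, a_max)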

    Application of Lie algebras to visual servoing

    No full text