
    Dense Piecewise Planar RGB-D SLAM for Indoor Environments

    The paper exploits weak Manhattan constraints to parse the structure of indoor environments from RGB-D video sequences in an online setting. We extend a previous approach for single-view parsing of indoor scenes to video sequences and formulate the recovery of the floor plan of the environment as an optimal labeling problem solved using dynamic programming. Temporal continuity is enforced in a recursive setting, where the labeling from previous frames is used as a prior term in the objective function. In addition to recovering the piecewise planar weak Manhattan structure of the extended environment, the orthogonality constraints are also exploited by visual odometry and pose graph optimization. This yields reliable estimates in the presence of large motions and in the absence of distinctive features to track. We evaluate our method on several challenging indoor sequences, demonstrating accurate SLAM and dense mapping of low-texture environments. On the existing TUM benchmark we achieve results competitive with alternative approaches, which fail in our environments. Comment: International Conference on Intelligent Robots and Systems (IROS) 201
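The recursive labeling step described above can be sketched as a Viterbi-style dynamic program over image columns: a unary cost per label, a Potts smoothness term between neighbouring columns, and a soft prior pulling towards the previous frame's labeling. The function name, cost weights, and label encoding below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def label_columns(unary, prior, smooth_w=1.0, prior_w=0.5):
    """Viterbi-style DP over image columns.

    unary : (N, K) per-column cost of each of K labels
    prior : (N,) labeling from the previous frame (soft prior term)
    """
    n, k = unary.shape
    # prior term: penalize disagreeing with the previous frame's labels
    cost = unary + prior_w * (np.arange(k)[None, :] != prior[:, None])
    # Potts smoothness between neighbouring columns
    pairwise = smooth_w * (1.0 - np.eye(k))
    total = cost[0].copy()
    back = np.zeros((n, k), dtype=int)
    for i in range(1, n):
        trans = total[:, None] + pairwise          # (K, K) transition costs
        back[i] = trans.argmin(axis=0)
        total = trans.min(axis=0) + cost[i]
    labels = np.empty(n, dtype=int)
    labels[-1] = total.argmin()
    for i in range(n - 1, 0, -1):                  # backtrack optimal path
        labels[i - 1] = back[i, labels[i]]
    return labels
```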

    On plane-based camera calibration: a general algorithm, singularities, applications

    We present a general algorithm for plane-based calibration that can deal with arbitrary numbers of views and calibration planes. The algorithm can simultaneously calibrate different views from a camera with variable intrinsic parameters, and it is easy to incorporate known values of intrinsic parameters. For some minimal cases, we describe all singularities, naming the parameters that cannot be estimated. Experimental results are shown that exhibit the singularities and demonstrate good performance in non-singular conditions. Several applications of plane-based 3D geometry inference are discussed as well.
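A minimal sketch of the plane-based calibration principle, under strong simplifying assumptions (zero skew, unit aspect ratio, principal point already subtracted): each plane-to-image homography H = K[r1 r2 t] gives the orthogonality constraint h1ᵀωh2 = 0, which in this reduced case determines the focal length from a single view. The helper name is hypothetical and this is not the paper's general multi-view algorithm.

```python
import numpy as np

def focal_from_homography(H):
    """Focal length from one plane-view homography.

    Assumes omega = diag(1/f^2, 1/f^2, 1), so the single constraint
    h1^T omega h2 = 0 yields f directly.
    """
    h1, h2 = H[:, 0], H[:, 1]
    num = h1[0] * h2[0] + h1[1] * h2[1]
    den = h1[2] * h2[2]
    if abs(den) < 1e-12:
        # e.g. a fronto-parallel plane: one of the singular cases
        raise ValueError("singular configuration")
    f2 = -num / den
    if f2 <= 0:
        raise ValueError("inconsistent homography")
    return float(np.sqrt(f2))
```

In the general algorithm such constraints from many views and planes are stacked and solved jointly; the singular cases are exactly the configurations where the stacked system loses rank.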

    An investigation into common challenges of 3D scene understanding in visual surveillance

    Nowadays, video surveillance systems are ubiquitous. Most installations simply consist of CCTV cameras connected to a central control room and rely on human operators to interpret what they see on the screen in order to, for example, detect a crime (either during or after an event). Some modern computer vision systems aim to automate the process, at least to some degree, and various algorithms have been somewhat successful in certain limited areas. However, such systems remain inefficient in general circumstances and present real challenges yet to be solved. These challenges include the ability to recognise, and ultimately predict and prevent, abnormal behaviour, or even to reliably recognise objects, for example in order to detect left luggage or suspicious objects. This thesis first aims to study the state of the art and identify the major challenges and possible requirements of future automated and semi-automated CCTV technology in the field. It then presents the application of a suite of 2D and highly novel 3D methodologies that go some way towards overcoming current limitations. The methods presented here are based on the analysis of object features directly extracted from the geometry of the scene. They start with a consideration of mainly existing techniques, such as the use of lines, vanishing points (VPs) and planes, applied to real scenes; an investigation is then presented into the use of richer 2.5D/3D surface normal data. In all cases the aim is to combine 2D and 3D data to obtain a better understanding of the scene, ultimately capturing what is happening within it in order to move towards automated scene analysis. Although this thesis focuses on the widespread application of video surveillance, the railway station environment is used as an example case representing typical real-world challenges; the principles can be readily extended elsewhere, such as to airports, motorways, homes, shopping malls, etc.
The context of this research work, together with an overall presentation of existing methods used in video surveillance and their challenges, is described in chapter 1. Common computer vision techniques, such as VP detection, camera calibration, 3D reconstruction and segmentation, can be applied in an effort to extract meaning in video surveillance applications. According to the literature, these methods have been well researched, and their use is assessed in the context of current surveillance requirements in chapter 2. While existing techniques can perform well in some contexts, such as an architectural environment composed of simple geometrical elements, their robustness and performance in feature extraction and object recognition tasks are not sufficient to solve the key challenges encountered in a general video surveillance context. This is largely due to issues such as variable lighting, weather conditions, shadows, and the general complexity of real-world environments. Chapter 3 presents the research and contribution on those topics – methods to extract optimal features for a specific CCTV application – together with their strengths and weaknesses, highlighting that the proposed algorithm obtains better results than most due to its specific design. The comparison of current surveillance systems and methods from the literature shows that 2D data are nevertheless used almost universally. Indeed, both industrial systems and the research community have been intensively improving 2D feature extraction methods for as long as image analysis and scene understanding have been of interest. This constant progress makes 2D feature extraction almost effortless nowadays, thanks to a large variety of techniques. Moreover, even if 2D data do not solve all challenges in video surveillance or other applications, they are still used as a starting stage towards scene understanding and image analysis.
Chapter 4 then explores 2D feature extraction via vanishing point detection and segmentation methods. A combination of the most common techniques and a novel approach is proposed to extract vanishing points from video surveillance environments. Moreover, segmentation techniques are explored with the aim of determining how they can complement vanishing point detection and lead towards 3D data extraction and analysis. In spite of the contribution above, 2D data are insufficient for all but the simplest applications aimed at understanding a scene, where the aim is a robust detection of, say, left luggage or abnormal behaviour without significant a priori information about the scene geometry. Therefore, more information is required in order to design a more automated and intelligent algorithm that obtains richer information from the scene geometry, and so a better understanding of what is happening within it. This can be achieved by the use of 3D data (in addition to 2D data), allowing objects to be classified and, from this, a map of functionality to be inferred, describing feasible and unfeasible object functionality in a given environment. Chapter 5 presents how 3D data can be beneficial for this task, the various solutions investigated to recover 3D data, and some preliminary work towards plane extraction. It is apparent that VPs and planes give useful information about a scene's perspective and can assist in 3D data recovery within a scene. However, neither VPs nor plane detection techniques alone allow the recovery of more complex generic object shapes (for example, those composed of spheres, cylinders, etc.), and any simple model will suffer in the presence of non-Manhattan features, e.g. those introduced by the presence of an escalator. For this reason, a novel photometric stereo-based surface normal retrieval methodology is introduced to capture the 3D geometry of the whole scene or part of it.
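The vanishing point extraction discussed above rests on a standard projective construction: image line segments are represented as homogeneous lines, and a bundle of lines sharing a scene direction meets (in the least-squares sense) at one vanishing point. A minimal sketch of that construction, with hypothetical helper names, not the thesis's specific detector:

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(lines):
    """Least-squares intersection of a bundle of homogeneous lines.

    Solves argmin ||L v|| subject to ||v|| = 1 via SVD; parallel image
    lines give a point at infinity (last coordinate near zero).
    """
    L = np.asarray(lines, dtype=float)
    _, _, vt = np.linalg.svd(L)
    v = vt[-1]
    return v / v[2] if abs(v[2]) > 1e-9 else v
```

Robust detectors typically wrap such a solver in RANSAC to cluster segments by direction before intersecting them.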
Chapter 6 describes how photometric stereo allows recovery of 3D information in order to obtain a better understanding of a scene, while also partially overcoming some current surveillance challenges, such as difficulty in resolving fine detail, particularly at large standoff distances, and in isolating and recognising more complex objects in real scenes. Here, items of interest may be obscured by complex environmental factors that are subject to rapid change, making, for example, the detection of suspicious objects and behaviour highly problematic. Innovative use is made of an untapped latent capability offered within modern surveillance environments, introducing a form of environmental structuring to good advantage in order to achieve a richer form of data acquisition. This chapter also explores the novel application of photometric stereo in such diverse applications, shows how our algorithm can be incorporated into an existing surveillance system, and considers a typical real commercial application. One of the most important aspects of this research work is its application. Indeed, while most of the research literature has been based on relatively simple structured environments, the approach here has been designed for real surveillance environments, such as railway stations, airports and waiting rooms, where surveillance cameras may be fixed or may in the future form part of a free-roaming mobile robotic surveillance device that must continually reinterpret its changing environment. So, as mentioned previously, while the main focus has been to apply this algorithm to railway station environments, the work has been approached in a way that allows adaptation to many other applications, such as autonomous robotics, and to motorway, shopping centre, street and home environments. All of these applications require a better understanding of the scene for security or safety purposes.
Finally, chapter 7 presents a global conclusion and an outline of future work.

    Coupling Vanishing Point Tracking with Inertial Navigation to Estimate Attitude in a Structured Environment

    This research aims to obtain accurate and stable estimates of a vehicle's attitude by coupling consumer-grade inertial and optical sensors. This goal is pursued by first modeling both inertial and optical sensors and then developing a technique for identifying vanishing points in perspective images of a structured environment. The inertial and optical processes are then coupled so that each one aids the other. The vanishing point measurements are combined with the inertial data in an extended Kalman filter to produce overall attitude estimates. This technique is experimentally demonstrated in an indoor corridor setting using a motion profile designed to simulate flight. Through comparison with a tactical-grade inertial sensor, the combined consumer-grade inertial and optical data are shown to produce a stable attitude solution accurate to within 1.5 degrees. A measurement bias is manifested which degrades the accuracy by up to another 2.5 degrees.
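The inertial/optical coupling described above can be illustrated with a deliberately reduced filter: a single heading state propagated by a rate gyro and corrected whenever a vanishing-point-derived heading is available. The state layout, noise values, and function name are illustrative assumptions; the actual system estimates full attitude in an extended Kalman filter.

```python
import numpy as np

def ekf_heading(gyro_z, vp_heading, dt=0.01, q=1e-4, r=4e-4):
    """1-state Kalman filter fusing a rate gyro with VP-derived heading.

    gyro_z     : angular-rate samples (rad/s)
    vp_heading : heading measurements from a tracked vanishing point (rad);
                 np.nan where no VP was detected in the frame
    """
    x, p = 0.0, 1.0
    out = []
    for w, z in zip(gyro_z, vp_heading):
        # propagate with the inertial sensor
        x += w * dt
        p += q
        # correct with the optical measurement when available
        if not np.isnan(z):
            k = p / (p + r)          # Kalman gain
            x += k * (z - x)
            p *= (1.0 - k)
        out.append(x)
    return np.array(out)
```

The optical measurements bound the drift that pure gyro integration accumulates, which is the essence of the coupling.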

    Towards A Self-calibrating Video Camera Network For Content Analysis And Forensics

    Due to growing security concerns, video surveillance and monitoring has received immense attention from both federal agencies and private firms. The main concern is that a single camera, even if allowed to rotate or translate, is not sufficient to cover a large area for video surveillance. A more general solution with a wide range of applications is to allow the deployed cameras to have non-overlapping fields of view (FoV) and, if possible, to allow these cameras to move freely in 3D space. This thesis addresses the issue of how cameras in such a network can be calibrated and how the network as a whole can be calibrated, such that each camera as a unit in the network is aware of its orientation with respect to all the other cameras in the network. Different types of cameras might be present in a multiple-camera network, and novel techniques are presented for efficient calibration of these cameras. Specifically: (i) For a stationary camera, we derive new constraints on the Image of the Absolute Conic (IAC), which are shown to be intrinsic to the IAC; (ii) For a scene where object shadows are cast on a ground plane, we track the shadows cast by at least two unknown stationary points and use the tracked shadow positions to compute the horizon line, and hence the camera intrinsic and extrinsic parameters; (iii) A novel solution is presented for the scenario where a camera observes pedestrians, the uniqueness of the formulation lying in recognizing two harmonic homologies present in the resulting geometry; (iv) For a freely moving camera, a novel practical method is proposed for self-calibration that even allows the camera to change its internal parameters by zooming; and (v) Due to the increased deployment of pan-tilt-zoom (PTZ) cameras, a technique is presented that uses only two images to estimate five camera parameters.
    For an automatically configurable multi-camera network with non-overlapping fields of view and possibly containing moving cameras, a practical framework is proposed that determines the geometry of such a dynamic camera network. It is shown that only one automatically computed vanishing point and a line lying on any plane orthogonal to the vertical direction are sufficient to infer the geometry of a dynamic network. Our method generalizes previous work, which considers restricted camera motions. Using minimal assumptions, we demonstrate promising results on synthetic as well as real data. Applications to path modeling, GPS coordinate estimation, and configuring mixed-reality environments are explored.
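Vanishing-point constraints of the kind used throughout this thesis can be illustrated with a textbook construction (not the thesis's one-VP-plus-line method): two vanishing points of orthogonal scene directions constrain the focal length when skew is zero, pixels are square, and the principal point is known. The helper name is an assumption for illustration.

```python
import numpy as np

def focal_from_orthogonal_vps(v1, v2, pp=(0.0, 0.0)):
    """Focal length from vanishing points of two orthogonal directions.

    Orthogonality of the back-projected rays gives
    (v1 - pp).(v2 - pp) + f^2 = 0.
    """
    d = -((v1[0] - pp[0]) * (v2[0] - pp[0]) +
          (v1[1] - pp[1]) * (v2[1] - pp[1]))
    if d <= 0:
        raise ValueError("VP pair inconsistent with orthogonal directions")
    return float(np.sqrt(d))
```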

    Sensing dynamic displacements in masonry rail bridges using 2D digital image correlation

    Dynamic displacement measurements provide useful information for the assessment of masonry rail bridges, which constitute a significant part of the bridge stock in the UK and Europe. Commercial 2D Digital Image Correlation (DIC) techniques are well suited for this purpose. These systems provide precise non-contact displacement measurements simultaneously at many locations on the bridge with an easily configured camera setup. However, various sources of error can affect the resolution, repeatability and accuracy of DIC field measurements. Typically, these errors are application-specific and are not automatically corrected by commercial software. To address this limitation, this paper presents a survey of relevant DIC errors and discusses methods to minimise their influence during equipment setup and data processing. A case study application of DIC for multi-point displacement measurement of a masonry viaduct in Leeds is then described, where potential errors due to lighting changes, image texture and camera movements are minimised with an appropriate setup. Pixel-to-metric scaling errors are kept to a minimum with the use of a calibration method that utilises vanishing points in the image. However, comparisons of DIC relative displacement measurements with complementary strain measurements from the bridge demonstrate that other errors may have a significant influence on the DIC measurement accuracy. Therefore, the influence of measurement errors due to lens radial distortion and out-of-plane movements is quantified theoretically with pinhole camera and division distortion models. A method to correct for errors due to potential out-of-plane movements is then proposed.
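The division distortion model mentioned above is a one-parameter radial model in which an undistorted point is obtained by dividing the (centred) distorted coordinates by 1 + k·r², where r is the distorted radius. A minimal sketch, with the function name and parameter defaults as illustrative assumptions:

```python
def undistort_division(x, y, k, cx=0.0, cy=0.0):
    """One-parameter division model: x_u = x_d / (1 + k * r_d^2).

    (x, y)   : distorted pixel coordinates
    k        : distortion coefficient (k < 0 barrel, k > 0 pincushion)
    (cx, cy) : distortion centre
    """
    xd, yd = x - cx, y - cy
    r2 = xd * xd + yd * yd
    s = 1.0 + k * r2
    return cx + xd / s, cy + yd / s
```

A convenient property of this model, relevant to the paper's theoretical error analysis, is that it captures strong distortion with a single coefficient estimated from straight-line constraints.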

    Calibration and Sensitivity Analysis of a Stereo Vision-Based Driver Assistance System

    At http://intechweb.org/, under the "Books" tab, search for the title "Stereo Vision" and see Chapter 1.

    Gait analysis, modelling, and comparison from unconstrained walks and viewpoints : view-rectification of body-part trajectories from monocular video sequences

    Gait analysis, modelling and comparison using computer vision algorithms has recently attracted much attention for medical and surveillance applications. Analyzing and modelling a person's gait with computer vision algorithms has some interesting advantages over more traditional biometrics. For instance, gait can be analyzed and modelled at a distance by observing the person with a camera, which means that no markers or sensors have to be worn by the person. Moreover, gait analysis and modelling using computer vision algorithms does not require the cooperation of the observed people, which thus allows for using gait as a biometric in surveillance applications. Current gait analysis and modelling approaches have, however, severe limitations. For instance, several approaches require a side view of the walks, since this viewpoint is optimal for gait analysis and modelling. Most approaches also require the walks to be observed far enough from the camera to avoid perspective distortion effects that would badly affect the resulting gait analyses and models. Moreover, current approaches do not allow for changes in walk direction and walking speed, which greatly constrains the walks that can be analyzed and modelled in medical and surveillance applications. The approach proposed in this thesis performs gait analysis, modelling and comparison from unconstrained walks and viewpoints in medical and surveillance applications. It mainly consists of a novel view-rectification method that generates a fronto-parallel viewpoint (side view) of the imaged trajectories of body parts.
The view-rectification method is based on a novel walk model that uses projective geometry to provide the spatio-temporal links between the body-part positions in the scene and their corresponding positions in the images. The head and the feet are the only body parts relevant to the proposed approach. They are automatically localized and tracked in monocular video sequences using a novel body-part tracking algorithm. Gait analysis is performed by a novel method that extracts standard gait measurements from the view-rectified body-part trajectories. A novel gait model based on body-part trajectories is also proposed in order to perform gait modelling and comparison using the dynamics of the gait. The proposed approach is first validated using synthetic walks comprising different viewpoints and changes in walk direction. The validation results show that the proposed view-rectification method works well; that is, valid gait measurements can be extracted from the view-rectified body-part trajectories. Next, gait analysis, modelling and comparison are performed on real walks acquired as part of this thesis. These walks are challenging since they were performed close to the camera and contain changes in walk direction and walking speed. The results first show that the obtained gait measurements are realistic and correspond to the gait measurements found in references on clinical gait analysis. The gait comparison results then show that the proposed approach can be used to perform gait modelling and comparison in the context of surveillance applications by recognizing people by their gait. The computed recognition rates are quite good considering the challenging walks used in this thesis.
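The core operation of any such view-rectification is applying a plane-to-plane homography to the tracked image points, mapping them into the fronto-parallel (side-view) frame. A minimal sketch of that mapping, with the function name as an illustrative assumption (the thesis's walk model additionally supplies the spatio-temporal constraints that determine H):

```python
import numpy as np

def rectify_points(H, pts):
    """Map imaged trajectory points into a fronto-parallel view.

    H   : 3x3 homography from the image plane to the rectified walking plane
    pts : (N, 2) image points, e.g. tracked head/feet positions
    """
    p = np.column_stack([pts, np.ones(len(pts))])   # homogeneous coords
    q = p @ np.asarray(H, dtype=float).T
    return q[:, :2] / q[:, 2:3]                     # dehomogenize
```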

    Circular motion geometry using minimal data


    3D Reconstruction of Indoor Corridor Models Using Single Imagery and Video Sequences

    In recent years, 3D indoor modeling has gained attention due to its role in the decision-making processes of maintaining and securing building indoor spaces. In this thesis, the problem of continuous indoor corridor space modeling is tackled through two approaches. The first develops a modeling method based on middle-level perceptual organization. The second develops a visual Simultaneous Localisation and Mapping (SLAM) system with model-based loop closure. In the first approach, the image space is searched for a corridor layout that can be converted into a geometrically accurate 3D model. The Manhattan-world assumption is adopted, and indoor corridor layout hypotheses are generated through a random rule-based intersection of physical image line segments and virtual rays of orthogonal vanishing points. Volumetric reasoning, correspondences to physical edges, the orientation map and the geometric context of an image are all considered when scoring layout hypotheses. This approach provides physically plausible solutions in the presence of objects or occlusions in a corridor scene. In the second approach, Layout SLAM is introduced. Layout SLAM performs camera localization while mapping layout corners and normal point features in 3D space. Here, a new feature-matching cost function is proposed that considers both local and global context information. In addition, a rotation compensation variable makes Layout SLAM robust against accumulated camera orientation errors. Moreover, layout model matching of keyframes ensures accurate loop closures that prevent mis-association of newly visited landmarks with previously visited scene parts. The comparison of the generated single-image-based 3D models to ground truth models showed that the average ratio differences in widths, heights and lengths were 1.8%, 3.7% and 19.2%, respectively.
    Moreover, Layout SLAM achieved a maximum absolute trajectory error of 2.4 m in position and 8.2 degrees in orientation for an approximately 318 m path on the RAWSEEDS data set. Loop closing was performed robustly by Layout SLAM and provided 3D indoor corridor layouts with less than 1.05 m displacement error in length and less than 20 cm in width and height for an approximately 315 m path on the York University data set. The proposed methods can successfully generate 3D indoor corridor models compared to their major counterparts.
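The absolute trajectory error quoted above is typically reported as an RMSE between aligned estimated and ground-truth positions. A minimal sketch of that metric; for brevity only the translational offset is removed here, whereas a full evaluation would also solve for the best rigid alignment (e.g. Horn's method). The function name is an illustrative assumption.

```python
import numpy as np

def ate_rmse(est, gt):
    """Absolute trajectory error (RMSE) after removing the mean offset.

    est, gt : (N, 2) or (N, 3) time-aligned position sequences
    """
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # centre both trajectories, then measure per-pose residuals
    d = (est - est.mean(axis=0)) - (gt - gt.mean(axis=0))
    return float(np.sqrt((d ** 2).sum(axis=1).mean()))
```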