4,729 research outputs found

    Online Structured Learning for Real-Time Computer Vision Gaming Applications

    Get PDF
    In recent years computer vision has played an increasingly important role in the development of computer games, and it now features as one of the core technologies for many gaming platforms. The work in this thesis addresses three problems in real-time computer vision, all of which are motivated by their potential application to computer games. We rst present an approach for real-time 2D tracking of arbitrary objects. In common with recent research in this area we incorporate online learning to provide an appearance model which is able to adapt to the target object and its surrounding background during tracking. However, our approach moves beyond the standard framework of tracking using binary classication and instead integrates tracking and learning in a more principled way through the use of structured learning. As well as providing a more powerful framework for adaptive visual object tracking, our approach also outperforms state-of-the-art tracking algorithms on standard datasets. Next we consider the task of keypoint-based object tracking. We take the traditional pipeline of matching keypoints followed by geometric verication and show how this can be embedded into a structured learning framework in order to provide principled adaptivity to a given environment. We also propose an approximation method allowing us to take advantage of recently developed binary image descriptors, meaning our approach is suitable for real-time application even on low-powered portable devices. Experimentally, we clearly see the benet that online adaptation using structured learning can bring to this problem. Finally, we present an approach for approximately recovering the dense 3D structure of a scene which has been mapped by a simultaneous localisation and mapping system. Our approach is guided by the constraints of the low-powered portable hardware we are targeting, and we develop a system which coarsely models the scene using a small number of planes. To achieve this, we frame the task as a structured prediction problem and introduce online learning into our approach to provide adaptivity to a given scene. This allows us to use relatively simple multi-view information coupled with online learning of appearance to efficiently produce coarse reconstructions of a scene

    C-LOG: A Chamfer Distance based method for localisation in occupancy grid-maps

    Full text link
    In this paper, the problem of localising a robot within a known two-dimensional environment is formulated as one of minimising the Chamfer Distance between the corresponding occupancy grid map and information gathered from a sensor such as a laser range finder. It is shown that this nonlinear optimisation problem can be solved efficiently and that the resulting localisation algorithm has a number of attractive characteristics when compared with the conventional particle filter based solution for robot localisation in occupancy grids. The proposed algorithm is able to perform well even when robot odometry is unavailable, insensitive to noise models and does not critically depend on any tuning parameters. Experimental results based on a number of public domain datasets as well as data collected by the authors are used to demonstrate the effectiveness of the proposed algorithm. © 2013 IEEE

    An Investigation and Application of Biology and Bioinformatics for Activity Recognition

    Get PDF
    Activity recognition in a smart home context is inherently difficult due to the variable nature of human activities and tracking artifacts introduced by video-based tracking systems. This thesis addresses the activity recognition problem via introducing a biologically-inspired chemotactic approach and bioinformatics-inspired sequence alignment techniques to recognise spatial activities. The approaches are demonstrated in real world conditions to improve robustness and recognise activities in the presence of innate activity variability and tracking noise

    Diffeomorphic registration using geodesic shooting and Gauss-Newton optimisation

    Get PDF
    This paper presents a nonlinear image registration algorithm based on the setting of Large Deformation Diffeomorphic Metric Mapping (LDDMM). but with a more efficient optimisation scheme - both in terms of memory required and the number of iterations required to reach convergence. Rather than perform a variational optimisation on a series of velocity fields, the algorithm is formulated to use a geodesic shooting procedure, so that only an initial velocity is estimated. A Gauss-Newton optimisation strategy is used to achieve faster convergence. The algorithm was evaluated using freely available manually labelled datasets, and found to compare favourably with other inter-subject registration algorithms evaluated using the same data. (C) 2011 Elsevier Inc. All rights reserved

    Linear Regression and Unsupervised Learning For Tracking and Embodied Robot Control.

    Get PDF
    Computer vision problems, such as tracking and robot navigation, tend to be solved using models of the objects of interest to the problem. These models are often either hard-coded, or learned in a supervised manner. In either case, an engineer is required to identify the visual information that is important to the task, which is both time consuming and problematic. Issues with these engineered systems relate to the ungrounded nature of the knowledge imparted by the engineer, where the systems have no meaning attached to the representations. This leads to systems that are brittle and are prone to failure when expected to act in environments not envisaged by the engineer. The work presented in this thesis removes the need for hard-coded or engineered models of either visual information representations or behaviour. This is achieved by developing novel approaches for learning from example, in both input (percept) and output (action) spaces. This approach leads to the development of novel feature tracking algorithms, and methods for robot control. Applying this approach to feature tracking, unsupervised learning is employed, in real time, to build appearance models of the target that represent the input space structure, and this structure is exploited to partition banks of computationally efficient, linear regression based target displacement estimators. This thesis presents the first application of regression based methods to the problem of simultaneously modeling and tracking a target object. The computationally efficient Linear Predictor (LP) tracker is investigated, along with methods for combining and weighting flocks of LP’s. The tracking algorithms developed operate with accuracy comparable to other state of the art online approaches and with a significant gain in computational efficiency. This is achieved as a result of two specific contributions. First, novel online approaches for the unsupervised learning of modes of target appearance that identify aspects of the target are introduced. Second, a general tracking framework is developed within which the identified aspects of the target are adaptively associated to subsets of a bank of LP trackers. This results in the partitioning of LP’s and the online creation of aspect specific LP flocks that facilitate tracking through significant appearance changes. Applying the approach to the percept action domain, unsupervised learning is employed to discover the structure of the action space, and this structure is used in the formation of meaningful perceptual categories, and to facilitate the use of localised input-output (percept-action) mappings. This approach provides a realisation of an embodied and embedded agent that organises its perceptual space and hence its cognitive process based on interactions with its environment. Central to the proposed approach is the technique of clustering an input-output exemplar set, based on output similarity, and using the resultant input exemplar groupings to characterise a perceptual category. All input exemplars that are coupled to a certain class of outputs form a category - the category of a given affordance, action or function. In this sense the formed perceptual categories have meaning and are grounded in the embodiment of the agent. The approach is shown to identify the relative importance of perceptual features and is able to solve percept-action tasks, defined only by demonstration, in previously unseen situations. Within this percept-action learning framework, two alternative approaches are developed. The first approach employs hierarchical output space clustering of point-to-point mappings, to achieve search efficiency and input and output space generalisation as well as a mechanism for identifying the important variance and invariance in the input space. The exemplar hierarchy provides, in a single structure, a mechanism for classifying previously unseen inputs and generating appropriate outputs. The second approach to a percept-action learning framework integrates the regression mappings used in the feature tracking domain, with the action space clustering and imitation learning techniques developed in the percept-action domain. These components are utilised within a novel percept-action data mining methodology, that is able to discover the visual entities that are important to a specific problem, and to map from these entities onto the action space. Applied to the robot control task, this approach allows for real-time generation of continuous action signals, without the use of any supervision or definition of representations or rules of behaviour

    Multiobjective metaheuristic approaches for mean-risk combinatorial optimisation with applications to capacity expansion

    Get PDF
    Tese de doutoramento. Engenharia Electrotécnica e de Computadores. Faculdade de Engenharia. Universidade do Porto. 200

    Robust Modular Feature-Based Terrain-Aided Visual Navigation and Mapping

    Get PDF
    The visual feature-based Terrain-Aided Navigation (TAN) system presented in this thesis addresses the problem of constraining inertial drift introduced into the location estimate of Unmanned Aerial Vehicles (UAVs) in GPS-denied environment. The presented TAN system utilises salient visual features representing semantic or human-interpretable objects (roads, forest and water boundaries) from onboard aerial imagery and associates them to a database of reference features created a-priori, through application of the same feature detection algorithms to satellite imagery. Correlation of the detected features with the reference features via a series of the robust data association steps allows a localisation solution to be achieved with a finite absolute bound precision defined by the certainty of the reference dataset. The feature-based Visual Navigation System (VNS) presented in this thesis was originally developed for a navigation application using simulated multi-year satellite image datasets. The extension of the system application into the mapping domain, in turn, has been based on the real (not simulated) flight data and imagery. In the mapping study the full potential of the system, being a versatile tool for enhancing the accuracy of the information derived from the aerial imagery has been demonstrated. Not only have the visual features, such as road networks, shorelines and water bodies, been used to obtain a position ’fix’, they have also been used in reverse for accurate mapping of vehicles detected on the roads into an inertial space with improved precision. Combined correction of the geo-coding errors and improved aircraft localisation formed a robust solution to the defense mapping application. A system of the proposed design will provide a complete independent navigation solution to an autonomous UAV and additionally give it object tracking capability

    Implicit Surfaces For Modelling Human Heads

    No full text
    • …
    corecore