290 research outputs found

    Computational Methods and Graphical Processing Units for Real-time Control of Tomographic Adaptive Optics on Extremely Large Telescopes.

    Ground-based optical telescopes suffer from limited imaging resolution as a result of the effects of atmospheric turbulence on the incoming light. Adaptive optics technology has so far been very successful in correcting these effects, providing nearly diffraction-limited images. Extremely Large Telescopes will require more complex adaptive optics configurations that introduce the need for new mathematical models and optimal solvers. In addition, the amount of data to be processed in real time is greatly increased, making conventional computational methods and hardware inefficient, which motivates the study of advanced computational algorithms and of implementations on parallel processors. Graphical Processing Units (GPUs) are massively parallel processors that have so far demonstrated substantial speed-ups over CPUs and other devices, and they have strong potential to meet the real-time constraints of adaptive optics systems. This thesis focuses on the study and evaluation of existing computational algorithms with respect to computational performance, and on their implementation on GPUs. Two basic methods, one direct and one iterative, are implemented and tested; the results provide an evaluation of the basic concepts upon which other algorithms are built, and demonstrate the benefits of using GPUs for adaptive optics.
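
    As an illustration of the direct-versus-iterative contrast described above (not taken from the thesis itself), the following sketch solves a regularized least-squares wavefront-reconstruction system both ways in plain NumPy; the interaction matrix and slope measurements are random placeholders, and no GPU code is shown.

```python
import numpy as np

def direct_reconstruct(A, s, reg=1e-3):
    """Direct solve of the regularized normal equations (A^T A + reg*I) x = A^T s."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + reg * np.eye(n), A.T @ s)

def cg_reconstruct(A, s, reg=1e-3, tol=1e-8, max_iter=200):
    """Conjugate-gradient solve of the same system; it only needs matrix-vector
    products, the kind of operation that parallelizes well."""
    n = A.shape[1]
    b = A.T @ s
    matvec = lambda v: A.T @ (A @ v) + reg * v
    x = np.zeros(n)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5000, 800))   # placeholder interaction matrix
    s = rng.standard_normal(5000)          # placeholder slope measurements
    x_direct = direct_reconstruct(A, s)
    x_cg = cg_reconstruct(A, s)
    print("max difference between solvers:", np.abs(x_direct - x_cg).max())
```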

    Autonomous Localization of a UAV in a 3D CAD Model

    This thesis presents a novel method of indoor localization and autonomous navigation of Unmanned Aerial Vehicles (UAVs) within a building, given a prebuilt Computer Aided Design (CAD) model of the building. The proposed system is novel in that it combines machine learning and traditional computer vision techniques to provide a robust method of localizing and navigating a drone autonomously in indoor, GPS-denied environments, exploiting pre-existing knowledge of the environment. The goal of this work is to devise a fast, accurate and resource-efficient method that enables a UAV to deduce its current pose within the CAD model. A 3-dimensional CAD model of the building to be navigated is provided as input to the system, along with the required goal position. Initially, the UAV has no knowledge of its location within the building. The system, comprising a stereo camera system and an Inertial Measurement Unit (IMU) as its sensors, then generates a globally consistent map of its surroundings using a Simultaneous Localization and Mapping (SLAM) algorithm. In addition to the map, it also stores spatially correlated 3D features. These 3D features are then used to generate correspondences between the SLAM map and the 3D CAD model, and the correspondences in turn yield a transformation between the two, effectively localizing the UAV in the 3D CAD model. In the scenarios tested, our method successfully localized the UAV in the test building in an average of 15 seconds, contingent upon the abundance of target features in the observed data. Due to the absence of a motion-capture system, the results were verified by placing tags at known locations on the ground in the building and measuring the error between the projection of the current UAV location onto the ground and the corresponding tag.
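
    A minimal sketch (not from the thesis) of the final alignment step described above: given 3D features that have already been matched between the SLAM map and the CAD model, a least-squares rigid transform between the two frames can be estimated with the Kabsch/Umeyama procedure. The point sets below are synthetic, and the correspondence search itself is assumed to have been done.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src points onto dst points
    (Kabsch procedure, no scale). src and dst are (N, 3) arrays of matched points."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    slam_pts = rng.uniform(-5, 5, size=(50, 3))      # matched features in the SLAM map frame
    angle = np.deg2rad(30.0)
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                       [np.sin(angle),  np.cos(angle), 0],
                       [0, 0, 1]])
    t_true = np.array([2.0, -1.0, 0.5])
    cad_pts = slam_pts @ R_true.T + t_true            # same features in the CAD frame
    R, t = rigid_transform(slam_pts, cad_pts)
    print("rotation error:", np.abs(R - R_true).max(),
          "translation error:", np.abs(t - t_true).max())
```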

    Registering a Non-Rigid Multi-Sensor Ensemble of Images

    Image registration is the task of aligning two or more images into the same reference frame in order to compare or distinguish the images. The majority of registration methods deal with registering only two images at a time. Recently, a clustering method that concurrently registers more than two multi-sensor images was proposed, dubbed ensemble clustering. In this thesis, we apply the ensemble clustering method to the deformable registration scenario for the first time. Non-rigid deformation is implemented by a free-form deformation (FFD) model based on B-splines. A regularization term is added to the cost function of the method to limit the topology and degree of the allowable deformations. However, the increased degrees of freedom in the transformations caused the Newton-type optimization process to become ill-conditioned, making the registration process unstable. We solved this problem by using the matrix approximation afforded by the singular value decomposition (SVD). Experiments showed that the method is successfully applied to non-rigid multi-sensor ensembles and overall yields better registration results than methods that register only two images at a time. In addition, we parallelized the ensemble clustering method to accelerate its performance. The parallelization was implemented on GPUs using the CUDA (Compute Unified Device Architecture) programming model, which greatly reduced the running time of the method.
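
    A minimal sketch (illustrative only, with a synthetic stand-in for the real Hessian) of the stabilization idea mentioned above: when the approximate Hessian of a Newton-type update is ill-conditioned, replacing its inverse with a truncated-SVD pseudo-inverse keeps the step from being dominated by noise directions.

```python
import numpy as np

def truncated_svd_solve(H, g, rel_tol=1e-6):
    """Solve H @ delta = g with a truncated-SVD pseudo-inverse: singular values
    below rel_tol * s_max are dropped, which stabilizes an ill-conditioned
    Newton-type step."""
    U, s, Vt = np.linalg.svd(H)
    keep = s > rel_tol * s[0]
    s_inv = np.where(keep, 1.0 / s, 0.0)
    return Vt.T @ (s_inv * (U.T @ g))

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    # Build a deliberately ill-conditioned symmetric matrix (tiny trailing singular values).
    Q, _ = np.linalg.qr(rng.standard_normal((200, 200)))
    s = np.logspace(0, -14, 200)
    H = (Q * s) @ Q.T
    g = rng.standard_normal(200)
    delta_naive = np.linalg.solve(H, g)        # dominated by the noise directions
    delta_trunc = truncated_svd_solve(H, g)    # stabilized step
    print("naive step norm:    ", np.linalg.norm(delta_naive))
    print("truncated step norm:", np.linalg.norm(delta_trunc))
```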

    Bandwidth-aware distributed ad-hoc grids in deployed wireless sensor networks

    Nowadays, cost-effective sensor networks can be deployed as a result of a plethora of recent engineering advances in wireless technology, storage miniaturisation, consolidated microprocessor design, and sensing technologies. Whilst sensor systems are becoming relatively cheap to deploy, two issues arise in their typical realisations: (i) the types of low-cost sensors often employed are capable of limited resolution and tend to produce noisy data; (ii) network bandwidths are relatively low and the energetic costs of using the radio to communicate are relatively high. To reduce the transmission of unnecessary data, there is a strong argument for performing local computation. However, this can require greater computational capacity than is available on a single low-power processor. Traditionally, such a problem has been addressed by load balancing: fragmenting processes into tasks and distributing them amongst the least loaded nodes. However, the act of distributing tasks, and any subsequent communication between them, imposes a geographically defined load on the network. Because of the shared broadcast nature of the radio channels and MAC layers in common use, any communication within an area will be slowed by additional traffic, delaying the computation and reporting that rely on the availability of the network. In this dissertation, we explore the tradeoff between the distribution of computation, needed to enhance the computational abilities of networks of resource-constrained nodes, and the network traffic that results from that distribution. We devise an application-independent distribution paradigm and a set of load distribution algorithms to allow computationally intensive applications to be collaboratively computed on resource-constrained devices. We then empirically investigate the effects of network traffic information on distribution performance, devising bandwidth-aware task-offload mechanisms that combine nodes' computational capabilities with local network conditions, and studying the impact of making informed offload decisions on system performance. The highly deployment-specific nature of radio communication means that simulations capable of producing validated, high-quality results are extremely hard to construct. Consequently, to produce meaningful results, our experiments have used empirical analysis based on a network of motes located at UCL, running a variety of I/O-bound, CPU-bound and mixed tasks. Using this setup, we have established that even relatively simple load-sharing algorithms can improve performance over a range of artificially generated scenarios, with more or less timely contextual information. In addition, we have taken a realistic application, based on location estimation, and implemented it across the same network, with results that support the conclusions drawn from the artificially generated traffic.
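
    As a purely illustrative sketch of the kind of informed offload decision described above (the cost model, node names and numbers are assumptions, not the dissertation's algorithm), candidate nodes can be scored by an estimated completion time that combines remaining compute capacity with the usable bandwidth of the link to each node.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    cpu_capacity: float   # work units per second the node can still absorb
    bandwidth: float      # usable link bandwidth to this node, in kB/s

def offload_target(task_cost, task_size, nodes):
    """Pick the node with the lowest estimated completion time for a task, where
    completion time = transfer time (task_size / bandwidth) plus compute time
    (task_cost / remaining capacity). Returns the chosen node and its estimate."""
    best, best_time = None, float("inf")
    for n in nodes:
        if n.cpu_capacity <= 0 or n.bandwidth <= 0:
            continue
        est = task_size / n.bandwidth + task_cost / n.cpu_capacity
        if est < best_time:
            best, best_time = n, est
    return best, best_time

if __name__ == "__main__":
    neighbours = [
        Node("mote-a", cpu_capacity=40.0, bandwidth=12.0),
        Node("mote-b", cpu_capacity=90.0, bandwidth=3.0),   # fast CPU, congested link
        Node("mote-c", cpu_capacity=25.0, bandwidth=20.0),
    ]
    target, t = offload_target(task_cost=200.0, task_size=60.0, nodes=neighbours)
    print(f"offload to {target.name}, estimated completion {t:.1f} s")
```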

    Real-time people tracking in a camera network

    Visual tracking is a fundamental key to the recognition and analysis of human behaviour. In this thesis we present an approach to track several subjects using multiple cameras in real time. The tracking framework employs a numerical Bayesian estimator, also known as a particle filter, which has been developed for parallel implementation on a Graphics Processing Unit (GPU). In order to integrate multiple cameras into a single tracking unit we represent the human body by a parametric ellipsoid in a 3D world. The elliptical boundary can be projected rapidly, several hundred times per subject per frame, onto any image for comparison with the image data within a likelihood model. By adding variables that encode visibility and persistence into the state vector, we tackle the problems of distraction and short-period occlusion. However, subjects may also disappear for longer periods due to blind spots between the cameras' fields of view. To recognise a desired subject after such a long period, we add coloured texture to the ellipsoid surface, which is learnt and retained during the tracking process. This texture signature improves the recall rate from 60% to 70-80% when compared to state-only data association. Compared to a standard Central Processing Unit (CPU) implementation, a significant speed-up ratio is achieved.
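
    For readers unfamiliar with the estimator named above, here is a minimal, single-target, 1D sketch of the generic bootstrap particle-filter loop (predict, weight by a likelihood, resample); the ellipsoid projection, multi-camera likelihood and GPU parallelization from the thesis are not reproduced here.

```python
import numpy as np

def particle_filter_step(particles, weights, measurement, motion_std=0.5, meas_std=1.0, rng=None):
    """One predict-weight-resample cycle of a bootstrap particle filter on a 1D state."""
    rng = rng or np.random.default_rng()
    # Predict: diffuse each particle with a simple random-walk motion model.
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # Weight: Gaussian likelihood of the measurement given each particle.
    weights = weights * np.exp(-0.5 * ((measurement - particles) / meas_std) ** 2)
    weights /= weights.sum()
    # Resample: multinomial resampling to avoid weight degeneracy.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    n = 1000
    particles = rng.uniform(-10, 10, n)
    weights = np.full(n, 1.0 / n)
    true_pos = 0.0
    for step in range(20):
        true_pos += 0.3                                   # subject drifts to the right
        z = true_pos + rng.normal(0.0, 1.0)               # noisy observation
        particles, weights = particle_filter_step(particles, weights, z, rng=rng)
    print("estimate:", particles.mean(), "truth:", true_pos)
```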

    Distributed Monocular SLAM for Indoor Map Building

    Utilization and generation of indoor maps are critical elements of accurate indoor tracking. Simultaneous Localization and Mapping (SLAM) is one of the main techniques for such map generation. In SLAM, an agent generates a map of an unknown environment while estimating its own location within it. Ubiquitous cameras lead to monocular visual SLAM, where a camera is the only sensing device for the SLAM process. In modern applications, multiple mobile agents may be involved in the generation of such maps, requiring a distributed computational framework. Each agent can generate its own local map, which can then be combined with the others into a map covering a larger area. By doing so, the agents can cover a given environment faster than a single agent, and they can interact with each other in the same environment, making the framework more practical, especially for collaborative applications such as augmented reality. One of the main challenges of distributed SLAM is identifying overlapping maps, especially when the relative starting positions of the agents are unknown. In this paper, we propose a system of multiple monocular agents with unknown relative starting positions that generates a semi-dense global map of the environment.
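
    A minimal sketch (illustrative, not the paper's method) of the map-fusion step implied above: once overlapping landmarks between two agents' local maps have been matched, a similarity transform (rotation, translation and scale, since each monocular map has its own arbitrary scale) aligns one map with the other. The landmark matching itself is assumed.

```python
import numpy as np

def similarity_transform(src, dst):
    """Umeyama alignment: find scale s, rotation R and translation t such that
    s * R @ src_i + t ~= dst_i. The scale term matters for monocular SLAM maps,
    which each carry their own arbitrary scale."""
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:   # avoid a reflection solution
        S[2, 2] = -1
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    map_a = rng.uniform(-3, 3, size=(40, 3))                 # matched landmarks in agent A's frame
    theta = np.deg2rad(45.0)
    R_true = np.array([[np.cos(theta), 0, np.sin(theta)],
                       [0, 1, 0],
                       [-np.sin(theta), 0, np.cos(theta)]])
    s_true, t_true = 2.5, np.array([1.0, 0.2, -0.7])
    map_b = s_true * map_a @ R_true.T + t_true               # same landmarks in agent B's frame
    s, R, t = similarity_transform(map_a, map_b)
    print("recovered scale:", round(s, 3), "rotation ok:", np.allclose(R, R_true, atol=1e-6))
```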

    Scalable learning for geostatistics and speaker recognition

    With improved data acquisition methods, the amount of data being collected has increased severalfold. One of the objectives in data collection is to learn useful underlying patterns. In order to work with data at this scale, methods not only need to be effective on the underlying data but must also scale to larger data collections. This thesis focuses on developing scalable and effective methods targeted at different domains, geostatistics and speaker recognition in particular. Initially we focus on kernel-based learning methods and develop a GPU-based parallel framework for this class of problems. An improved numerical algorithm that exploits GPU parallelization to further enhance the computational performance of kernel regression is proposed. These methods are then demonstrated on problems arising in geostatistics and speaker recognition. In geostatistics, data is often collected at scattered locations, and factors such as instrument malfunction lead to missing observations. Applications often require the ability to interpolate this scattered spatiotemporal data onto a regular grid continuously over time. This problem can be formulated as a regression problem, and one of the most popular geostatistical interpolation techniques, kriging, is analogous to a standard kernel method: Gaussian process regression. Kriging is computationally expensive and needs major modifications and accelerations in order to be used practically. The GPU framework developed for kernel methods is extended to kriging, and the GPU's texture memory is further exploited for enhanced computational performance. Speaker recognition deals with the task of verifying a person's identity based on samples of his/her speech, or "utterances". This thesis focuses on the text-independent setting, and three new recognition frameworks are developed for this problem. We propose a kernelized Renyi-distance-based similarity scoring for speaker recognition. While its performance is promising, it does not generalize well with limited training data and therefore does not compare well to state-of-the-art recognition systems. These systems compensate for variability in the speech data due to the message, channel, noise and reverberation. State-of-the-art systems model each speaker as a mixture of Gaussians (GMM) and compensate for this variability (termed "nuisance"). We propose a novel discriminative framework using a latent variable technique, partial least squares (PLS), for improved recognition. The kernelized version of this algorithm is used to achieve a state-of-the-art speaker ID system that shows results competitive with the best systems reported in NIST's 2010 Speaker Recognition Evaluation.
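
    To make the kriging/Gaussian-process-regression analogy above concrete, here is a minimal NumPy sketch of kriging-style interpolation of scattered observations onto a regular grid with a Gaussian covariance kernel; the data are synthetic, and none of the thesis's GPU acceleration or texture-memory optimizations appear here.

```python
import numpy as np

def gaussian_kernel(X1, X2, length_scale=1.0):
    """Gaussian (squared-exponential) covariance between two sets of locations."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def krige(X_obs, y_obs, X_new, length_scale=1.0, nugget=1e-6):
    """Kriging / GP-regression predictive mean at X_new given scattered observations
    (X_obs, y_obs); the nugget models observation noise and keeps the kernel
    matrix well conditioned."""
    K = gaussian_kernel(X_obs, X_obs, length_scale) + nugget * np.eye(len(X_obs))
    k_star = gaussian_kernel(X_new, X_obs, length_scale)
    return k_star @ np.linalg.solve(K, y_obs)

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    X_obs = rng.uniform(0, 10, size=(200, 2))                 # scattered observation locations
    y_obs = np.sin(X_obs[:, 0]) + 0.1 * rng.standard_normal(200)
    gx, gy = np.meshgrid(np.linspace(0, 10, 50), np.linspace(0, 10, 50))
    X_grid = np.column_stack([gx.ravel(), gy.ravel()])        # regular output grid
    y_grid = krige(X_obs, y_obs, X_grid, length_scale=1.5)
    print("interpolated field shape:", y_grid.reshape(50, 50).shape)
```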

    Status and Future Perspectives for Lattice Gauge Theory Calculations to the Exascale and Beyond

    In this and a set of companion whitepapers, the USQCD Collaboration lays out a program of science and computing for lattice gauge theory. These whitepapers describe how calculations using lattice QCD (and other gauge theories) can aid the interpretation of ongoing and upcoming experiments in particle and nuclear physics, as well as inspire new ones. (Comment: 44 pages; one of the USQCD whitepapers.)