Tightly Coupled GNSS and Vision Navigation for Unmanned Air Vehicle Applications

Author: O'Shea Peter
Roberts Peter
Walker Rodney
Publication venue: Institution of Engineers, Australia and Royal Aeronautical Society, Australian Division
Publication date: 01/01/2005

This paper explores the unique benefits that can be obtained from a tight integration of a GNSS sensor and a forward-looking vision sensor. The motivation of this research is the belief that both GNSS and vision will be integral features of future UAV avionics architectures: GNSS for basic aircraft navigation and vision for obstacle-aircraft collision avoidance. The paper will show that utilising basic single-antenna GNSS measurements and observables, along with aircraft information derived from optical flow techniques, creates unique synergies. Results on the accuracy of attitude estimates will be presented, based on a comprehensive Matlab® Simulink® model which re-creates an optical flow stream based on the flight of an aircraft. This paper establishes the viability of this novel integrated GNSS/Vision approach for use as the complete UAV sensor package, or as a backup sensor for an inertial navigation system.

Queensland University of Technology ePrints Archive

Asynchronous displays for multi-UV search tasks

Author: Carpin S.
Cooke N.
Gugerty L.
Lewis M.
Miller C.
Olsen D.
Scerri P.
Publication date: 01/06/2012

Synchronous video has long been the preferred mode for controlling remote robots, with other modes such as asynchronous control used only when unavoidable, as in the case of interplanetary robotics. We identify two basic problems for controlling multiple robots using synchronous displays: operator overload and information fusion. Synchronous displays from multiple robots can easily overwhelm an operator who must search video for targets. If targets are plentiful, the operator will likely miss targets that enter and leave unattended views while dealing with others that were noticed. The related fusion problem arises because robots' multiple fields of view may overlap, forcing the operator to reconcile different views from different perspectives and form an awareness of the environment by "piecing them together". We have conducted a series of experiments investigating the suitability of asynchronous displays for multi-UV search. Our first experiments involved static panoramas in which operators selected locations at which robots halted and panned their camera to capture a record of what could be seen from that location. A subsequent experiment investigated the hypothesis that the relative performance of the panoramic display would improve as the number of robots was increased, causing greater overload and fusion problems. In a subsequent Image Queue system we used automated path planning and also automated the selection of imagery for presentation by choosing a greedy selection of non-overlapping views. A fourth set of experiments used the SUAVE display, an asynchronous variant of the picture-in-picture technique for video from multiple UAVs. The panoramic displays, which addressed only the overload problem, led to performance similar to synchronous video, while the Image Queue and SUAVE displays, which addressed fusion as well, led to improved performance on a number of measures.
In this paper we will review our experiences in designing and testing asynchronous displays and discuss challenges to their use, including tracking dynamic targets. © 2012 by the American Institute of Aeronautics and Astronautics, Inc.
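
The "greedy selection of non-overlapping views" mentioned in the abstract can be illustrated with a small sketch. This is not the Image Queue implementation: the representation of a view as a set of covered map cells, the function name `greedy_select`, and the stopping rule are all assumptions made here for illustration.

```python
def greedy_select(views, max_views):
    """Greedily pick views that add the most not-yet-covered cells.

    views: dict mapping a (hypothetical) view name to the set of map
           cells visible in that view.
    Returns the chosen view names in selection order and the covered cells.
    """
    covered = set()
    chosen = []
    remaining = dict(views)
    while remaining and len(chosen) < max_views:
        # Pick the view whose footprint overlaps least with what is shown.
        name = max(remaining, key=lambda n: len(remaining[n] - covered))
        if not remaining[name] - covered:
            break  # every remaining view is fully redundant
        chosen.append(name)
        covered |= remaining.pop(name)
    return chosen, covered


if __name__ == "__main__":
    views = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {6, 7}, "d": {1, 2}}
    print(greedy_select(views, max_views=3))
```

Each step favours the view with the largest new coverage, so heavily overlapping images fall to the back of the queue or are dropped, which is one plausible way to reduce the fusion burden the abstract describes.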

Crossref

D-Scholarship@Pitt

DeepOtsu: Document Enhancement and Binarization using Iterative Deep Learning

Author: He Sheng
Schomaker Lambert
Publication venue: Elsevier BV
Publication date: 17/01/2019

This paper presents a novel iterative deep learning framework and applies it to document enhancement and binarization. Unlike traditional methods, which predict the binary label of each pixel in the input image, we train the neural network to learn the degradations in document images and produce uniform images from the degraded inputs, which allows the network to refine the output iteratively. Two different iterative methods are studied in this paper: recurrent refinement (RR), which uses the same trained neural network in each iteration for document enhancement, and stacked refinement (SR), which uses a stack of different neural networks for iterative output refinement. Given the learned uniform and enhanced image, the binarization map can be easily obtained with a global or local threshold. The experimental results on several public benchmark data sets show that our proposed methods provide a new clean version of the degraded image that is suitable for visualization, and promising binarization results using the global Otsu's threshold on the enhanced images learned iteratively by the neural network. Comment: Accepted by Pattern Recognitio
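
The final step named in the abstract, binarization with a global Otsu threshold, can be sketched in plain Python. This covers only the classical thresholding step, not the neural enhancement; the function names and the flat list-of-grey-levels input are assumptions made here for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance.

    pixels: iterable of integer grey levels in [0, levels).
    Pixels <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class
    sum0 = 0  # sum of grey levels in the dark class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(pixels, t):
    """Map each pixel to 1 (bright, above t) or 0 (dark)."""
    return [1 if p > t else 0 for p in pixels]


if __name__ == "__main__":
    image = [10] * 50 + [200] * 50  # toy "ink" and "paper" grey levels
    t = otsu_threshold(image)
    print(t, sum(binarize(image, t)))
```

A single global threshold like this works well only when the histogram is close to bimodal, which is why the abstract applies it after the network has produced a uniform, enhanced image.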

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Information recovery from rank-order encoded images

Author: Furber Steve
sen Bhattacharya Basabdatta
Publication date: 04/03/2008

The time to detection of a visual stimulus by the primate eye is recorded at 100–150 ms. This near-instantaneous recognition occurs in spite of the considerable processing required by the several stages of the visual pathway to recognise and react to a visual scene. How this is achieved is still a matter of speculation. Rank-order codes have been proposed as a means of encoding by the primate eye in the rapid transmission of the initial burst of information from the sensory neurons to the brain. We study the efficiency of rank-order codes in encoding perceptually important information in an image. VanRullen and Thorpe built a model of the ganglion cell layers of the retina to simulate and study the viability of rank-order as a means of encoding by retinal neurons. We validate their model and quantify the information retrieved from rank-order encoded images in terms of the visually important information recovered. Towards this goal, we apply the ‘perceptual information preservation algorithm’ proposed by Petrovic and Xydeas, after slight modification. We observe a low information recovery due to losses suffered during the rank-order encoding and decoding processes. We propose to minimise these losses to recover maximum information in minimum time from rank-order encoded images. We first maximise information recovery by using the pseudo-inverse of the filter-bank matrix to minimise losses during rank-order decoding. We then apply the biological principle of lateral inhibition to minimise losses during rank-order encoding. In doing so, we propose the Filter-overlap Correction algorithm. To test the performance of rank-order codes in a biologically realistic model, we design and simulate a model of the foveal-pit ganglion cells of the retina, keeping close to biological parameters. We use this as a rank-order encoder and analyse its performance relative to VanRullen and Thorpe’s retinal model.
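
The idea of a rank-order code, in which only the firing order of units is transmitted and not their response magnitudes, can be shown with a toy sketch. This is not the retinal model discussed in the abstract: the function names, the sign channel, and the fixed geometric `decay` used to assign a magnitude to each rank are assumptions made here for illustration.

```python
def rank_order_encode(coeffs):
    """Return unit indices sorted by decreasing response magnitude, plus signs.

    The magnitudes themselves are discarded; only the order survives.
    """
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))
    signs = [1 if coeffs[i] >= 0 else -1 for i in order]
    return order, signs


def rank_order_decode(order, signs, n, decay=0.8):
    """Rebuild approximate responses: the unit at rank r gets decay**r.

    The geometric decay is a hypothetical stand-in for whatever
    rank-to-magnitude mapping a real decoder would use.
    """
    out = [0.0] * n
    for rank, (i, s) in enumerate(zip(order, signs)):
        out[i] = s * decay ** rank
    return out


if __name__ == "__main__":
    responses = [0.2, -0.9, 0.5]
    order, signs = rank_order_encode(responses)
    print(order, signs)
    print(rank_order_decode(order, signs, len(responses)))
```

Comparing the decoded values with the original responses makes the information loss visible: the ordering is preserved exactly, but the magnitudes are only approximated, which is the kind of loss the abstract sets out to quantify and reduce.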

University of Lincoln Institutional Repository

Musical notes classification with Neuromorphic Auditory System using FPGA and a Convolutional Spiking Network

Author: Cerezuela Escudero Elena
Domínguez Morales Manuel Jesús
Jiménez Fernández Ángel Francisco
Jiménez Moreno Gabriel
Linares Barranco Alejandro
Paz Vicente Rafael
Publication venue: IEEE Computer Society
Publication date: 01/01/2015

In this paper, we explore the capabilities of a sound classification system that combines a novel FPGA cochlear model implementation with a bio-inspired technique based on a trained convolutional spiking network. The neuromorphic auditory system used in this work produces a representation that is analogous to the spike outputs of the biological cochlea. The auditory system has been developed using a set of spike-based processing building blocks in the frequency domain. They form a set of band-pass filters in the spike domain that split the audio information into 128 frequency channels, 64 for each of two audio sources. Address Event Representation (AER) is used to connect the auditory system to the convolutional spiking network. A layer of the convolutional spiking network is developed and trained on a computer to detect two kinds of sound: artificial pure tones in the presence of white noise, and electronic musical notes. After the training process, the presented system is able to distinguish the different sounds in real time, even in the presence of white noise. Ministerio de Economía y Competitividad TEC2012-37868-C04-0

idUS. Depósito de Investigación Universidad de Sevilla