Tightly Coupled GNSS and Vision Navigation for Unmanned Air Vehicle Applications
This paper explores the unique benefits that can be obtained from a tight integration of a GNSS sensor and a forward-looking vision sensor. The motivation of this research is the belief that both GNSS and vision will be integral features of future UAV avionics architectures: GNSS for basic aircraft navigation and vision for obstacle-aircraft collision avoidance. The paper will show that utilising basic single-antenna GNSS measurements and observables, along with aircraft information derived from optical flow techniques, creates unique synergies. Results of the accuracy of attitude estimates will be presented, based on a comprehensive Matlab® Simulink® model which re-creates an optical flow stream based on the flight of an aircraft. This paper establishes the viability of this novel integrated GNSS/Vision approach for use as the complete UAV sensor package, or as a backup sensor for an inertial navigation system.
Asynchronous displays for multi-UV search tasks
Synchronous video has long been the preferred mode for controlling remote robots, with other modes such as asynchronous control used only when unavoidable, as in the case of interplanetary robotics. We identify two basic problems for controlling multiple robots using synchronous displays: operator overload and information fusion. Synchronous displays from multiple robots can easily overwhelm an operator who must search video for targets. If targets are plentiful, the operator will likely miss targets that enter and leave unattended views while dealing with others that were noticed. The related fusion problem arises because robots' multiple fields of view may overlap, forcing the operator to reconcile different views from different perspectives and form an awareness of the environment by "piecing them together". We have conducted a series of experiments investigating the suitability of asynchronous displays for multi-UV search. Our first experiments involved static panoramas in which operators selected locations at which robots halted and panned their camera to capture a record of what could be seen from that location. A subsequent experiment investigated the hypothesis that the relative performance of the panoramic display would improve as the number of robots was increased, causing greater overload and fusion problems. In a later Image Queue system we used automated path planning and also automated the selection of imagery for presentation by greedily selecting non-overlapping views. A fourth set of experiments used the SUAVE display, an asynchronous variant of the picture-in-picture technique for video from multiple UAVs. The panoramic displays, which addressed only the overload problem, led to performance similar to synchronous video, while the Image Queue and SUAVE displays, which addressed fusion as well, led to improved performance on a number of measures.
In this paper we will review our experiences in designing and testing asynchronous displays and discuss challenges to their use, including tracking dynamic targets. © 2012 by the American Institute of Aeronautics and Astronautics, Inc.
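The greedy selection of non-overlapping views mentioned in this abstract can be illustrated with a minimal sketch. It is not the Image Queue implementation: the representation of a view as a set of map-cell ids, the function name `select_views`, and the stopping rule are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def select_views(views: Dict[str, Set[int]], max_views: int) -> List[str]:
    """Greedily pick views that add the most not-yet-seen map cells.

    `views` maps a hypothetical view name to the set of map cells it shows.
    Selection stops when no remaining view adds new coverage or when
    `max_views` views have been chosen.
    """
    seen: Set[int] = set()
    chosen: List[str] = []
    remaining = dict(views)
    while remaining and len(chosen) < max_views:
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - seen))
        if not remaining[best] - seen:
            break  # every remaining view only repeats cells already shown
        chosen.append(best)
        seen |= remaining.pop(best)
    return chosen


views = {"a": {1, 2, 3}, "b": {3, 4}, "c": {1, 2}, "d": {5, 6, 7, 8}}
print(select_views(views, max_views=10))
```

In this toy input, view `c` is never queued because everything it shows is already covered by `a`, which is the kind of redundancy the fusion problem describes.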
DeepOtsu: Document Enhancement and Binarization using Iterative Deep Learning
This paper presents a novel iterative deep learning framework and applies it
for document enhancement and binarization. Unlike the traditional methods which
predict the binary label of each pixel on the input image, we train the neural
network to learn the degradations in document images and produce the uniform
images of the degraded input images, which allows the network to refine the
output iteratively. Two different iterative methods have been studied in this
paper: recurrent refinement (RR) which uses the same trained neural network in
each iteration for document enhancement and stacked refinement (SR) which uses
a stack of different neural networks for iterative output refinement. Given the
learned uniform and enhanced image, the binarization map can easily be obtained
by a global or local threshold. The experimental results on several public
benchmark data sets show that our proposed methods provide a new clean version
of the degraded image which is suitable for visualization and promising results
of binarization using the global Otsu's threshold based on the enhanced images
learned iteratively by the neural network.
Comment: Accepted by Pattern Recognitio
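For readers unfamiliar with the global Otsu threshold applied to the enhanced image, the following is a minimal pure-Python sketch of the standard method, which picks the grey level that maximises between-class variance. It does not reproduce the paper's neural network. The flat list of 8-bit pixel values, the function names, and the choice to map bright pixels to 1 are assumptions made for illustration.

```python
from typing import List


def otsu_threshold(pixels: List[int], levels: int = 256) -> int:
    """Return the grey level t that best separates pixels <= t from pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels in the dark class (<= t)
    sum0 = 0  # sum of intensities in the dark class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so no valid split at this t
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(pixels: List[int], t: int) -> List[int]:
    """Assumed polarity: pixels brighter than t become 1, the rest 0."""
    return [1 if p > t else 0 for p in pixels]


pixels = [50] * 4 + [200] * 4
t = otsu_threshold(pixels)
print(t, binarize(pixels, t))
```

Because a single threshold is computed for the whole image, the method works best when the background is uniform, which is why it is paired here with an enhancement step.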
Information recovery from rank-order encoded images
The time to detection of a visual stimulus by the primate eye is recorded at
100–150 ms. This near instantaneous recognition is in spite of the considerable
processing required by the several stages of the visual pathway to recognise and
react to a visual scene. How this is achieved is still a matter of speculation.
Rank-order codes have been proposed as a means of encoding by the primate
eye in the rapid transmission of the initial burst of information from the sensory
neurons to the brain. We study the efficiency of rank-order codes in encoding
perceptually-important information in an image. VanRullen and Thorpe built a
model of the ganglion cell layers of the retina to simulate and study the viability
of rank-order as a means of encoding by retinal neurons. We validate their model
and quantify the information retrieved from rank-order encoded images in terms
of the visually-important information recovered. Towards this goal, we apply
the “perceptual information preservation algorithm” proposed by Petrovic and
Xydeas, after slight modification. We observe a low information recovery due
to losses suffered during the rank-order encoding and decoding processes. We
propose to minimise these losses to recover maximum information in minimum
time from rank-order encoded images. We first maximise information recovery by
using the pseudo-inverse of the filter-bank matrix to minimise losses during rank-order
decoding. We then apply the biological principle of lateral inhibition to
minimise losses during rank-order encoding. In doing so, we propose the Filter-overlap
Correction algorithm. To test the performance of rank-order codes in
a biologically realistic model, we design and simulate a model of the foveal-pit
ganglion cells of the retina keeping close to biological parameters. We use this
as a rank-order encoder and analyse its performance relative to VanRullen and
Thorpe’s retinal model.
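The basic idea of a rank-order code, in which only the firing order of the filters is transmitted and the response magnitudes are discarded, can be sketched as follows. This is a toy illustration only: it does not reproduce the retinal model, the pseudo-inverse decoding, or the Filter-overlap Correction algorithm described in the abstract. The function names and the geometric `decay` used to assign a magnitude to each rank are hypothetical stand-ins.

```python
from typing import List, Tuple


def rank_order_encode(coeffs: List[float]) -> List[Tuple[int, int]]:
    """Return (filter index, sign) pairs, strongest response first.

    Only the order and the sign survive; the magnitudes are dropped.
    """
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))
    return [(i, 1 if coeffs[i] >= 0 else -1) for i in order]


def rank_order_decode(spikes: List[Tuple[int, int]], n: int,
                      decay: float = 0.5) -> List[float]:
    """Estimate coefficients from firing order alone.

    The k-th spike is given magnitude decay**k (an assumed weighting),
    so earlier spikes contribute more to the reconstruction.
    """
    estimate = [0.0] * n
    for rank, (index, sign) in enumerate(spikes):
        estimate[index] = sign * decay ** rank
    return estimate


coeffs = [0.2, -0.9, 0.5]
spikes = rank_order_encode(coeffs)
print(spikes)
print(rank_order_decode(spikes, n=len(coeffs)))
```

The decoded values preserve the ordering and signs of the original coefficients but not their magnitudes, which gives a sense of where information can be lost in such a scheme.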
Musical notes classification with Neuromorphic Auditory System using FPGA and a Convolutional Spiking Network
In this paper, we explore the capabilities of a sound
classification system that combines both a novel FPGA cochlear
model implementation and a bio-inspired technique based on a
trained convolutional spiking network. The neuromorphic
auditory system that is used in this work produces a form of
representation that is analogous to the spike outputs of the
biological cochlea. The auditory system has been developed using
a set of spike-based processing building blocks in the frequency
domain. They form a set of band-pass filters in the spike domain
that splits the audio information into 128 frequency channels, 64
for each of two audio sources. Address Event Representation
(AER) is used to connect the auditory system to the
convolutional spiking network. A layer of convolutional spiking
network is developed and trained on a computer with the ability
to detect two kinds of sound: artificial pure tones in the presence
of white noise and electronic musical notes. After the training
process, the presented system is able to distinguish the different
sounds in real time, even in the presence of white noise.
Ministerio de Economía y Competitividad TEC2012-37868-C04-0