Separating true range measurements from multi-path and scattering interference in commercial range cameras
Time-of-flight range cameras acquire a three-dimensional image of a scene simultaneously for all pixels from a single viewing location. Attempts to use range cameras for metrology applications have been hampered by the multi-path problem, which causes range distortions when stray light interferes with the range measurement in a given pixel. Correcting multi-path distortions by post-processing the three-dimensional measurement data has been investigated, but enjoys only limited success because the interference is highly scene dependent. An alternative approach, based on separating the strongest and weaker sources of light returned to each pixel prior to range decoding, is more successful, but has only been demonstrated on custom-built range cameras and has not been suitable for general metrology applications. In this paper we demonstrate an algorithm applied to two unmodified off-the-shelf range cameras, the Mesa Imaging SR-4000 and the Canesta Inc. XZ-422 Demonstrator. Additional raw images are acquired and processed using an optimization approach, rather than relying on the processing provided by the manufacturer, to determine the individual component returns in each pixel. Substantial improvements in accuracy are observed, especially in the darker regions of the scene.
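A minimal sketch of the component-separation idea, assuming a two-return model: the raw correlation measurement in one pixel is treated as the sum of two complex returns observed at several modulation frequencies, and their amplitudes and distances are fitted by nonlinear least squares. The frequencies, initialization, and all names below are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np
from scipy.optimize import least_squares

C = 3e8  # speed of light, m/s

def two_return_model(params, freqs):
    """Complex pixel response of two superimposed returns."""
    a1, d1, a2, d2 = params
    phi1 = 4 * np.pi * freqs * d1 / C  # round-trip phase of the direct return
    phi2 = 4 * np.pi * freqs * d2 / C  # round-trip phase of the stray return
    return a1 * np.exp(1j * phi1) + a2 * np.exp(1j * phi2)

def residuals(params, freqs, measured):
    r = two_return_model(params, freqs) - measured
    return np.concatenate([r.real, r.imag])  # least_squares wants real values

def separate_returns(freqs, measured):
    """Fit the two strongest returns; report the stronger as the true range."""
    x0 = [np.abs(measured).mean(), 1.0, 0.1, 3.0]  # crude initial guess
    fit = least_squares(residuals, x0, args=(freqs, measured),
                        bounds=([0.0, 0.0, 0.0, 0.0],
                                [np.inf, 10.0, np.inf, 10.0]))
    a1, d1, a2, d2 = fit.x
    return (d1, a1) if a1 >= a2 else (d2, a2)

# Example: a 2 m target contaminated by a weaker scattering path at 3.5 m.
freqs = np.array([15e6, 20e6, 25e6, 30e6])
measured = two_return_model([1.0, 2.0, 0.3, 3.5], freqs)
print(separate_returns(freqs, measured))  # ideally close to (2.0, 1.0);
# a real pipeline would use multiple starts to escape local phase minima.
```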
External localization system for mobile robotics
We present fast and precise vision-based software intended for multiple-robot localization. The core component of the proposed localization system is an efficient method for black-and-white circular pattern detection. The method is robust to variable lighting conditions, achieves sub-pixel precision, and its computational complexity is independent of the processed image size. With off-the-shelf computational equipment and a low-cost camera, its core algorithm is able to process hundreds of images per second while tracking hundreds of objects with millimeter precision. We propose a mathematical model of the method that makes it possible to calculate its precision, area of coverage, and processing speed from the camera's intrinsic parameters and the hardware's processing capacity. The correctness of the presented model and the performance of the algorithm in real-world conditions are verified in several experiments. Apart from the method description, we also publish its source code, so it can be used as an enabling technology for various mobile robotics problems.
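One plausible reading of the concentric black/white ring test at the heart of such detectors, sketched below: segment dark connected components, then accept those that enclose a white disc whose centroid coincides with the ring's and whose area ratio matches the known target. The constants and helper names are assumptions for illustration, not the published implementation.

```python
import numpy as np
from scipy import ndimage

RATIO = 0.4          # assumed inner-disc to outer-ring area ratio of the target
RATIO_TOL = 0.15     # tolerance on that ratio
CENTROID_TOL = 1.5   # max centroid distance (pixels) between disc and ring

def find_patterns(gray, thresh=128):
    """Return (x, y) centroids of candidate circular patterns."""
    black = gray < thresh
    ring_labels, n_rings = ndimage.label(black)
    hits = []
    for ring_id in range(1, n_rings + 1):
        ring = ring_labels == ring_id
        # A valid ring encloses a white disc: recover the hole inside it.
        disc = ndimage.binary_fill_holes(ring) & ~ring
        if not disc.any():
            continue
        ratio = disc.sum() / ring.sum()
        cy_r, cx_r = ndimage.center_of_mass(ring)
        cy_d, cx_d = ndimage.center_of_mass(disc)
        concentric = np.hypot(cx_r - cx_d, cy_r - cy_d) < CENTROID_TOL
        if abs(ratio - RATIO) < RATIO_TOL and concentric:
            hits.append((cx_d, cy_d))  # sub-pixel centroid of the white disc
    return hits
```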
A practical multirobot localization system
We present fast and precise vision-based software intended for multiple-robot localization. The core component of the software is a novel and efficient algorithm for black-and-white pattern detection. The method is robust to variable lighting conditions, achieves sub-pixel precision, and its computational complexity is independent of the processed image size. With off-the-shelf computational equipment and low-cost cameras, the core algorithm is able to process hundreds of images per second while tracking hundreds of objects with millimeter precision. In addition, we present the method's mathematical model, which makes it possible to estimate the expected localization precision, area of coverage, and processing speed from the camera's intrinsic parameters and the hardware's processing capacity. The correctness of the presented model and the performance of the algorithm in real-world conditions are verified in several experiments. Apart from the method description, we also make the source code public at http://purl.org/robotics/whycon, so it can be used as an enabling technology for various mobile robotics problems.
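A hedged sketch of the kind of pinhole relation such a model rests on: a circle of known diameter projects to an ellipse whose major axis is, to first order, unaffected by tilt, so depth follows from similar triangles, and a sub-pixel measurement error can be propagated into an expected range error. This simplifies the paper's full conic-based model; the intrinsics and error figures below are invented for the example.

```python
import numpy as np

def localize(ellipse_center, semi_major_px, D, fx, fy, cx, cy):
    """Estimate the target's 3D position in camera coordinates (meters)."""
    u, v = ellipse_center
    z = fx * D / (2.0 * semi_major_px)  # depth from apparent size
    x = (u - cx) * z / fx               # back-project the pixel to a ray
    y = (v - cy) * z / fy
    return np.array([x, y, z])

def expected_depth_error(z, D, fx, axis_err_px=0.1):
    """Propagate a sub-pixel axis-length error into depth: dz/z = ds/s."""
    semi_major_px = fx * D / (2.0 * z)
    return z * axis_err_px / semi_major_px

# Example with assumed intrinsics (fx = fy = 700 px, 640x480 image):
# a 0.10 m pattern, 25 px across, seen at image point (400, 260).
print(localize((400, 260), 12.5, 0.10, 700, 700, 320, 240))  # ~[0.32 0.08 2.8]
print(expected_depth_error(2.8, 0.10, 700))                  # ~0.022 m
```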
On the AER Stereo-Vision Processing: A Spike Approach to Epipolar Matching
Image processing in digital computer systems usually treats visual information as a sequence of frames. These frames come from cameras that capture the scene over a short period of time, and they are renewed and transmitted at a rate of 25-30 fps (a typical real-time scenario). Digital video processing has to process each frame in order to detect features in the input. In stereo vision, existing algorithms use frames from two digital cameras and process them pixel by pixel until they find a pattern match in a section of both stereo frames. An image-matching process is essential for processing stereo-vision information, but it incurs a very high computational cost. Moreover, the more information is processed, the more time the matching algorithm takes and the less efficient it becomes. Spike-based processing is a relatively new approach that processes information by manipulating spikes one by one at the time they are transmitted, as the human brain does. The mammalian nervous system is able to solve much more complex problems, such as visual recognition, by manipulating neurons' spikes. The spike-based philosophy for visual information processing, based on the neuro-inspired Address-Event Representation (AER), is now achieving very high performance. The aim of this work is to study the viability of a matching mechanism in a stereo-vision system using AER codification. Such a mechanism has not previously been applied to an AER system. To that end, the basics of epipolar geometry as applied to AER systems are studied, and several tests are run on recorded data using a computer. The results and the average error (less than 2 pixels per point) are reported, and the viability of the approach is demonstrated.
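To make the epipolar idea concrete, the sketch below pairs each left-camera address-event with temporally close right-camera events that lie near its epipolar line. The fundamental matrix, time window, and distance threshold are assumed inputs; this mirrors the mechanism studied here rather than reproducing the authors' code.

```python
import numpy as np
from collections import deque

def epipolar_distance(F, left_px, right_px):
    """Distance (pixels) from right_px to the epipolar line of left_px."""
    x = np.array([left_px[0], left_px[1], 1.0])
    a, b, c = F @ x                  # epipolar line a*u + b*v + c = 0
    u, v = right_px
    return abs(a * u + b * v + c) / np.hypot(a, b)

def match_events(left_events, right_events, F, dt_max=1e-3, dist_max=2.0):
    """Greedily pair each left spike (x, y, t) with the temporally closest
    right spike within dt_max seconds and dist_max pixels of the epipolar
    line. Both streams are assumed sorted by timestamp."""
    matches, recent, ri = [], deque(), 0
    for (xl, yl, tl) in left_events:
        # Maintain a sliding window of right spikes around time tl.
        while ri < len(right_events) and right_events[ri][2] <= tl + dt_max:
            recent.append(right_events[ri]); ri += 1
        while recent and recent[0][2] < tl - dt_max:
            recent.popleft()
        candidates = [(abs(tr - tl), (xr, yr, tr))
                      for (xr, yr, tr) in recent
                      if epipolar_distance(F, (xl, yl), (xr, yr)) <= dist_max]
        if candidates:
            matches.append(((xl, yl, tl), min(candidates)[1]))
    return matches
```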
End-to-End Learning of Representations for Asynchronous Event-Based Data
Event cameras are vision sensors that record asynchronous streams of per-pixel brightness changes, referred to as "events". They have appealing advantages over frame-based cameras for computer vision, including high temporal resolution, high dynamic range, and no motion blur. Due to the sparse, non-uniform spatiotemporal layout of the event signal, pattern recognition algorithms typically aggregate events into a grid-based representation and subsequently process it with a standard vision pipeline, e.g., a Convolutional Neural Network (CNN). In this work, we introduce a general framework to convert event streams into grid-based representations through a sequence of differentiable operations. Our framework comes with two main advantages: (i) it allows learning the input event representation together with the task-dedicated network in an end-to-end manner, and (ii) it lays out a taxonomy that unifies the majority of extant event representations in the literature and identifies novel ones. Empirically, we show that our approach to learning the event representation end-to-end yields an improvement of approximately 12% on optical flow estimation and object recognition over state-of-the-art methods. (Comment: To appear at ICCV 2019.)
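As a concrete instance of the family of representations such a framework unifies, the sketch below rasterizes an event stream into a voxel grid using a triangular temporal kernel; because every step is differentiable with respect to the kernel, the kernel could in principle be learned end-to-end. The tensor layout and names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def voxel_grid(events, H, W, B):
    """events: rows of (x, y, t, polarity in {-1, +1}), t ascending."""
    grid = np.zeros((B, H, W), dtype=np.float32)
    x, y, t, p = events.T
    # Normalize timestamps to the bin axis [0, B-1].
    t_norm = (t - t[0]) / max(t[-1] - t[0], 1e-9) * (B - 1)
    for b in range(B):
        # Triangular kernel: each event contributes to its two nearest
        # temporal bins with weight max(0, 1 - |b - t_norm|); a learnable
        # kernel would replace this line in the end-to-end version.
        w = np.maximum(0.0, 1.0 - np.abs(b - t_norm)) * p
        np.add.at(grid[b], (y.astype(int), x.astype(int)), w)
    return grid

# Example: five synthetic events rasterized into a 2-bin, 4x4 grid.
ev = np.array([[0, 0, 0.00, +1],
               [1, 1, 0.25, -1],
               [2, 2, 0.50, +1],
               [3, 3, 0.75, -1],
               [3, 0, 1.00, +1]])
print(voxel_grid(ev, 4, 4, 2))
```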