17,585 research outputs found
3D environment mapping using the Kinect V2 and path planning based on RRT algorithms
This paper describes a 3D path planning system that is able to provide a solution trajectory for the automatic control of a robot. The proposed system uses a point cloud obtained from the robot workspace, with a Kinect V2 sensor to identify the interest regions and the obstacles of the environment. Our proposal includes a collision-free path planner based on the Rapidly-exploring Random Trees variant (RRT*), for a safe and optimal navigation of robots in 3D spaces. Results on RGB-D segmentation and recognition, point cloud processing, and comparisons between different RRT* algorithms, are presented.Peer ReviewedPostprint (published version
Convolutional Color Constancy
Color constancy is the problem of inferring the color of the light that
illuminated a scene, usually so that the illumination color can be removed.
Because this problem is underconstrained, it is often solved by modeling the
statistical regularities of the colors of natural objects and illumination. In
contrast, in this paper we reformulate the problem of color constancy as a 2D
spatial localization task in a log-chrominance space, thereby allowing us to
apply techniques from object detection and structured prediction to the color
constancy problem. By directly learning how to discriminate between correctly
white-balanced images and poorly white-balanced images, our model is able to
improve performance on standard benchmarks by nearly 40%
A Framework for Symmetric Part Detection in Cluttered Scenes
The role of symmetry in computer vision has waxed and waned in importance
during the evolution of the field from its earliest days. At first figuring
prominently in support of bottom-up indexing, it fell out of favor as shape
gave way to appearance and recognition gave way to detection. With a strong
prior in the form of a target object, the role of the weaker priors offered by
perceptual grouping was greatly diminished. However, as the field returns to
the problem of recognition from a large database, the bottom-up recovery of the
parts that make up the objects in a cluttered scene is critical for their
recognition. The medial axis community has long exploited the ubiquitous
regularity of symmetry as a basis for the decomposition of a closed contour
into medial parts. However, today's recognition systems are faced with
cluttered scenes, and the assumption that a closed contour exists, i.e. that
figure-ground segmentation has been solved, renders much of the medial axis
community's work inapplicable. In this article, we review a computational
framework, previously reported in Lee et al. (2013), Levinshtein et al. (2009,
2013), that bridges the representation power of the medial axis and the need to
recover and group an object's parts in a cluttered scene. Our framework is
rooted in the idea that a maximally inscribed disc, the building block of a
medial axis, can be modeled as a compact superpixel in the image. We evaluate
the method on images of cluttered scenes.Comment: 10 pages, 8 figure
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
Fast Graph-Based Object Segmentation for RGB-D Images
Object segmentation is an important capability for robotic systems, in
particular for grasping. We present a graph- based approach for the
segmentation of simple objects from RGB-D images. We are interested in
segmenting objects with large variety in appearance, from lack of texture to
strong textures, for the task of robotic grasping. The algorithm does not rely
on image features or machine learning. We propose a modified Canny edge
detector for extracting robust edges by using depth information and two simple
cost functions for combining color and depth cues. The cost functions are used
to build an undirected graph, which is partitioned using the concept of
internal and external differences between graph regions. The partitioning is
fast with O(NlogN) complexity. We also discuss ways to deal with missing depth
information. We test the approach on different publicly available RGB-D object
datasets, such as the Rutgers APC RGB-D dataset and the RGB-D Object Dataset,
and compare the results with other existing methods
Vision-Based Road Detection in Automotive Systems: A Real-Time Expectation-Driven Approach
The main aim of this work is the development of a vision-based road detection
system fast enough to cope with the difficult real-time constraints imposed by
moving vehicle applications. The hardware platform, a special-purpose massively
parallel system, has been chosen to minimize system production and operational
costs. This paper presents a novel approach to expectation-driven low-level
image segmentation, which can be mapped naturally onto mesh-connected massively
parallel SIMD architectures capable of handling hierarchical data structures.
The input image is assumed to contain a distorted version of a given template;
a multiresolution stretching process is used to reshape the original template
in accordance with the acquired image content, minimizing a potential function.
The distorted template is the process output.Comment: See http://www.jair.org/ for any accompanying file
- …