DS-SLAM: A Semantic Visual SLAM towards Dynamic Environments
Simultaneous Localization and Mapping (SLAM) is considered to be a
fundamental capability for intelligent mobile robots. Over the past decades,
many impressive SLAM systems have been developed and achieved good performance
under certain circumstances. However, some problems are still not well solved,
for example, how to handle moving objects in dynamic environments and how
to make robots truly understand their surroundings and accomplish advanced
tasks. In this paper, a robust semantic visual SLAM towards dynamic
environments named DS-SLAM is proposed. Five threads run in parallel in
DS-SLAM: tracking, semantic segmentation, local mapping, loop closing, and
dense semantic map creation. DS-SLAM combines a semantic segmentation network
with a moving consistency check method to reduce the impact of dynamic objects,
and thus the localization accuracy is highly improved in dynamic environments.
Meanwhile, a dense semantic octo-tree map is produced, which could be employed
for high-level tasks. We conduct experiments both on the TUM RGB-D dataset and in
a real-world environment. The results demonstrate that the absolute trajectory
accuracy of DS-SLAM can be improved by one order of magnitude compared with
ORB-SLAM2. It is one of the state-of-the-art SLAM systems in high-dynamic
environments. Now the code is available at our GitHub:
https://github.com/ivipsourcecode/DS-SLAM
Comment: 7 pages, accepted at the 2018 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS 2018).
Segmentation-Aware Convolutional Networks Using Local Attention Masks
We introduce an approach to integrate segmentation information within a
convolutional neural network (CNN). This counteracts the tendency of CNNs to
smooth information across regions and increases their spatial precision. To
obtain segmentation information, we set up a CNN to provide an embedding space
where region co-membership can be estimated based on Euclidean distance. We use
these embeddings to compute a local attention mask relative to every neuron
position. We incorporate such masks in CNNs and replace the convolution
operation with a "segmentation-aware" variant that allows a neuron to
selectively attend to inputs coming from its own region. We call the resulting
network a segmentation-aware CNN because it adapts its filters at each image
point according to local segmentation cues. We demonstrate the merit of our
method on two widely different dense prediction tasks that involve
classification (semantic segmentation) and regression (optical flow). Our
results show that in semantic segmentation we can match the performance of
DenseCRFs while being faster and simpler, and in optical flow we obtain clearly
sharper responses than networks that do not use local attention masks. In both
cases, segmentation-aware convolution yields systematic improvements over
strong baselines. Source code for this work is available online at
http://cs.cmu.edu/~aharley/segaware
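The masking idea in the abstract above can be illustrated with a minimal 1-D sketch. It is not the authors' implementation: the exponential mask `exp(-lam * distance)`, the normalization by the mask sum, and the names `local_attention_mask`, `segmentation_aware_average`, `radius`, and `lam` are all assumptions made for illustration, and a plain box filter stands in for learned convolution weights.

```python
import math

def local_attention_mask(embeddings, i, radius, lam=1.0):
    """Soft mask over the neighbours of position i.

    Assumed form: exp(-lam * Euclidean distance between embeddings), so
    neighbours in the same region (similar embeddings) get weights near 1.
    """
    lo = max(0, i - radius)
    hi = min(len(embeddings), i + radius + 1)
    return {j: math.exp(-lam * math.dist(embeddings[i], embeddings[j]))
            for j in range(lo, hi)}

def segmentation_aware_average(signal, embeddings, radius=1, lam=1.0):
    """Box filter whose taps are reweighted by the local attention mask."""
    out = []
    for i in range(len(signal)):
        mask = local_attention_mask(embeddings, i, radius, lam)
        total = sum(mask.values())  # always >= 1, since mask[i] == 1
        out.append(sum(signal[j] * m for j, m in mask.items()) / total)
    return out

# A step edge: the embeddings separate the two regions.
signal = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
embeddings = [(0.0,), (0.0,), (0.0,), (5.0,), (5.0,), (5.0,)]
masked = segmentation_aware_average(signal, embeddings, radius=1, lam=10.0)
plain = segmentation_aware_average(signal, embeddings, radius=1, lam=0.0)
```

With `lam=0.0` every mask weight is 1 and the filter blurs across the edge. With a large `lam` the contributions from the other region are suppressed and the edge stays sharp, which is the smoothing-across-regions effect the abstract says the method counteracts.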
Multi-Scale 3D Scene Flow from Binocular Stereo Sequences
Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization, two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach.
National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108
A multi-view approach to cDNA micro-array analysis
Microarray technology has emerged as a powerful tool that enables biologists to study thousands of genes simultaneously and thereby gain a better understanding of gene interaction and regulation mechanisms. This paper is concerned with improving the processes involved in the analysis of microarray image data. The main focus is to clarify an image's feature space in an unsupervised manner. In this paper, the Image Transformation Engine (ITE), combined with different filters, is investigated. The proposed methods are applied to a set of real-world cDNA images. The MatCNN toolbox is used during the segmentation process. Quantitative comparisons between different filters are carried out. It is shown that the CLD filter is the best one to be applied with the ITE.
This work was supported in part by the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grant GR/S27658/01, the National Science Foundation of China under Innovative Grant 70621001, Chinese Academy of Sciences under Innovative Group Overseas Partnership Grant, the BHP Billiton Cooperation of Australia Grant, the International Science and Technology Cooperation Project of China under Grant 2009DFA32050 and the Alexander von Humboldt Foundation of Germany
Efficient SDP Inference for Fully-connected CRFs Based on Low-rank Decomposition
Conditional Random Fields (CRF) have been widely used in a variety of
computer vision tasks. Conventional CRFs typically define edges on neighboring
image pixels, resulting in a sparse graph such that efficient inference can be
performed. However, these CRFs fail to model long-range contextual
relationships. Fully-connected CRFs have thus been proposed. While there are
efficient approximate inference methods for such CRFs, usually they are
sensitive to initialization and make strong assumptions. In this work, we
develop an efficient, yet general algorithm for inference on fully-connected
CRFs. The algorithm is based on a scalable SDP algorithm and the low-rank
approximation of the similarity/kernel matrix. The core of the proposed
algorithm is a tailored quasi-Newton method that takes advantage of the
low-rank matrix approximation when solving the specialized SDP dual problem.
Experiments demonstrate that our method can be applied to fully-connected CRFs
that could not be solved previously, such as pixel-level image co-segmentation.
Comment: 15 pages. A conference version of this work appears in Proc. IEEE
Conference on Computer Vision and Pattern Recognition, 201
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
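As a rough illustration of the event stream described in the abstract above (each event encodes a time, a pixel location, and the sign of the brightness change), the sketch below accumulates events from a time window into a signed count image. This is only one simple way to view such data. The `Event` type, its field names, and the `accumulate` helper are hypothetical and are not taken from the survey or from any sensor SDK.

```python
from typing import NamedTuple

class Event(NamedTuple):
    t: float       # timestamp in seconds (assumed unit)
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for a brightness increase, -1 for a decrease

def accumulate(events, width, height, t_start, t_end):
    """Sum event polarities per pixel over the window [t_start, t_end)."""
    frame = [[0] * width for _ in range(height)]
    for ev in events:
        if t_start <= ev.t < t_end:
            frame[ev.y][ev.x] += ev.polarity
    return frame

# Events arrive asynchronously, a few microseconds apart.
events = [
    Event(1e-6, 0, 0, +1),
    Event(2e-6, 0, 0, +1),
    Event(3e-6, 1, 0, -1),
    Event(5e-3, 1, 1, +1),  # falls outside the 1 ms window below
]
frame = accumulate(events, width=2, height=2, t_start=0.0, t_end=1e-3)
```

Collapsing events into a frame discards the fine timing that makes these sensors attractive, so this is best read as a visualization aid rather than a processing method.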
SegICP: Integrated Deep Semantic Segmentation and Pose Estimation
Recent robotic manipulation competitions have highlighted that sophisticated
robots still struggle to achieve fast and reliable perception of task-relevant
objects in complex, realistic scenarios. To improve these systems' perceptive
speed and robustness, we present SegICP, a novel integrated solution to object
recognition and pose estimation. SegICP couples convolutional neural networks
and multi-hypothesis point cloud registration to achieve both robust pixel-wise
semantic segmentation as well as accurate and real-time 6-DOF pose estimation
for relevant objects. Our architecture achieves 1 cm position error and
$<5^\circ$ angle error in real time without an initial seed. We evaluate and
benchmark SegICP against an annotated dataset generated by motion capture.
Comment: IROS camera-read
Automated detection of extended sources in radio maps: progress from the SCORPIO survey
Automated source extraction and parameterization represents a crucial
challenge for the next-generation radio interferometer surveys, such as those
performed with the Square Kilometre Array (SKA) and its precursors. In this
paper we present a new algorithm, dubbed CAESAR (Compact And Extended Source
Automated Recognition), to detect and parameterize extended sources in radio
interferometric maps. It is based on a pre-filtering stage, allowing image
denoising, compact source suppression and enhancement of diffuse emission,
followed by an adaptive superpixel clustering stage for final source
segmentation. A parameterization stage provides source flux information and a
wide range of morphology estimators for post-processing analysis. We developed
CAESAR in a modular software library, including also different methods for
local background estimation and image filtering, along with alternative
algorithms for both compact and diffuse source extraction. The method was
applied to real radio continuum data collected at the Australia Telescope
Compact Array (ATCA) within the SCORPIO project, a pathfinder of the ASKAP-EMU
survey. The source reconstruction capabilities were studied over different test
fields in the presence of compact sources, imaging artefacts and diffuse
emission from the Galactic plane and compared with existing algorithms. When
compared to a human-driven analysis, the designed algorithm was found capable
of detecting known target sources and regions of diffuse emission,
outperforming alternative approaches over the considered fields.
Comment: 15 pages, 9 figure