Hierarchical Salient Object Detection for Assisted Grasping
Visual scene decomposition into semantic entities is one of the major
challenges when creating a reliable object grasping system. Recently, we
introduced a bottom-up hierarchical clustering approach which is able to
segment objects and parts in a scene. In this paper, we introduce a transform
from such a segmentation into a corresponding, hierarchical saliency function.
In comprehensive experiments we demonstrate its ability to detect salient
objects in a scene. Furthermore, this hierarchical saliency defines a most
salient corresponding region (scale) for every point in an image. Based on
this, an easy-to-use pick-and-place manipulation system was developed and
tested exemplarily.
Comment: Accepted for ICRA 201
Centering in-the-large: Computing referential discourse segments
We specify an algorithm that builds up a hierarchy of referential discourse
segments from local centering data. The spatial extension and nesting of these
discourse segments constrain the reachability of potential antecedents of an
anaphoric expression beyond the local level of adjacent center pairs. Thus, the
centering model is scaled up to the level of the global referential structure
of discourse. An empirical evaluation of the algorithm is supplied.
Comment: LaTeX, 8 page
A Deep Learning Approach to Denoise Optical Coherence Tomography Images of the Optic Nerve Head
Purpose: To develop a deep learning approach to de-noise optical coherence
tomography (OCT) B-scans of the optic nerve head (ONH).
Methods: Volume scans consisting of 97 horizontal B-scans were acquired
through the center of the ONH using a commercial OCT device (Spectralis) for
both eyes of 20 subjects. For each eye, single-frame (without signal
averaging), and multi-frame (75x signal averaging) volume scans were obtained.
A custom deep learning network was then designed and trained with 2,328 "clean
B-scans" (multi-frame B-scans), and their corresponding "noisy B-scans" (clean
B-scans + Gaussian noise) to de-noise the single-frame B-scans. The performance
of the de-noising algorithm was assessed qualitatively, and quantitatively on
1,552 B-scans using the signal-to-noise ratio (SNR), contrast-to-noise ratio
(CNR), and mean structural similarity index (MSSIM) metrics.
Results: The proposed algorithm successfully denoised unseen single-frame OCT
B-scans. The denoised B-scans were qualitatively similar to their corresponding
multi-frame B-scans, with enhanced visibility of the ONH tissues. The mean SNR
increased from […] dB (single-frame) to […] dB
(denoised). For all the ONH tissues, the mean CNR increased from […]
(single-frame) to […] (denoised). The MSSIM increased from […]
(single-frame) to […] (denoised) when compared with
the corresponding multi-frame B-scans.
Conclusions: Our deep learning algorithm can denoise a single-frame OCT
B-scan of the ONH in under 20 ms, thus offering a framework to obtain superior
quality OCT B-scans with reduced scanning times and minimal patient discomfort.
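The training-pair construction and two of the metrics named in the abstract above can be illustrated with a minimal sketch. The abstract does not give the metric formulas, so the definitions below (SNR as 20·log10 of mean tissue intensity over background standard deviation, CNR as the absolute mean difference over the pooled standard deviation) are common conventions assumed here, not necessarily the authors' exact ones. The function names, the flat pixel lists, and the intensity values are hypothetical.

```python
import math
import random
import statistics

def add_gaussian_noise(pixels, sigma, rng):
    """Return a noisy copy of `pixels` (intensities in [0, 1]), clipped to range."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]

def snr_db(roi, background):
    """Assumed definition: 20*log10(mean ROI intensity / background std)."""
    return 20.0 * math.log10(statistics.fmean(roi) / statistics.pstdev(background))

def cnr(roi, background):
    """Assumed definition: |mean difference| / pooled standard deviation."""
    contrast = abs(statistics.fmean(roi) - statistics.fmean(background))
    pooled = math.sqrt(0.5 * (statistics.pvariance(roi) + statistics.pvariance(background)))
    return contrast / pooled

# Synthetic stand-ins for a bright tissue region and a dark background region.
rng = random.Random(0)
tissue = [0.8] * 2000
background = [0.2] * 2000

# Lightly and heavily corrupted versions of the same "clean" regions.
light = (add_gaussian_noise(tissue, 0.05, rng), add_gaussian_noise(background, 0.05, rng))
heavy = (add_gaussian_noise(tissue, 0.10, rng), add_gaussian_noise(background, 0.10, rng))

print(f"SNR light: {snr_db(*light):.1f} dB, heavy: {snr_db(*heavy):.1f} dB")
print(f"CNR light: {cnr(*light):.1f}, heavy: {cnr(*heavy):.1f}")
```

Under these assumed definitions, both scores drop as the noise level rises, which is the direction of change a denoising method aims to reverse.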
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
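The event stream described in the abstract above (time, location and sign of per-pixel brightness changes) can be approximated with a minimal sketch. It assumes the commonly used idealized model in which a pixel fires an event each time its log-brightness moves by a fixed contrast threshold away from the level at its last event. A real sensor works asynchronously, whereas this sketch derives events from a sequence of frames. The `Event` type, the `events_from_frames` function, and the threshold value are hypothetical names and settings chosen for illustration.

```python
import math
from typing import NamedTuple

class Event(NamedTuple):
    t: float       # timestamp of the frame that triggered the event
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for a brightness increase, -1 for a decrease

def events_from_frames(frames, timestamps, threshold=0.2, eps=1e-3):
    """Emit events whenever a pixel's log-brightness changes by `threshold`."""
    # Per-pixel reference level: log-brightness at the last event (or at start).
    ref = [[math.log(v + eps) for v in row] for row in frames[0]]
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        for y, row in enumerate(frame):
            for x, v in enumerate(row):
                diff = math.log(v + eps) - ref[y][x]
                # A large change produces several events of the same sign.
                while abs(diff) >= threshold:
                    polarity = 1 if diff > 0 else -1
                    events.append(Event(t, x, y, polarity))
                    ref[y][x] += polarity * threshold
                    diff = math.log(v + eps) - ref[y][x]
    return events

# One row, two pixels: the left pixel is static, the right one brightens.
frames = [[[0.5, 0.5]], [[0.5, 1.0]]]
for event in events_from_frames(frames, timestamps=[0.0, 0.001]):
    print(event)
```

Static pixels produce no output at all in this model, which is why the data rate of an event stream depends on scene motion rather than on a fixed frame rate.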