Fast LIDAR-based Road Detection Using Fully Convolutional Neural Networks
In this work, a deep learning approach has been developed to carry out road
detection using only LIDAR data. Starting from an unstructured point cloud,
top-view images encoding several basic statistics such as mean elevation and
density are generated. By considering a top-view representation, road detection
is reduced to a single-scale problem that can be addressed with a simple and
fast fully convolutional neural network (FCN). The FCN is specifically designed
for the task of pixel-wise semantic segmentation by combining a large receptive
field with high-resolution feature maps. The proposed system achieves excellent performance and is among the top-performing algorithms on the KITTI road benchmark. Its fast inference makes it particularly suitable for real-time applications.
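A minimal sketch of the top-view encoding step described above, assuming the point cloud arrives as an (N, 3) NumPy array; the grid extents, 0.1 m resolution, and function name are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def lidar_to_topview(points, x_range=(6.0, 46.0), y_range=(-10.0, 10.0), res=0.1):
    """Grid an (N, 3) point cloud into top-view mean-elevation and
    density images (ranges and resolution are illustrative)."""
    # Keep points inside the region of interest
    m = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
         (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[m]
    h = int(round((x_range[1] - x_range[0]) / res))
    w = int(round((y_range[1] - y_range[0]) / res))
    # Map metric coordinates to flat grid-cell indices
    rows = ((pts[:, 0] - x_range[0]) / res).astype(int)
    cols = ((pts[:, 1] - y_range[0]) / res).astype(int)
    flat = rows * w + cols
    # Per-cell point count and summed elevation
    density = np.bincount(flat, minlength=h * w).astype(np.float64)
    z_sum = np.bincount(flat, weights=pts[:, 2], minlength=h * w)
    # Mean elevation per cell, zero where the cell is empty
    mean_elev = np.divide(z_sum, density, out=np.zeros_like(z_sum),
                          where=density > 0)
    return mean_elev.reshape(h, w), density.reshape(h, w)
```

Images like these can then be fed to a standard FCN for pixel-wise segmentation, since the top view removes the scale variation present in the camera perspective.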
Multi-Modal Trip Hazard Affordance Detection On Construction Sites
Trip hazards are a significant contributor to accidents on construction and
manufacturing sites, where over a third of Australian workplace injuries occur
[1]. Current safety inspections are labour intensive and limited by human
fallibility, making automation of trip hazard detection appealing from both a safety and an economic perspective. Trip hazards present an interesting challenge to modern learning techniques because they are defined as much by affordance as by object type; for example, wires on a table are not a trip hazard, but can be if lying on the ground. To address these challenges, we conduct a comprehensive investigation into the performance characteristics of 11 different colour and depth processing approaches, including four fusion and one non-fusion approach, using colour and two types of depth images. Trained and tested on over 600 labelled trip hazards across 4 floors and 2,000 m² of an active construction site, this approach was able to differentiate between identical objects in different physical configurations (see Figure 1). Outperforming a colour-only detector, our multi-modal trip detector fuses colour and depth information to achieve a 4% absolute improvement in F1-score. These investigative results and the extensive publicly available dataset move us one step closer to assistive or fully automated safety inspection systems on construction sites.
Comment: 9 pages, 12 figures, 2 tables. Accepted to Robotics and Automation Letters (RA-L).
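The abstract does not spell out the fusion pipelines, but the two standard regimes it alludes to can be sketched minimally; `early_fusion`, `late_fusion`, and the mixing weight `w` are hypothetical names and parameters, not the paper's:

```python
import numpy as np

def early_fusion(rgb, depth):
    """Early fusion: stack colour and depth into one multi-channel input.

    rgb:   (H, W, 3) colour image
    depth: (H, W)    depth image
    """
    return np.dstack([rgb, depth[..., None]])  # (H, W, 4) network input

def late_fusion(p_rgb, p_depth, w=0.5):
    """Late fusion: mix per-pixel hazard probabilities from two
    single-modality detectors (w is an assumed mixing weight)."""
    return w * p_rgb + (1.0 - w) * p_depth
```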
Aerial Vehicle Tracking by Adaptive Fusion of Hyperspectral Likelihood Maps
Hyperspectral cameras can provide unique spectral signatures for consistently
distinguishing materials that can be used to solve surveillance tasks. In this
paper, we propose a novel real-time hyperspectral likelihood maps-aided
tracking method (HLT) inspired by an adaptive hyperspectral sensor. A moving
object tracking system generally consists of registration, object detection,
and tracking modules. We focus on the target detection part and remove the need to build any offline classifiers or to tune a large number of hyperparameters, instead learning a generative target model in an online manner for hyperspectral channels ranging from visible to infrared wavelengths. The key idea is that our adaptive fusion method can combine likelihood maps from multiple bands of hyperspectral imagery into a single, more distinctive representation, increasing the margin between the mean values of foreground and background pixels in the fused map. Experimental results show that the HLT not only outperforms all established fusion methods but is also on par with current state-of-the-art hyperspectral target tracking frameworks.
Comment: Accepted at the International Conference on Computer Vision and Pattern Recognition Workshops, 2017.
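A hedged illustration of band-wise likelihood fusion: a simple margin-weighting heuristic stands in for the paper's adaptive rule, which is not reproduced here, and `fuse_likelihood_maps` with its arguments are assumed names:

```python
import numpy as np

def fuse_likelihood_maps(maps, fg_mask):
    """Fuse per-band likelihood maps into one distinctive map.

    maps:    (B, H, W) likelihood map per hyperspectral band
    fg_mask: (H, W) boolean mask of the current target region
             (assumed non-empty)
    """
    # Weight each band by how far it pushes foreground above background
    margins = np.array([m[fg_mask].mean() - m[~fg_mask].mean() for m in maps])
    weights = np.clip(margins, 0.0, None)
    weights = weights / (weights.sum() + 1e-8)  # normalise, guard empty case
    # Weighted sum over the band axis -> single (H, W) fused map
    return np.tensordot(weights, maps, axes=1)
```

Bands that separate the target well dominate the fused map, which is the intuition behind increasing the foreground/background margin described above.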
K-Space at TRECVid 2007
In this paper we describe K-Space participation in TRECVid 2007. K-Space participated in two tasks: high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept detectors (such as face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches, including logistic regression and support vector machines (SVM). Finally, we also experimented with both early and late fusion for feature combination (a fusion sketch follows this abstract). This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance. The first of the two systems was a "shot"-based interface, where the results from a query were presented as a ranked list of shots. The second interface was "broadcast"-based, where results were presented as a ranked list of broadcasts. Both systems made use of the outputs of our high-level feature submission as well as low-level visual features.
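A rough scikit-learn sketch of the early-versus-late fusion distinction mentioned above; the feature arrays, classifier choices, and weight `w` are illustrative assumptions rather than the K-Space configuration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

def train_late_fusion(visual_feats, audio_feats, labels):
    """Late fusion: train one classifier per modality, then mix their
    probability outputs. Early fusion would instead train a single
    classifier on np.hstack([visual_feats, audio_feats])."""
    clf_v = SVC(probability=True).fit(visual_feats, labels)
    clf_a = LogisticRegression(max_iter=1000).fit(audio_feats, labels)

    def predict(v, a, w=0.5):
        # Weighted average of per-modality concept probabilities
        return (w * clf_v.predict_proba(v)[:, 1]
                + (1 - w) * clf_a.predict_proba(a)[:, 1])

    return predict
```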
- …