Learning to See the Wood for the Trees: Deep Laser Localization in Urban and Natural Environments on a CPU
Localization in challenging, natural environments such as forests or
woodlands is an important capability for many applications from guiding a robot
navigating along a forest trail to monitoring vegetation growth with handheld
sensors. In this work we explore laser-based localization in both urban and
natural environments, which is suitable for online applications. We propose a
deep learning approach capable of learning meaningful descriptors directly from
3D point clouds by comparing triplets (anchor, positive and negative examples).
The approach learns a feature space representation for a set of segmented point
clouds that are matched between current and previous observations. Our
learning method is tailored towards loop closure detection resulting in a small
model which can be deployed using only a CPU. The proposed learning method
would allow the full pipeline to run on robots with limited computational
payload such as drones, quadrupeds or UGVs.
Comment: Accepted for publication at RA-L/ICRA 2019. More info:
https://ori.ox.ac.uk/esm-localizatio
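The triplet comparison described above (anchor, positive and negative examples) is the standard triplet objective for metric learning. As an illustrative sketch only, not the paper's implementation, the hinge form of that loss on embedding vectors can be written as:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on embedding vectors: pull the positive
    toward the anchor and push the negative at least `margin` farther
    away. The margin value here is an illustrative choice."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```

In the paper's setting the three inputs would be descriptors computed from segmented point clouds; here they are plain vectors for clarity.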
Image-Based Sorghum Head Counting When You Only Look Once
Modern trends in digital agriculture have seen a shift towards artificial intelligence for crop quality assessment and yield estimation. In this work, we document how a parameter-tuned single-shot object detection algorithm can be used to identify and count sorghum heads from aerial drone images. Our approach involves a novel exploratory analysis that identified key structural elements of the sorghum images and motivated the selection of parameter-tuned anchor boxes that contributed significantly to performance. These insights led to the development of a deep learning model that outperformed the baseline model and achieved an out-of-sample mean average precision of 0.95.
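Tuning anchor boxes for a single-shot detector is commonly done by clustering the widths and heights of ground-truth boxes. The following is a minimal sketch of that idea, assuming a plain k-means over (width, height) pairs; it is not the parameter-tuning procedure from the paper, and YOLO-style tuning typically uses 1 - IoU rather than Euclidean distance:

```python
import numpy as np

def tune_anchors(wh, k=3, iters=20, seed=0):
    """Cluster ground-truth box (width, height) pairs to pick k anchor
    shapes for a single-shot detector. `wh` is an (N, 2) array; returns
    k anchor (w, h) centroids."""
    wh = np.asarray(wh, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = wh[rng.choice(len(wh), size=k, replace=False)]
    for _ in range(iters):
        # assign each box to its nearest centroid (Euclidean for brevity)
        d = np.linalg.norm(wh[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned boxes
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = wh[labels == j].mean(axis=0)
    return centroids
```

With two visibly distinct box sizes in the data, the returned centroids settle on the two cluster means, which then serve as the detector's anchor shapes.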
Temporal HeartNet: Towards Human-Level Automatic Analysis of Fetal Cardiac Screening Video
We present an automatic method to describe clinically useful information
about scanning, and to guide image interpretation in ultrasound (US) videos of
the fetal heart. Our method is able to jointly predict the visibility, viewing
plane, location and orientation of the fetal heart at the frame level. The
contributions of the paper are three-fold: (i) a convolutional neural network
architecture is developed for a multi-task prediction, which is computed by
sliding a 3x3 window spatially through convolutional maps. (ii) an anchor
mechanism and Intersection over Union (IoU) loss are applied for improving
localization accuracy. (iii) a recurrent architecture is designed to
recursively compute regional convolutional features temporally over sequential
frames, allowing each prediction to be conditioned on the whole video. This
results in a spatial-temporal model that precisely describes detailed heart
parameters in challenging US videos. We report results on a real-world clinical
dataset, where our method achieves performance on par with expert annotations.
Comment: To appear in MICCAI, 201
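The Intersection over Union (IoU) loss mentioned in contribution (ii) optimizes box overlap directly rather than the individual coordinates. A minimal sketch, assuming axis-aligned (x1, y1, x2, y2) boxes and the -ln(IoU) form popularized by UnitBox (the paper's exact loss may differ):

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def iou_loss(pred, target, eps=1e-9):
    """-ln(IoU) loss: zero when the boxes coincide, grows as overlap
    shrinks; `eps` guards against log(0) for disjoint boxes."""
    return -np.log(max(iou(pred, target), eps))
```

Unlike a per-coordinate L2 loss, this treats the four box coordinates jointly, which is why it tends to improve localization accuracy.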
Support Neighbor Loss for Person Re-Identification
Person re-identification (re-ID) has recently been tremendously boosted due
to the advancement of deep convolutional neural networks (CNN). The majority of
deep re-ID methods focus on designing new CNN architectures, while less
attention is paid on investigating the loss functions. Verification loss and
identification loss are two types of losses widely used to train various deep
re-ID models, both of which however have limitations. Verification loss guides
the networks to generate feature embeddings whose intra-class variance is
decreased while the inter-class variance is enlarged. However, training networks
with verification loss tends to be of slow convergence and unstable performance
when the number of training samples is large. On the other hand, identification
loss offers good separability and scalability, but its failure to explicitly
reduce the intra-class variance limits its performance on re-ID, because the
same person may have significant appearance disparity across different camera
views. To avoid the limitations of the two types of losses, we propose a new
loss, called support neighbor (SN) loss. Rather than being derived from data
sample pairs or triplets, SN loss is calculated based on the positive and
negative support neighbor sets of each anchor sample, which contain more
valuable contextual information and neighborhood structure that are beneficial
for more stable performance. To ensure scalability and separability, a
softmax-like function is formulated to push apart the positive and negative
support sets. To reduce intra-class variance, the distance between the anchor's
nearest positive neighbor and furthest positive sample is penalized.
By integrating SN loss on top of ResNet50, we obtain re-ID results superior to
the state of the art on several widely used datasets.
Comment: Accepted by ACM Multimedia (ACM MM) 201