Convolutional Patch Networks with Spatial Prior for Road Detection and Urban Scene Understanding
Classifying single image patches is important in many different applications,
such as road detection or scene understanding. In this paper, we present
convolutional patch networks, which are convolutional networks learned to
distinguish different image patches and which can be used for pixel-wise
labeling. We also show how to incorporate spatial information of the patch as
an input to the network, which allows for learning spatial priors for certain
categories jointly with an appearance model. In particular, we focus on road
detection and urban scene understanding, two application areas where we are
able to achieve state-of-the-art results on the KITTI as well as on the
LabelMeFacade dataset.
Furthermore, our paper offers a guideline for people working in the area and
desperately wandering through all the painstaking details that render training
CNs on image patches extremely difficult.
Comment: VISAPP 2015 paper
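One common way to realize such a spatial prior is to feed the patch's normalized image position into the network alongside its appearance. The sketch below is a hypothetical NumPy helper in that spirit, not the authors' exact implementation: it extracts a patch and appends its normalized center coordinates as two constant extra channels.

```python
import numpy as np

def patch_with_spatial_prior(image, cy, cx, patch_size=28):
    """Extract a patch centered at (cy, cx) and append its normalized
    center coordinates as two constant extra channels, so a network can
    learn a spatial prior jointly with the appearance model.
    Illustrative sketch, not the paper's exact architecture."""
    h, w = image.shape[:2]
    half = patch_size // 2
    patch = image[cy - half:cy + half, cx - half:cx + half]
    if patch.ndim == 2:
        patch = patch[..., None]  # add a channel axis for grayscale input
    # Normalized (x, y) position in [0, 1], broadcast to constant planes.
    x_plane = np.full(patch.shape[:2], cx / w, dtype=np.float32)
    y_plane = np.full(patch.shape[:2], cy / h, dtype=np.float32)
    return np.concatenate([patch.astype(np.float32),
                           x_plane[..., None], y_plane[..., None]], axis=-1)
```

The network then sees position as just two more input channels, so spatial priors for categories such as "road appears in the lower image half" can be learned with ordinary convolutional training.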
Non-parametric spatially constrained local prior for scene parsing on real-world data
Scene parsing aims to recognize the object category of every pixel in scene
images, and it plays a central role in image content understanding and computer
vision applications. However, accurate scene parsing from unconstrained
real-world data is still a challenging task. In this paper, we present the
non-parametric Spatially Constrained Local Prior (SCLP) for scene parsing on
realistic data. For a given query image, the non-parametric SCLP is learnt by
first retrieving a subset of most similar training images to the query image
and then collecting prior information about object co-occurrence statistics
between spatial image blocks and between adjacent superpixels from the
retrieved subset. The SCLP is powerful in capturing both long- and short-range
context about inter-object correlations in the query image and can be
effectively integrated with traditional visual features to refine the
classification results. Our experiments on the SIFT Flow and PASCAL-Context
benchmark datasets show that the non-parametric SCLP, used in conjunction with
superpixel-level visual features, achieves performance on par with the best
state-of-the-art approaches.
Comment: 10 pages, journal
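The two-level prior described above can be sketched roughly as follows. This is a minimal NumPy illustration under assumed inputs (per-superpixel classifier scores, a spatial-block class prior, and an adjacent-class co-occurrence matrix); the function name and weighting are hypothetical, not the paper's exact model.

```python
import numpy as np

def refine_with_sclp(visual_scores, block_prior, cooc, neighbors):
    """Refine per-superpixel class scores with a Spatially Constrained
    Local Prior: long-range context from spatial image blocks and
    short-range context from adjacent-superpixel co-occurrence.

    visual_scores: (n_sp, n_cls) classifier scores per superpixel
    block_prior:   (n_sp, n_cls) class prior of each superpixel's block
    cooc:          (n_cls, n_cls) co-occurrence stats of adjacent classes
    neighbors:     list of neighbor-index lists per superpixel
    """
    n_sp, n_cls = visual_scores.shape
    refined = visual_scores * block_prior          # long-range context
    for i in range(n_sp):
        # Short-range context: co-occurrence with each neighbor's
        # currently most likely label.
        ctx = np.ones(n_cls)
        for j in neighbors[i]:
            ctx *= cooc[:, np.argmax(visual_scores[j])]
        refined[i] *= ctx
    refined /= refined.sum(axis=1, keepdims=True)  # renormalize per superpixel
    return refined
```

With uniform priors the visual scores pass through unchanged, while informative block or adjacency statistics reweight unlikely labels downward.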
Release of cognitive and multimodal MRI data including real-world tasks and hippocampal subfield segmentations
We share data from N = 217 healthy adults (mean age 29 years, range 20-41; 109 females, 108 males) who underwent extensive cognitive assessment and neuroimaging to examine the neural basis of individual differences, with a particular focus on a brain structure called the hippocampus. Cognitive data were collected using a wide array of questionnaires, naturalistic tests that examined imagination, autobiographical memory recall and spatial navigation, traditional laboratory-based tests such as recalling word pairs, and comprehensive characterisation of the strategies used to perform the cognitive tests. 3 Tesla MRI data were also acquired and include multi-parameter mapping to examine tissue microstructure, diffusion-weighted MRI, T2-weighted high-resolution partial volume structural MRI scans (with the masks of hippocampal subfields manually segmented from these scans), whole brain resting state functional MRI scans and partial volume high resolution resting state functional MRI scans. This rich dataset will be of value to cognitive and clinical neuroscientists researching individual differences, real-world cognition, brain-behaviour associations, hippocampal subfields and more. All data are freely available on Dryad.
SkipcrossNets: Adaptive Skip-cross Fusion for Road Detection
Multi-modal fusion is increasingly being used for autonomous driving tasks,
as images from different modalities provide unique information for feature
extraction. However, existing two-stream networks fuse features only at a
specific network layer, and finding that layer requires extensive manual
trials. As the CNN deepens, the features of the two modalities become
increasingly abstract, so fusion occurs between feature levels separated by a
large gap, which can easily hurt performance. In this study, we propose a
novel fusion architecture called skip-cross networks (SkipcrossNets), which
adaptively combines LiDAR point clouds and camera images without being bound
to a specific fusion stage. Specifically, skip-cross connects each layer to each
layer in a feed-forward manner: each layer receives the feature maps of all
previous layers of the other modality as input, and passes its own feature
maps to all subsequent layers of the other modality, enhancing feature
propagation and multi-modal feature fusion. This strategy facilitates selection of the most
similar feature layers from two data pipelines, providing a complementary
effect for sparse point cloud features during fusion processes. The network is
also divided into several blocks to reduce the complexity of feature fusion and
the number of model parameters. The advantages of skip-cross fusion were
demonstrated through application to the KITTI and A2D2 datasets, achieving a
MaxF score of 96.85% on KITTI and an F1 score of 84.84% on A2D2. The model
requires only 2.33 MB of memory for its parameters and runs at 68.24 FPS,
making it viable for mobile terminals and embedded devices.
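The dense cross-modal connectivity described in the abstract can be sketched as follows. This is a toy NumPy illustration under stated assumptions: random linear layers stand in for the paper's convolutional blocks, and feature maps are flattened vectors; the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_layer(x, out_dim):
    """Toy stand-in for a conv block: random linear map + ReLU.
    Purely illustrative; the actual network uses convolutional blocks."""
    w = rng.standard_normal((x.shape[-1], out_dim)) * 0.1
    return np.maximum(x @ w, 0.0)

def skip_cross_block(feat_a, feat_b, n_layers=3, dim=8):
    """Skip-cross fusion sketch: at each layer, one modality's input is
    its own previous output concatenated with the feature maps of ALL
    previous layers of the other modality (dense feed-forward
    cross-connections), mirroring the scheme described above."""
    hist_a, hist_b = [feat_a], [feat_b]
    for _ in range(n_layers):
        # Build both inputs before appending, so each side sees only
        # the other modality's *previous* layers.
        in_a = np.concatenate([hist_a[-1]] + hist_b, axis=-1)
        in_b = np.concatenate([hist_b[-1]] + hist_a, axis=-1)
        hist_a.append(dense_layer(in_a, dim))
        hist_b.append(dense_layer(in_b, dim))
    return hist_a[-1], hist_b[-1]
```

Because every earlier layer of the opposite stream feeds every later layer, the network can effectively pick whichever cross-modal feature level matches best, rather than committing to a single hand-chosen fusion point.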