1,843 research outputs found
Efficient Semantic Segmentation via Self-Attention and Self-Distillation
Lightweight models are pivotal in efficient semantic segmentation, but they often suffer from insufficient context information due to limited convolutions and small receptive fields. To address this problem, we propose a tailored approach to efficient semantic segmentation that leverages two complementary distillation schemes for supplying context information to small networks: 1) a self-attention distillation scheme, which adaptively transfers long-range context knowledge from large teacher networks to small student networks; and 2) a layer-wise context distillation scheme, which transfers structured context from deep layers to shallow layers within student networks to promote semantic consistency in the shallow layers. Extensive experiments on the ADE20K, Cityscapes, and CamVid datasets demonstrate the effectiveness of our proposal.
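The self-attention distillation idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the loss, so everything here is an assumption: attention is taken as a row-wise softmax over dot-product affinities between positions, and the student is penalized by the mean squared difference from the teacher's map. The function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(features):
    """Return an N x N attention map for N position vectors.

    Entry (i, j) is how strongly position i attends to position j,
    computed from dot-product affinities (an assumed formulation).
    """
    return [
        softmax([sum(a * b for a, b in zip(query, key)) for key in features])
        for query in features
    ]


def attention_distillation_loss(teacher_feats, student_feats):
    """Mean squared difference between teacher and student attention maps.

    The maps are N x N regardless of channel width, so teacher and
    student may use different feature dimensions.
    """
    teacher_map = attention_map(teacher_feats)
    student_map = attention_map(student_feats)
    n = len(teacher_map)
    squared = sum(
        (teacher_map[i][j] - student_map[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    return squared / (n * n)
```

Because only the position-to-position maps are compared, a wide teacher and a narrow student can be matched without projecting their channels to a common size.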
Memory Structure and Cognitive Maps
A common way to understand memory structures in the cognitive sciences is as a cognitive map.
Cognitive maps are representational systems organized by dimensions shared with physical space. The
appeal to these maps begins literally: as an account of how spatial information is represented and used
to inform spatial navigation. Invocations of cognitive maps, however, are often more ambitious;
cognitive maps are meant to scale up and provide the basis for our more sophisticated memory
capacities. The extension is not meant to be metaphorical, but the way in which these richer mental
structures are supposed to remain map-like is rarely made explicit. Here we investigate this missing
link, asking: how do cognitive maps represent non-spatial information? We begin with a survey of
foundational work on spatial cognitive maps and then provide a comparative review of alternative,
non-spatial representational structures. We then turn to several cutting-edge projects that are engaged
in the task of scaling up cognitive maps so as to accommodate non-spatial information: first, on the
spatial-isometric approach, encoding content that is non-spatial but in some sense isomorphic to
spatial content; second, on the abstraction approach, encoding content that is an abstraction over
first-order spatial information; and third, on the embedding approach, embedding non-spatial
information within a spatial context, a prominent example being the Method-of-Loci. Putting these
cases alongside one another reveals the variety of options available for building cognitive maps, and the
distinctive limitations of each. We conclude by reflecting on where these results take us in terms of
understanding the place of cognitive maps in memory.
Benchmarking Deep Learning Architectures for Urban Vegetation Points Segmentation
Vegetation is crucial for sustainable and resilient cities, providing various
ecosystem services and supporting human well-being. However, vegetation is under
critical stress with rapid urbanization and expanding infrastructure
footprints. Consequently, mapping of this vegetation is essential in the urban
environment. Recently, deep learning for point cloud semantic segmentation has
shown significant progress. Advanced models attempt to obtain state-of-the-art
performance on benchmark datasets, comprising multiple classes and representing
real world scenarios. However, class specific segmentation with respect to
vegetation points has not been explored. Therefore, selection of a deep
learning model for vegetation points segmentation is ambiguous. To address this
problem, we provide a comprehensive assessment of point-based deep learning
models for semantic segmentation of the vegetation class. We have selected four
representative point-based models, namely PointCNN, KPConv (omni-supervised),
RandLANet and SCFNet. These models are investigated on three different
datasets, specifically Chandigarh, Toronto3D and Kerala, which are
characterized by diverse vegetation, varying scene complexity and
changing per-point features. PointCNN achieves the highest mIoU on the
Chandigarh (93.32%) and Kerala datasets (85.68%) while KPConv (omni-supervised)
provides the highest mIoU on the Toronto3D dataset (91.26%). The paper develops
a deeper insight, hitherto not reported, into the working of these models for
vegetation segmentation and outlines the ingredients that should be included in
a model specifically for vegetation segmentation. This paper is a step towards
the development of a novel architecture for vegetation points segmentation.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
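The mIoU scores quoted in the abstract above follow the standard intersection-over-union definition: for each class, IoU = TP / (TP + FP + FN), and mIoU is the mean over classes. The sketch below shows that calculation on per-point labels. The function names and the toy labels (1 = vegetation, 0 = other) are illustrative assumptions, not taken from the paper.

```python
import math


def per_class_iou(true_labels, pred_labels, num_classes):
    """Return IoU for each class id in range(num_classes).

    A class that appears in neither the ground truth nor the
    predictions gets NaN so it can be skipped when averaging.
    """
    ious = []
    for cls in range(num_classes):
        pairs = list(zip(true_labels, pred_labels))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(true_labels, pred_labels, num_classes):
    """Average the per-class IoU values, ignoring absent classes."""
    ious = per_class_iou(true_labels, pred_labels, num_classes)
    valid = [v for v in ious if not math.isnan(v)]
    return sum(valid) / len(valid)


# Toy example: six points, 1 = vegetation, 0 = everything else.
truth = [1, 1, 1, 0, 0, 0]
prediction = [1, 1, 0, 0, 0, 1]
print(per_class_iou(truth, prediction, 2))  # [0.5, 0.5]
print(mean_iou(truth, prediction, 2))       # 0.5
```

In the toy example each class has two correctly labelled points, one false positive and one false negative, so both IoU values are 2 / 4.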
Weighted Point Cloud Augmentation for Neural Network Training Data Class-Imbalance
Recent developments in the field of deep learning for 3D data have
demonstrated promising potential for end-to-end learning directly from point
clouds. However, many real-world point clouds contain a large class imbalance
that mirrors the imbalance found in nature. For example, a 3D scan
of an urban environment will consist mostly of road and facade, whereas other
objects such as poles will be under-represented. In this paper we address this
issue by employing a weighted augmentation to enlarge classes that contain
fewer points. By mitigating the class imbalance present in the data we
demonstrate that a standard PointNet++ deep neural network can achieve higher
performance at inference on validation data. This was observed as an increase
in F1 score of 19% and 25% on two test benchmark datasets, ScanNet and
Semantic3D respectively, where no class imbalance pre-processing had been
performed. Our networks performed better on both highly-represented and
under-represented classes, which indicates that the network is learning more
robust and meaningful features when the loss function is not overly exposed to
only a few classes.
Comment: 7 pages, 6 figures, submitted for ISPRS Geospatial Week conference
201
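One way to picture the weighted augmentation described in the abstract above is to replicate points of rare classes with small random jitter, using a replication factor that grows as the class gets smaller. The abstract does not state the weighting rule, so the one below (largest class count divided by the class count, capped at `max_factor`) and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def augmentation_factors(labels, max_factor=10):
    """Map each class to how many copies of its points to keep in total.

    Assumed rule: the largest class keeps factor 1, smaller classes get
    floor(largest_count / class_count), capped at max_factor.
    """
    counts = Counter(labels)
    largest = max(counts.values())
    return {cls: min(max_factor, largest // n) for cls, n in counts.items()}


def augment_points(points, labels, factors, jitter=0.01, seed=0):
    """Append jittered copies of points from under-represented classes."""
    rng = random.Random(seed)
    out_points, out_labels = list(points), list(labels)
    for point, label in zip(points, labels):
        # factor 1 means "original only", so add factor - 1 extra copies
        for _ in range(factors[label] - 1):
            out_points.append(tuple(v + rng.gauss(0.0, jitter) for v in point))
            out_labels.append(label)
    return out_points, out_labels


# Toy urban scan: many road and facade points, very few pole points.
labels = ["road"] * 80 + ["facade"] * 60 + ["pole"] * 4
points = [(float(i), 0.0, 0.0) for i in range(len(labels))]
factors = augmentation_factors(labels)
new_points, new_labels = augment_points(points, labels, factors)
print(factors)
print(Counter(new_labels))
```

The cap keeps a very rare class from being dominated by near-duplicate points. Its default of 10 is arbitrary and would need tuning on real data.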
Using layer-wise training for Road Semantic Segmentation in Autonomous Cars
A recently developed application of computer vision is pathfinding in self-driving cars. Semantic scene understanding and semantic segmentation, as subfields of computer vision, are widely used in autonomous driving. Semantic segmentation for pathfinding uses deep learning methods and various large sample datasets to train a proper model. Due to the importance of this task, accurate and robust models should be trained to perform properly in different lighting and weather conditions and in the presence of noisy input data. In this paper, we propose a novel learning method for semantic segmentation called layer-wise training and evaluate it on a lightweight, efficient architecture called an efficient neural network (ENet). The proposed learning method is compared with classic learning approaches in terms of mIoU performance, network robustness to noise, and the possibility of reducing the size of the network, on two RGB image datasets covering on-road (CamVid) and off-road (Freiburg Forest) paths. Using this method partially eliminates the need for transfer learning. It also improves network performance when the input is noisy.