Determination of parking space and its concurrent usage over time using semantically segmented mobile mapping data
Public space is a scarce good in cities. There are many concurrent usages, which makes an adequate allocation of space both difficult and highly attractive. A lot of space is taken up by parked cars, even if the parking spaces are not occupied by cars all the time. In this work, we analyze the space demand and usage of parked cars in order to evaluate when this space could be used for other purposes. The analysis is based on 3D point clouds acquired at several times during a day. We propose a processing pipeline to extract car bounding boxes from a given 3D point cloud. For the car extraction, we utilize a label transfer technique that transfers labels from semantically segmented 2D RGB images to 3D point cloud data. This semantically segmented 3D data allows us to identify car instances. Subsequently, we aggregate and analyze information about parked cars. We present an exemplary analysis of an urban area in which we extracted 15,000 cars at five different points in time. Based on this aggregated information, we present analytical results for time-dependent parking behavior, parking space availability and utilization.
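The label transfer step described above can be illustrated with a minimal sketch. It assumes a simple pinhole camera model with the points already expressed in the camera frame; the function names, the intrinsics (`fx`, `fy`, `cx`, `cy`) and the toy data are hypothetical and are not taken from the paper, whose actual pipeline may handle occlusion, calibration and instance separation differently.

```python
def transfer_labels(points, label_image, fx, fy, cx, cy):
    """Give each 3D point the label of the pixel it projects to.

    Assumes points are (x, y, z) in the camera frame with z pointing forward.
    Points behind the camera or outside the image get the label None.
    """
    height = len(label_image)
    width = len(label_image[0])
    labels = []
    for x, y, z in points:
        if z <= 0:
            labels.append(None)  # behind the camera
            continue
        # Pinhole projection to pixel coordinates (column u, row v).
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            labels.append(label_image[v][u])
        else:
            labels.append(None)  # projects outside the image
    return labels


def bounding_box(points):
    """Axis-aligned bounding box as (min corner, max corner)."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# Toy 2x4 segmentation result and four 3D points (hypothetical data).
label_image = [
    ["road", "road", "car", "car"],
    ["road", "road", "car", "car"],
]
points = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (-5.0, 0.0, 2.0), (0.0, 0.0, -1.0)]

labels = transfer_labels(points, label_image, fx=2.0, fy=2.0, cx=1.0, cy=0.0)
car_points = [p for p, label in zip(points, labels) if label == "car"]
car_box = bounding_box(car_points)
```

In this sketch all points labeled "car" end up in a single box; separating individual car instances would need an additional clustering step that is not shown here.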
Real-time Semantic Segmentation with Context Aggregation Network
With the increasing demand for autonomous systems, pixelwise semantic
segmentation for visual scene understanding needs to be not only accurate but
also efficient for potential real-time applications. In this paper, we propose
Context Aggregation Network, a dual branch convolutional neural network, with
significantly lower computational costs as compared to the state-of-the-art,
while maintaining a competitive prediction accuracy. Building upon the existing
dual branch architectures for high-speed semantic segmentation, we design a
cheap high-resolution branch for effective spatial detailing and a context
branch with light-weight versions of global aggregation and local distribution
blocks that can capture both long-range and local contextual dependencies
required for accurate semantic segmentation, with low computational overheads.
We evaluate our method on two semantic segmentation datasets, namely the
Cityscapes dataset and the UAVid dataset. On the Cityscapes test set, our model
achieves state-of-the-art results with an mIOU of 75.9%, at 76 FPS on an NVIDIA
RTX 2080Ti and 8 FPS on a Jetson Xavier NX. With regard to the UAVid dataset,
our proposed network achieves an mIOU score of 63.5% with high execution speed
(15 FPS).
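The mIOU figures above refer to mean intersection over union. As a minimal sketch of how this metric is commonly computed from flattened label lists (the function name and the toy labels are illustrative assumptions, and benchmark-specific rules such as ignored classes are left out):

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection over union across classes that occur at least once.

    y_true and y_pred are flat sequences of integer class ids per pixel.
    """
    # conf[t][p] counts pixels of true class t predicted as class p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(conf[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: class 0 has IoU 1/2, class 1 has IoU 2/3.
miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```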
Using ROC and Unlabeled Data for Increasing Low-Shot Transfer Learning Classification Accuracy
One of the most important characteristics of human visual intelligence is the
ability to identify unknown objects. The capability to distinguish between a
substance of which the human mind has no previous experience and a familiar
object is innate to every human. In everyday life, within seconds of seeing an
"unknown" object, we are able to categorize it as such without any substantial
effort. Convolutional Neural Networks, regardless of how they are trained (i.e.
in a conventional manner or through transfer learning) can recognize only the
classes that they are trained for. When using them for classification, any
candidate image will be placed in one of the available classes. We propose a
low-shot classifier which can serve as the top layer of any existing CNN whose
feature extractor has already been trained. Using a limited amount of labeled
data for the type of images which need to be specifically classified along with
unlabeled data for all other images, a unique target matrix and a Receiver
Operating Characteristic (ROC) criterion, we are able to increase identification
accuracy by up to 30% for the images that do not belong to any specific classes,
while retaining the ability to identify images that belong to the specific
classes of interest.
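One way an ROC-based criterion can separate known from unknown images is to pick a confidence threshold from the ROC curve. The sketch below is an assumption for illustration only: it uses Youden's J statistic (TPR minus FPR), which the abstract does not name, and the score lists and function names are hypothetical rather than the authors' method.

```python
def roc_points(scores_known, scores_unknown):
    """Return (threshold, TPR, FPR) for every candidate threshold.

    A sample is accepted as "known" when its score is >= threshold.
    """
    thresholds = sorted(set(scores_known) | set(scores_unknown))
    points = []
    for t in thresholds:
        tpr = sum(s >= t for s in scores_known) / len(scores_known)
        fpr = sum(s >= t for s in scores_unknown) / len(scores_unknown)
        points.append((t, tpr, fpr))
    return points


def best_threshold(scores_known, scores_unknown):
    """Threshold maximizing Youden's J = TPR - FPR (an assumed criterion)."""
    t, _, _ = max(roc_points(scores_known, scores_unknown),
                  key=lambda p: p[1] - p[2])
    return t


# Hypothetical classifier confidences for images of known and unknown classes.
known = [0.9, 0.8, 0.7, 0.65, 0.4]
unknown = [0.6, 0.3, 0.2, 0.1]

threshold = best_threshold(known, unknown)
decisions = ["known" if s >= threshold else "unknown" for s in (0.85, 0.5)]
```

Images scoring below the chosen threshold are rejected as unknown instead of being forced into one of the trained classes.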
HALSIE - Hybrid Approach to Learning Segmentation by Simultaneously Exploiting Image and Event Modalities
Standard frame-based algorithms fail to retrieve accurate segmentation maps
in challenging real-time applications like autonomous navigation, owing to the
limited dynamic range and motion blur prevalent in traditional cameras. Event
cameras address these limitations by asynchronously detecting changes in
per-pixel intensity to generate event streams with high temporal resolution,
high dynamic range, and no motion blur. However, event camera outputs cannot be
directly used to generate reliable segmentation maps as they only capture
information at the pixels in motion. To augment the missing contextual
information, we postulate that fusing spatially dense frames with temporally
dense events can generate semantic maps with fine-grained predictions. To this
end, we propose HALSIE, a hybrid approach to learning segmentation by
simultaneously leveraging image and event modalities. To enable efficient
learning across modalities, our proposed hybrid framework comprises two input
branches, a Spiking Neural Network (SNN) branch and a standard Artificial
Neural Network (ANN) branch to process event and frame data respectively, while
exploiting their corresponding neural dynamics. Our hybrid network outperforms
the state-of-the-art semantic segmentation benchmarks on DDD17 and MVSEC
datasets and shows comparable performance on the DSEC-Semantic dataset with
up to 33.23× reduction in network parameters. Further, our method shows
up to 18.92× improvement in inference cost compared to existing SOTA
approaches, making it suitable for resource-constrained edge applications.
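The observation that event cameras only report pixels whose intensity changes can be illustrated with a simplified sketch. It approximates event generation by comparing two frames in log intensity; the function name, the contrast threshold and the toy frames are assumptions, and a real sensor fires asynchronously per pixel with timestamps rather than between frames.

```python
import math


def generate_events(prev_frame, next_frame, threshold=0.2, eps=1e-6):
    """Emit (x, y, polarity) for pixels whose log intensity changed enough.

    Polarity is +1 for a brightness increase and -1 for a decrease.
    """
    events = []
    for y, (row_prev, row_next) in enumerate(zip(prev_frame, next_frame)):
        for x, (a, b) in enumerate(zip(row_prev, row_next)):
            delta = math.log(b + eps) - math.log(a + eps)
            if delta >= threshold:
                events.append((x, y, 1))
            elif delta <= -threshold:
                events.append((x, y, -1))
            # Unchanged pixels produce no event at all.
    return events


# Two hypothetical 2x2 grayscale frames: one pixel brightens, one darkens.
prev_frame = [[100, 100], [100, 100]]
next_frame = [[100, 150], [50, 100]]

events = generate_events(prev_frame, next_frame)
```

The two static pixels yield no output, which is why event data alone lacks the dense context that frames provide.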