MapFormer: Boosting Change Detection by Using Pre-change Information
Change detection in remote sensing imagery is essential for a variety of
applications such as urban planning, disaster management, and climate research.
However, existing methods for identifying semantically changed areas overlook
the availability of semantic information in the form of existing maps
describing features of the earth's surface. In this paper, we leverage this
information for change detection in bi-temporal images. We show that the simple
integration of the additional information via concatenation of latent
representations suffices to significantly outperform state-of-the-art change
detection methods. Motivated by this observation, we propose the new task of
Conditional Change Detection, where pre-change semantic information is used as
input next to bi-temporal images. To fully exploit the extra information, we
propose MapFormer, a novel architecture based on a multi-modal feature fusion
module that allows for feature processing conditioned on the available semantic
information. We further employ a supervised, cross-modal contrastive loss to
guide the learning of visual representations. Our approach outperforms existing
change detection methods by an absolute 11.7% and 18.4% in terms of binary
change IoU on DynamicEarthNet and HRSCD, respectively. Furthermore, we
demonstrate the robustness of our approach to the quality of the pre-change
semantic information and the absence of pre-change imagery. The code will be made
publicly available.
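The abstract reports its gains in binary change IoU. As a minimal sketch of how that metric is commonly computed (not the authors' evaluation code), the function below takes two same-shaped masks of 0/1 values and returns the intersection over union of the "changed" class. The function name and the choice to score two empty masks as 1.0 are assumptions made for illustration.

```python
def binary_change_iou(pred, target):
    """IoU of the 'changed' class for two same-shaped 2D masks of 0/1 values."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # changed in both prediction and ground truth
            elif p and not t:
                fp += 1  # predicted change that did not happen
            elif t and not p:
                fn += 1  # real change that was missed
    union = tp + fp + fn
    # Assumption: if neither mask marks any change, treat the match as perfect.
    return 1.0 if union == 0 else tp / union


pred = [[1, 1],
        [0, 0]]
target = [[1, 0],
          [1, 0]]
print(binary_change_iou(pred, target))  # 1 TP, 1 FP, 1 FN -> 1/3
```

Read this way, an "absolute 11.7%" improvement means 11.7 percentage points of IoU rather than a relative gain.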
TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation
LiDAR semantic segmentation plays a crucial role in enabling autonomous
vehicles and robots to understand their surroundings accurately and robustly.
There are different types of methods, such as point-based, range-image-based,
polar-based, and hybrid methods. Among these, range-image-based methods are
widely used due to their efficiency. However, they face a significant challenge
known as the "many-to-one" problem caused by the range image's limited
horizontal and vertical angular resolution. As a result, around 20% of the 3D
points can be occluded. In this paper, we present TFNet, a range-image-based
LiDAR semantic segmentation method that utilizes temporal information to
address this issue. Specifically, we incorporate a temporal fusion layer to
extract useful information from previous scans and integrate it with the
current scan. We then design a max-voting-based post-processing technique to
correct false predictions, particularly those caused by the "many-to-one"
issue. We evaluate the approach on two benchmarks and demonstrate that the
post-processing technique is generic and can be applied to various networks. We
will release our code and models.
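The "many-to-one" problem described above can be illustrated with a standard spherical projection of 3D points onto a range image. The sketch below is not TFNet's code. The image size (64 x 2048) and the vertical field of view (+3 to -25 degrees) are assumed values typical of a 64-beam sensor, and the function names are hypothetical. Points that fall into the same pixel are grouped together. Only one of them can be represented in the image, and the rest are hidden.

```python
import math


def project_points(points, width=2048, height=64,
                   fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map each (x, y, z) point to a (row, col) pixel of a range image.

    Returns a dict: pixel -> list of (range, point_index) that landed there.
    Sensor parameters are illustrative assumptions, not values from the paper.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    pixels = {}
    for idx, (x, y, z) in enumerate(points):
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # a point at the sensor origin has no direction
        yaw = math.atan2(y, x)      # horizontal angle
        pitch = math.asin(z / r)    # vertical angle
        col = int(0.5 * (1.0 - yaw / math.pi) * width)
        row = int((1.0 - (pitch - fov_down) / fov) * height)
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        pixels.setdefault((row, col), []).append((r, idx))
    return pixels


def hidden_point_fraction(points, **kwargs):
    """Fraction of projected points that share a pixel with a kept point."""
    pixels = project_points(points, **kwargs)
    total = sum(len(hits) for hits in pixels.values())
    return 0.0 if total == 0 else 1.0 - len(pixels) / total


# Two points on the same ray collide in one pixel; the third lands elsewhere.
points = [(10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
print(len(project_points(points)))    # 2 occupied pixels for 3 points
print(hidden_point_fraction(points))  # one of three points is hidden
```

A post-processing step such as the max-voting technique mentioned in the abstract has to assign labels to exactly these hidden points, since the network only sees one point per pixel.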
Dynamic information processing states revealed through neurocognitive models of object semantics.
Recognising objects relies on highly dynamic, interactive brain networks to process multiple aspects of object information. Fully understanding how different forms of information about objects are represented and processed in the brain requires a neurocognitive account of visual object recognition that combines a detailed cognitive model of semantic knowledge with a neurobiological model of visual object processing. Here we ask how specific cognitive factors are instantiated in our mental processes and how they dynamically evolve over time. We suggest that coarse semantic information, based on generic shared semantic knowledge, is rapidly extracted from visual inputs and is sufficient to drive rapid category decisions. Subsequent recurrent neural activity between the anterior temporal lobe and posterior fusiform supports the formation of object-specific semantic representations - a conjunctive process primarily driven by the perirhinal cortex. These object-specific representations require the integration of shared and distinguishing object properties and support the unique recognition of objects. We conclude that a valuable way of understanding the cognitive activity of the brain is through testing the relationship between specific cognitive measures and dynamic neural activity. This kind of approach allows us to move towards uncovering the information processing states of the brain and how they evolve over time. This is the final version. It was first published by Taylor and Francis at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4337742/
Multi-scale Spatial-temporal Interaction Network for Video Anomaly Detection
Video anomaly detection (VAD) is an essential yet challenging task in signal
processing. Since certain anomalies cannot be detected by analyzing temporal or
spatial information alone, the interaction between the two types of information is
considered crucial for VAD. However, current dual-stream architectures either
limit interaction between the two types of information to the bottleneck of
the autoencoder or incorporate background pixels irrelevant to anomalies into the
interaction. To this end, we propose a multi-scale spatial-temporal interaction
network (MSTI-Net) for VAD. First, to pay particular attention to objects and
reconcile the significant semantic differences between the two types of information, we
propose an attention-based spatial-temporal fusion module (ASTM) as a
substitute for the conventional direct fusion. Furthermore, we inject multiple
ASTM-based connections between the appearance and motion pathways of a
dual-stream network to facilitate spatial-temporal interaction at all possible
scales. Finally, the regular information learned from multiple scales is
recorded in memory to enhance the differentiation between anomalies and normal
events during the testing phase. Experimental results on three standard
datasets validate the effectiveness of our approach, which achieves AUCs of
96.8% for UCSD Ped2, 87.6% for CUHK Avenue, and 73.9% for the ShanghaiTech
dataset.
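The results above are reported as AUC values. As a minimal sketch of what that number measures (not the authors' evaluation pipeline), the function below computes the area under the ROC curve from per-frame anomaly scores and binary labels using the pairwise-ranking formulation: the fraction of (anomalous, normal) frame pairs in which the anomalous frame receives the higher score, with ties counted as half. The function name and the toy scores are assumptions for illustration.

```python
def frame_level_auc(scores, labels):
    """ROC AUC from anomaly scores and 0/1 labels (1 = anomalous frame)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one anomalous and one normal frame")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # anomalous frame correctly ranked higher
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(frame_level_auc(scores, labels))  # 3 of 4 pairs ranked correctly -> 0.75
```

The quadratic pairwise loop is chosen for readability. For long videos, a sort-based computation gives the same value more efficiently.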
Mechanisms of voice processing in dementia
Perception of nonverbal vocal information is essential in our daily lives. Patients with degenerative dementias commonly have difficulty with such aspects of vocal communication; however, voice processing has seldom been studied in these diseases. This thesis comprises a series of linked studies of voice processing in canonical dementias: Alzheimer’s disease, behavioural variant frontotemporal dementia, semantic dementia and progressive nonfluent aphasia. A series of neuropsychological tests were developed to examine perceptual and semantic stages of voice processing and to assess two aspects of accent processing: comprehension of foreign-accented speech and recognition of regional and foreign accents; patient performance was referenced to healthy control subjects. Neuroanatomical associations of voice processing performance were assessed using voxel-based morphometry. Following a symptom-led approach, a syndrome of progressive associative phonagnosia was characterised in two detailed case studies. Following a disease-led approach, this work was extended systematically to cohorts of patients representing the target diseases and assessing voice processing in relation to other aspects of person recognition (faces and names). This work provided evidence for separable profiles of voice processing impairment in different diseases: associative deficits were particularly severe in semantic dementia, whilst perceptual deficits showed relative specificity for Alzheimer’s disease. Neuroanatomical associations were identified for voice recognition in the right temporal pole and anterior fusiform gyrus, and for voice discrimination in the right inferior parietal lobe. The final phase of this work addressed the neuropsychological and neuroanatomical basis of accent processing, as an important dimension of nonverbal vocal analysis that is not dependent on voice identity.
This work provides evidence for impaired processing of accents in progressive nonfluent aphasia and Alzheimer’s disease, with neuroanatomical associations in the anterior and superior temporal lobe. The thesis contributes new information about voice processing in the degenerative dementias and furthers our understanding of the mechanisms of human voice analysis.