    MapFormer: Boosting Change Detection by Using Pre-change Information

    Change detection in remote sensing imagery is essential for a variety of applications such as urban planning, disaster management, and climate research. However, existing methods for identifying semantically changed areas overlook the availability of semantic information in the form of existing maps describing features of the earth's surface. In this paper, we leverage this information for change detection in bi-temporal images. We show that the simple integration of the additional information via concatenation of latent representations suffices to significantly outperform state-of-the-art change detection methods. Motivated by this observation, we propose the new task of Conditional Change Detection, where pre-change semantic information is used as input alongside bi-temporal images. To fully exploit the extra information, we propose MapFormer, a novel architecture based on a multi-modal feature fusion module that allows for feature processing conditioned on the available semantic information. We further employ a supervised, cross-modal contrastive loss to guide the learning of visual representations. Our approach outperforms existing change detection methods by an absolute 11.7% and 18.4% in terms of binary change IoU on DynamicEarthNet and HRSCD, respectively. Furthermore, we demonstrate the robustness of our approach to the quality of the pre-change semantic information and to the absence of pre-change imagery. The code will be made publicly available.
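
    The abstract reports its gains as binary change IoU. As a rough illustration of that metric only (not the authors' evaluation code), the sketch below computes the intersection over union of the "change" class for two binary masks. The function name `binary_change_iou`, the nested-list mask format, and the convention for two empty masks are assumptions made for this example.

```python
def binary_change_iou(pred, target):
    """IoU of the 'change' class for two same-shaped 0/1 masks (nested lists)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            # A pixel is in the intersection if both masks mark it as changed,
            # and in the union if at least one mask does.
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if union == 0 else intersection / union


# Hypothetical 2x3 masks: 1 = changed pixel, 0 = unchanged pixel.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
iou = binary_change_iou(pred, target)
print(iou)
```

    Here two pixels are marked as changed in both masks and four in at least one, so the score is 2/4. An "absolute" improvement of 11.7% in this metric means the IoU itself rises by 0.117, not by 11.7% of its previous value.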

    TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation

    LiDAR semantic segmentation plays a crucial role in enabling autonomous driving and robots to understand their surroundings accurately and robustly. There are different types of methods, such as point-based, range-image-based, polar-based, and hybrid methods. Among these, range-image-based methods are widely used due to their efficiency. However, they face a significant challenge known as the "many-to-one" problem caused by the range image's limited horizontal and vertical angular resolution. As a result, around 20% of the 3D points can be occluded. In this paper, we present TFNet, a range-image-based LiDAR semantic segmentation method that utilizes temporal information to address this issue. Specifically, we incorporate a temporal fusion layer to extract useful information from previous scans and integrate it with the current scan. We then design a max-voting-based post-processing technique to correct false predictions, particularly those caused by the "many-to-one" issue. We evaluate the approach on two benchmarks and demonstrate that the post-processing technique is generic and can be applied to various networks. We will release our code and models.
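
    The "many-to-one" problem can be made concrete with a small sketch of a spherical range-image projection. This is a generic illustration, not TFNet's implementation: the image size, the field of view, and the names `project_to_range_image` and `occluded_indices` are assumptions chosen for the example. Points that fall into the same pixel compete for it, and only the closest one stays visible.

```python
import math
from collections import defaultdict


def project_to_range_image(points, height=4, width=8,
                           fov_up_deg=15.0, fov_down_deg=-15.0):
    """Map each 3D point to a (row, column) pixel of a spherical range image."""
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    pixels = defaultdict(list)  # (row, col) -> list of (depth, point index)
    for idx, (x, y, z) in enumerate(points):
        depth = math.sqrt(x * x + y * y + z * z)
        if depth == 0.0:
            continue  # a point at the sensor origin has no direction
        yaw = math.atan2(y, x)
        pitch = math.asin(z / depth)
        # Normalise the angles to [0, 1] and scale to image coordinates.
        col = int(0.5 * (1.0 - yaw / math.pi) * width)
        row = int((1.0 - (pitch - fov_down) / fov) * height)
        # Clamp points on or beyond the image border.
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        pixels[(row, col)].append((depth, idx))
    return pixels


def occluded_indices(pixels):
    """Indices of points hidden behind a closer point in the same pixel."""
    hidden = []
    for entries in pixels.values():
        # Sort by depth; everything after the closest point is occluded.
        hidden.extend(idx for _, idx in sorted(entries)[1:])
    return sorted(hidden)


# Hypothetical scan: points 0 and 1 lie on the same ray, point 2 does not.
points = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
pixels = project_to_range_image(points)
hidden = occluded_indices(pixels)
print(hidden)
```

    In this toy scan the second point shares a pixel with the first and is farther away, so it receives no prediction of its own from the 2D network. Occlusions of this kind are what the abstract's temporal fusion and max-voting post-processing aim to correct.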

    Multi-scale Spatial-temporal Interaction Network for Video Anomaly Detection

    Video anomaly detection (VAD) is an essential yet challenging task in signal processing. Since certain anomalies cannot be detected by analyzing temporal or spatial information alone, the interaction between the two types of information is considered crucial for VAD. However, current dual-stream architectures either limit interaction between the two types of information to the bottleneck of the autoencoder or incorporate background pixels irrelevant to anomalies into the interaction. To this end, we propose a multi-scale spatial-temporal interaction network (MSTI-Net) for VAD. First, to pay particular attention to objects and reconcile the significant semantic differences between the two types of information, we propose an attention-based spatial-temporal fusion module (ASTM) as a substitute for the conventional direct fusion. Furthermore, we inject multiple ASTM-based connections between the appearance and motion pathways of a dual-stream network to facilitate spatial-temporal interaction at all possible scales. Finally, the regular information learned from multiple scales is recorded in memory to enhance the differentiation between anomalies and normal events during the testing phase. Solid experimental results on three standard datasets validate the effectiveness of our approach, which achieves AUCs of 96.8% for UCSD Ped2, 87.6% for CUHK Avenue, and 73.9% for the ShanghaiTech dataset.
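
    The results above are given as AUC, the area under the ROC curve. The sketch below shows one standard way to compute it, as the probability that a randomly chosen anomalous sample scores higher than a randomly chosen normal one. It assumes per-frame anomaly scores and binary labels, which the abstract does not spell out, and the function name `roc_auc` is illustrative.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of anomalous and normal scores."""
    anomalous = [s for s, label in zip(scores, labels) if label == 1]
    normal = [s for s, label in zip(scores, labels) if label == 0]
    if not anomalous or not normal:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for a in anomalous:
        for n in normal:
            if a > n:
                wins += 1.0   # anomalous frame ranked above the normal frame
            elif a == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(anomalous) * len(normal))


# Hypothetical anomaly scores for four frames (label 1 = anomalous).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
auc = roc_auc(scores, labels)
print(auc)
```

    Three of the four anomalous/normal pairs are ranked correctly here, giving an AUC of 0.75. The pairwise loop is quadratic in the number of frames, so a sorting-based implementation would be preferable for full datasets.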

    Mechanisms of voice processing in dementia

    Perception of nonverbal vocal information is essential in our daily lives. Patients with degenerative dementias commonly have difficulty with such aspects of vocal communication; however, voice processing has seldom been studied in these diseases. This thesis comprises a series of linked studies of voice processing in canonical dementias: Alzheimer’s disease, behavioural variant frontotemporal dementia, semantic dementia and progressive nonfluent aphasia. A series of neuropsychological tests were developed to examine perceptual and semantic stages of voice processing and to assess two aspects of accent processing: comprehension of foreign-accented speech and recognition of regional and foreign accents; patient performance was referenced to healthy control subjects. Neuroanatomical associations of voice processing performance were assessed using voxel-based morphometry. Following a symptom-led approach, a syndrome of progressive associative phonagnosia was characterised in two detailed case studies. Following a disease-led approach, this work was extended systematically to cohorts of patients representing the target diseases, assessing voice processing in relation to other aspects of person recognition (faces and names). This work provided evidence for separable profiles of voice processing impairment in different diseases: associative deficits were particularly severe in semantic dementia, whilst perceptual deficits showed relative specificity for Alzheimer’s disease. Neuroanatomical associations were identified for voice recognition in the right temporal pole and anterior fusiform gyrus, and for voice discrimination in the right inferior parietal lobe. The final phase of this work addressed the neuropsychological and neuroanatomical basis of accent processing, as an important dimension of nonverbal vocal analysis that is not dependent on voice identity. This work provides evidence for impaired processing of accents in progressive nonfluent aphasia and Alzheimer’s disease, with neuroanatomical associations in the anterior and superior temporal lobe. The thesis contributes new information about voice processing in the degenerative dementias and furthers our understanding of the mechanisms of human voice analysis.