MapFormer: Boosting Change Detection by Using Pre-change Information

Authors: Bernhard Maximilian; Schubert Matthias; Strauß Niklas
Publication date: 31/03/2023

Change detection in remote sensing imagery is essential for a variety of applications such as urban planning, disaster management, and climate research. However, existing methods for identifying semantically changed areas overlook the availability of semantic information in the form of existing maps describing features of the earth's surface. In this paper, we leverage this information for change detection in bi-temporal images. We show that the simple integration of the additional information via concatenation of latent representations suffices to significantly outperform state-of-the-art change detection methods. Motivated by this observation, we propose the new task of Conditional Change Detection, where pre-change semantic information is used as input alongside the bi-temporal images. To fully exploit the extra information, we propose MapFormer, a novel architecture based on a multi-modal feature fusion module that allows for feature processing conditioned on the available semantic information. We further employ a supervised, cross-modal contrastive loss to guide the learning of visual representations. Our approach outperforms existing change detection methods by an absolute 11.7% and 18.4% in terms of binary change IoU on DynamicEarthNet and HRSCD, respectively. Furthermore, we demonstrate the robustness of our approach to the quality of the pre-change semantic information and to the absence of pre-change imagery. The code will be made publicly available.

arXiv.org e-Print Archive
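
The MapFormer abstract above reports its gains in binary change IoU. As a minimal sketch of that metric (not the authors' evaluation code), the following assumes the predicted and reference change masks are equally sized nested lists of 0/1 values, and it assumes an empty union scores 1.0. The function name `binary_change_iou` and the toy masks are hypothetical.

```python
def binary_change_iou(pred, target):
    """Intersection over union of the 'change' class for two binary masks."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # both masks mark this pixel as changed
            if p or t:
                union += 1  # at least one mask marks this pixel as changed
    # Assumed convention: if neither mask contains change, treat the match as perfect.
    return intersection / union if union else 1.0


# Toy masks (hypothetical data): 2 pixels agree on change, 4 are marked by either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou = binary_change_iou(pred, target)
print(iou)  # 0.5
```

An "absolute" improvement, as quoted in the abstract, is a difference between two such IoU values expressed in percentage points.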

TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation

Authors: Chen Xieyuanli; Gall Juergen; Hao Wang; Li Rong; Li ShiJie; Liang Junwei; Ma Teli
Publication date: 16/09/2023

LiDAR semantic segmentation plays a crucial role in enabling autonomous driving and robots to understand their surroundings accurately and robustly. There are different types of methods, such as point-based, range-image-based, polar-based, and hybrid methods. Among these, range-image-based methods are widely used due to their efficiency. However, they face a significant challenge known as the "many-to-one" problem, caused by the range image's limited horizontal and vertical angular resolution. As a result, around 20% of the 3D points can be occluded. In this paper, we present TFNet, a range-image-based LiDAR semantic segmentation method that utilizes temporal information to address this issue. Specifically, we incorporate a temporal fusion layer to extract useful information from previous scans and integrate it with the current scan. We then design a max-voting-based post-processing technique to correct false predictions, particularly those caused by the "many-to-one" issue. We evaluate the approach on two benchmarks and demonstrate that the post-processing technique is generic and can be applied to various networks. We will release our code and models.

arXiv.org e-Print Archive
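
The "many-to-one" problem named in the TFNet abstract above can be illustrated with a small sketch. It assumes the commonly used spherical projection of LiDAR points onto a range image, in which only the closest point per pixel is kept. This is not TFNet's code; the function name, image size, and field-of-view values are hypothetical choices for illustration.

```python
import math


def project_to_range_image(points, width, height, fov_up_deg, fov_down_deg):
    """Project 3D points onto a range image, keeping the closest point per pixel.

    Returns (visible, occluded): `visible` maps (row, col) -> point index,
    `occluded` lists indices of points hidden behind a closer point.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    closest = {}  # (row, col) -> (range, point index)
    occluded = []
    for index, (x, y, z) in enumerate(points):
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue  # skip degenerate points at the sensor origin
        yaw = math.atan2(y, x)
        pitch = math.asin(z / rng)
        # Map the angles to pixel coordinates and clamp to the image bounds.
        col = int(0.5 * (1.0 - yaw / math.pi) * width)
        row = int((1.0 - (pitch - fov_down) / fov) * height)
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        pixel = (row, col)
        if pixel not in closest:
            closest[pixel] = (rng, index)
        elif rng < closest[pixel][0]:
            occluded.append(closest[pixel][1])  # the previous point is now hidden
            closest[pixel] = (rng, index)
        else:
            occluded.append(index)  # a closer point already owns this pixel
    visible = {pixel: index for pixel, (_, index) in closest.items()}
    return visible, occluded


# Hypothetical scan: the first two points lie on the same ray, so they share a pixel.
points = [(10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
visible, occluded = project_to_range_image(
    points, width=8, height=4, fov_up_deg=3.0, fov_down_deg=-25.0
)
print(occluded)  # [1]
```

Points in `occluded` receive no prediction of their own from the range image, which is the kind of gap the abstract says the temporal fusion and post-processing steps are meant to address.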

Dynamic information processing states revealed through neurocognitive models of object semantics.

Author: Clarke Alex
Publication venue: Lang Cogn Neurosci
Publication date: 21/04/2015

Recognising objects relies on highly dynamic, interactive brain networks to process multiple aspects of object information. Fully understanding how different forms of information about objects are represented and processed in the brain requires a neurocognitive account of visual object recognition that combines a detailed cognitive model of semantic knowledge with a neurobiological model of visual object processing. Here we ask how specific cognitive factors are instantiated in our mental processes and how they dynamically evolve over time. We suggest that coarse semantic information, based on generic shared semantic knowledge, is rapidly extracted from visual inputs and is sufficient to drive rapid category decisions. Subsequent recurrent neural activity between the anterior temporal lobe and posterior fusiform supports the formation of object-specific semantic representations - a conjunctive process primarily driven by the perirhinal cortex. These object-specific representations require the integration of shared and distinguishing object properties and support the unique recognition of objects. We conclude that a valuable way of understanding the cognitive activity of the brain is through testing the relationship between specific cognitive measures and dynamic neural activity. This kind of approach allows us to move towards uncovering the information processing states of the brain and how they evolve over time. This is the final version. It was first published by Taylor and Francis at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4337742/

Apollo (Cambridge)

Multi-scale Spatial-temporal Interaction Network for Video Anomaly Detection

Authors: Li Zhangxun; Ning Zhiyuan; Song Liang
Publication date: 16/06/2023

Video anomaly detection (VAD) is an essential yet challenging task in signal processing. Since certain anomalies cannot be detected by analyzing temporal or spatial information alone, the interaction between the two types of information is considered crucial for VAD. However, current dual-stream architectures either limit interaction between the two types of information to the bottleneck of the autoencoder or incorporate background pixels irrelevant to anomalies into the interaction. To this end, we propose a multi-scale spatial-temporal interaction network (MSTI-Net) for VAD. First, to pay particular attention to objects and reconcile the significant semantic differences between the two types of information, we propose an attention-based spatial-temporal fusion module (ASTM) as a substitute for the conventional direct fusion. Furthermore, we inject multiple ASTM-based connections between the appearance and motion pathways of a dual-stream network to facilitate spatial-temporal interaction at all possible scales. Finally, the regular information learned from multiple scales is recorded in memory to enhance the differentiation between anomalies and normal events during the testing phase. Solid experimental results on three standard datasets validate the effectiveness of our approach, which achieves AUCs of 96.8% for UCSD Ped2, 87.6% for CUHK Avenue, and 73.9% for the ShanghaiTech dataset.

arXiv.org e-Print Archive
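
The MSTI-Net abstract above reports results as AUC, the area under the ROC curve. As a minimal sketch of how that score can be computed from anomaly scores (not the authors' evaluation code), the following uses the pairwise-ranking definition of AUC. It assumes label 1 marks an anomalous sample and label 0 a normal one; the function name `roc_auc` and the toy values are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that an anomalous sample outscores a normal one."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one anomalous and one normal sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # anomalous sample ranked above the normal one
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical scores): 3 of the 4 anomalous/normal pairs are ranked correctly.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75
```

The pairwise loop is quadratic in the number of samples, so for long videos a sort-based implementation would be the more practical choice.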

Mechanisms of voice processing in dementia

Author: Hailstone JC
Publication venue: UCL (University College London)
Publication date: 28/10/2012

Perception of nonverbal vocal information is essential in our daily lives. Patients with degenerative dementias commonly have difficulty with such aspects of vocal communication; however, voice processing has seldom been studied in these diseases. This thesis comprises a series of linked studies of voice processing in canonical dementias: Alzheimer’s disease, behavioural variant frontotemporal dementia, semantic dementia and progressive nonfluent aphasia. A series of neuropsychological tests were developed to examine perceptual and semantic stages of voice processing and to assess two aspects of accent processing: comprehension of foreign accented speech and recognition of regional and foreign accents; patient performance was referenced to healthy control subjects. Neuroanatomical associations of voice processing performance were assessed using voxel based morphometry. Following a symptom-led approach, a syndrome of progressive associative phonagnosia was characterised in two detailed case studies. Following a disease-led approach, this work was extended systematically to cohorts of patients representing the target diseases and assessing voice processing in relation to other aspects of person recognition (faces and names). This work provided evidence for separable profiles of voice processing impairment in different diseases: associative deficits were particularly severe in semantic dementia, whilst perceptual deficits showed relative specificity for Alzheimer’s disease. Neuroanatomical associations were identified for voice recognition in the right temporal pole and anterior fusiform gyrus, and for voice discrimination in the right inferior parietal lobe. The final phase of this work addressed the neuropsychological and neuroanatomical basis of accent processing, as an important dimension of nonverbal vocal analysis that is not dependent on voice identity. This work provides evidence for impaired processing of accents in progressive nonfluent aphasia and Alzheimer’s disease, with neuroanatomical associations in the anterior and superior temporal lobe. The thesis contributes new information about voice processing in the degenerative dementias and furthers our understanding of the mechanisms of human voice analysis.

UCL Discovery