71 research outputs found

    Deep Adaptive Feature Embedding with Local Sample Distributions for Person Re-identification

    Person re-identification (re-id) aims to match pedestrians observed by disjoint camera views. It attracts increasing attention in computer vision due to its importance to surveillance systems. To combat the major challenge of cross-view visual variations, deep embedding approaches learn a compact feature space from images such that Euclidean distances correspond to a cross-view similarity metric. However, a global Euclidean distance cannot faithfully characterize the ideal similarity in a complex visual feature space, because features of pedestrian images exhibit unknown distributions due to large variations in pose, illumination and occlusion. Moreover, intra-personal training samples within a local range are robust guides for deep embedding against uncontrolled variations, yet they cannot be captured by a global Euclidean distance. In this paper, we study the problem of person re-id by proposing a novel sampling strategy that mines suitable positives (i.e. intra-class samples) within a local range to improve the deep embedding in the context of large intra-class variations. Our method is capable of learning a deep similarity metric adaptive to local sample structure by minimizing each sample's local distances while propagating through the relationships between samples to attain minimization over the whole class. To this end, a novel objective function is proposed to jointly optimize similarity metric learning, local positive mining and robust deep embedding. This yields local discrimination by selecting local-ranged positive samples, and the learned features are robust to dramatic intra-class variations. Experiments on benchmarks show state-of-the-art results achieved by our method. Comment: Published in Pattern Recognition
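
    The idea of mining positives within a local range can be illustrated with a minimal sketch. This is not the paper's objective function or sampling procedure; it only assumes that "local" means the k nearest same-identity samples under Euclidean distance in the current embedding. The function name `mine_local_positives` and the parameter `k` are hypothetical.

```python
import numpy as np

def mine_local_positives(emb, labels, k):
    """For each anchor, return indices of its k nearest same-label samples.

    Assumption: 'local range' is modelled as the k nearest intra-class
    neighbours in Euclidean distance. This is an illustrative choice only.
    """
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all embeddings.
    dists = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    result = []
    for i in range(len(labels)):
        same = np.flatnonzero(labels == labels[i])
        same = same[same != i]  # exclude the anchor itself
        nearest = same[np.argsort(dists[i, same])]
        result.append(nearest[:k].tolist())
    return result

# Toy embeddings: identity 0 has three samples, identity 1 has two.
emb = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [5.0, 5.0], [5.0, 5.2]])
labels = [0, 0, 0, 1, 1]
positives = mine_local_positives(emb, labels, k=1)
print(positives)
```

    In this toy example, sample 2 is paired with sample 1 and sample 1 with sample 0, so pulling only local pairs together still links the whole class through a chain of neighbours.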

    What-and-Where to Match: Deep Spatially Multiplicative Integration Networks for Person Re-identification

    Matching pedestrians across disjoint camera views, known as person re-identification (re-id), is a challenging problem that is of importance to visual recognition and surveillance. Most existing methods exploit local regions through spatial manipulation to perform matching in local correspondence. However, they essentially extract fixed representations from pre-divided regions for each image and then perform matching based on the extracted representations. Models in this pipeline cannot capture the finer local patterns that are crucial to distinguish positive pairs from negative ones, which makes them underperform. In this paper, we propose a novel deep multiplicative integration gating function, which answers the question of what-and-where to match for effective person re-id. To address what to match, our deep network emphasizes common local patterns by learning joint representations in a multiplicative way. The network comprises two Convolutional Neural Networks (CNNs) to extract convolutional activations, and generates relevant descriptors for pedestrian matching. This leads to flexible representations for pair-wise images. To address where to match, we combat spatial misalignment by performing spatially recurrent pooling via a four-directional recurrent neural network to impose spatial dependency over all positions with respect to the entire image. The proposed network is designed to be end-to-end trainable to characterize local pairwise feature interactions in a spatially aligned manner. To demonstrate the superiority of our method, extensive experiments are conducted over three benchmark data sets: VIPeR, CUHK03 and Market-1501. Comment: Published at Pattern Recognition, Elsevier

    From Continuous Dynamics to Graph Neural Networks: Neural Diffusion and Beyond

    Graph neural networks (GNNs) have demonstrated significant promise in modelling relational data and have been widely applied in various fields of interest. The key mechanism behind GNNs is the so-called message passing, where information is iteratively aggregated to central nodes from their neighbourhood. Such a scheme has been found to be intrinsically linked to a physical process known as heat diffusion, where the propagation of GNNs naturally corresponds to the evolution of heat density. Drawing an analogy between message passing and heat dynamics allows one to fundamentally understand the power and pitfalls of GNNs and consequently informs better model design. Recently, a plethora of works has emerged that propose GNNs inspired by the continuous dynamics formulation, in an attempt to mitigate the known limitations of GNNs, such as oversmoothing and oversquashing. In this survey, we provide the first systematic and comprehensive review of studies that leverage the continuous perspective of GNNs. To this end, we introduce foundational ingredients for adapting continuous dynamics to GNNs, along with a general framework for the design of graph neural dynamics. We then review and categorize existing works based on their driving mechanisms and underlying dynamics. We also summarize how the limitations of classic GNNs can be addressed under the continuous framework. We conclude by identifying multiple open research directions.
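
    The link between message passing and heat diffusion can be made concrete with a small numerical sketch. It assumes the simplest setting, the graph heat equation dx/dt = -Lx with the unnormalized Laplacian L = D - A, discretized by explicit Euler steps; the function names, step size and toy graph are illustrative choices and are not taken from the survey.

```python
import numpy as np

def graph_laplacian(adj):
    """Unnormalized graph Laplacian L = D - A."""
    return np.diag(adj.sum(axis=1)) - adj

def diffuse(x, adj, step=0.1, n_steps=50):
    """Explicit Euler integration of dx/dt = -L x.

    Each step mixes a node's feature with those of its neighbours,
    which is the same kind of local aggregation as one
    message-passing layer without learned weights.
    """
    lap = graph_laplacian(adj)
    for _ in range(n_steps):
        x = x - step * (lap @ x)
    return x

# Path graph on 4 nodes: 0 - 1 - 2 - 3.
adj = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
x0 = np.array([[1.0], [0.0], [0.0], [0.0]])  # all "heat" on node 0
x_final = diffuse(x0, adj, step=0.1, n_steps=500)
print(x_final.ravel())
```

    After many steps every node on this connected graph holds nearly the same value while the total is conserved, which is a toy picture of the oversmoothing behaviour mentioned in the abstract.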

    Bregman Graph Neural Network

    Much recent research on graph neural networks (GNNs) has focused on formulating GNN architectures as an optimization problem with the smoothness assumption. However, in node classification tasks, the smoothing effect induced by GNNs tends to assimilate representations and over-homogenize labels of connected nodes, leading to adverse effects such as over-smoothing and misclassification. In this paper, we propose a novel bilevel optimization framework for GNNs inspired by the notion of Bregman distance. We demonstrate that the resulting GNN layer can effectively mitigate the over-smoothing issue by introducing a mechanism reminiscent of the "skip connection". We validate our theoretical results through comprehensive empirical studies in which Bregman-enhanced GNNs outperform their original counterparts on both homophilic and heterophilic graphs. Furthermore, our experiments also show that Bregman GNNs can produce more robust learning accuracy even when the number of layers is high, suggesting the effectiveness of the proposed method in alleviating the over-smoothing issue.

    FedGST: Federated Graph Spatio-Temporal Framework for Brain Functional Disease Prediction

    Currently, most medical institutions face the challenge of training a unified model using fragmented and isolated data to address disease prediction problems. Although federated learning has become the recognized paradigm for privacy-preserving model training, how to integrate federated learning with fMRI temporal characteristics to enhance predictive performance remains an open question for functional disease prediction. To address this challenging task, we propose a novel Federated Graph Spatio-Temporal (FedGST) framework for brain functional disease prediction. Specifically, anchor sampling is used to process variable-length time series data on local clients. Then dynamic functional connectivity graphs are generated via sliding windows and Pearson correlation coefficients. Next, we propose an InceptionTime model to extract temporal information from the dynamic functional connectivity graphs on the local clients. Finally, the hidden activation variables are sent to a global server. We propose a UniteGCN model on the global server to receive and process the hidden activation variables from clients. Then, the global server returns gradient information to clients for backpropagation and model parameter updating. Client models aggregate model parameters on the local server and distribute them to clients for the next round of training. We demonstrate that FedGST outperforms other federated learning methods and baselines on ABIDE-1 and ADHD200 datasets.
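
    The step that builds dynamic functional connectivity graphs from sliding windows and Pearson correlation coefficients can be sketched as follows. This is a generic illustration and not the FedGST implementation: the function name `dynamic_connectivity`, the window and stride values, and the random stand-in for fMRI region signals are all assumptions.

```python
import numpy as np

def dynamic_connectivity(ts, window, stride):
    """Sliding-window Pearson correlation matrices.

    ts: array of shape (n_timepoints, n_regions).
    Returns an array of shape (n_windows, n_regions, n_regions),
    one correlation matrix (dense weighted graph) per window.
    """
    n_t = ts.shape[0]
    if window > n_t:
        raise ValueError("window is longer than the time series")
    mats = []
    for start in range(0, n_t - window + 1, stride):
        segment = ts[start:start + window]
        # rowvar=False: columns (regions) are the variables.
        mats.append(np.corrcoef(segment, rowvar=False))
    return np.stack(mats)

# Random stand-in for one subject: 100 time points, 5 brain regions.
rng = np.random.default_rng(0)
ts = rng.standard_normal((100, 5))
graphs = dynamic_connectivity(ts, window=30, stride=10)
print(graphs.shape)
```

    Each slice `graphs[i]` is a symmetric matrix with ones on the diagonal; in practice such matrices are often thresholded to obtain sparse graphs, a choice the abstract does not specify.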

    Exposition on over-squashing problem on GNNs: Current Methods, Benchmarks and Challenges

    Graph-based message-passing neural networks (MPNNs) have achieved remarkable success in both node and graph-level learning tasks. However, several identified problems, including over-smoothing (OSM), limited expressive power, and over-squashing (OSQ), still limit the performance of MPNNs. In particular, OSQ is the most recently identified problem, where MPNNs gradually lose their learning accuracy when long-range dependencies between graph nodes are required. In this work, we provide an exposition on the OSQ problem by summarizing different formulations of OSQ from current literature, as well as the three different categories of approaches for addressing the OSQ problem. In addition, we also discuss the alignment between OSQ and expressive power and the trade-off between OSQ and OSM. Furthermore, we summarize the empirical methods leveraged from existing works to verify the efficiency of OSQ mitigation approaches, with illustrations of their computational complexities. Lastly, we list some open questions that are of interest for further exploration of the OSQ problem, along with potential directions to the best of our knowledge.