    Top-Down Selection in Convolutional Neural Networks

    Feedforward information processing fills the role of hierarchical feature encoding, transformation, reduction, and abstraction in a bottom-up manner. This paradigm is sufficient for tasks whose requirements are satisfied in a single rapid traversal of sensory information through the visual hierarchy. Some tasks, however, demand higher-order processing using short-term recurrence, long-range feedback, or other mechanisms. Predictive, corrective, and modulatory top-down processing complements the feedforward pass to fulfill many complex task requirements. Convolutional neural networks have recently been successful in addressing some aspects of feedforward processing, but the role of top-down processing in such models is not yet fully understood. We propose a top-down selection framework for convolutional neural networks to address the selective and modulatory nature of top-down processing in vision systems. We examine the proposed model in experimental settings such as object localization, object segmentation, task priming, compact neural representation, and contextual interference reduction. We test the hypothesis that the proposed approach can accomplish hierarchical feature localization according to task cuing. Feature modulation using the proposed approach is also tested on demanding tasks such as segmentation and iterative parameter fine-tuning, and the top-down attentional traces are harnessed to enable a more compact neural representation. The experimental results support the practical, complementary role of top-down selection mechanisms to bottom-up feature encoding routines.
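    As a concrete illustration of the selection idea, the sketch below gates a small CNN's bottom-up activations with a top-down relevance signal derived from a cued class. The two-layer network, the gradient-times-activation relevance score, and the top-k cutoff are assumptions of this sketch, not the paper's exact selection rule.

```python
# A minimal sketch of top-down selection over a small CNN, assuming a
# gradient-based relevance signal and a top-k gating rule.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(32, n_classes)

    def forward(self, x):
        a1 = torch.relu(self.conv1(x))   # bottom-up pass, layer 1
        a2 = torch.relu(self.conv2(a1))  # bottom-up pass, layer 2
        logits = self.fc(self.pool(a2).flatten(1))
        return logits, [a1, a2]

def top_down_select(model, x, target_class, keep=0.1):
    """Trace a task cue (target_class) back through the hierarchy and
    keep only the activations most relevant to it."""
    logits, acts = model(x)
    # Gradient of the cued class score w.r.t. each layer's activations
    # serves as a simple relevance signal (an assumption of this sketch).
    score = logits[:, target_class].sum()
    grads = torch.autograd.grad(score, acts)
    masks = []
    for a, g in zip(acts, grads):
        rel = (a * g).flatten(1)                    # relevance per unit
        k = max(1, int(keep * rel.shape[1]))
        thresh = rel.topk(k, dim=1).values[:, -1:]  # per-sample cutoff
        masks.append((rel >= thresh).float().view_as(a))
    return masks  # binary selection maps, one per layer
```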

    Priming Neural Networks

    Visual priming is known to affect the human visual system, allowing detection of scene elements that may have been nearly unnoticeable before, such as the presence of camouflaged animals. This process has been shown to be an effect of top-down signaling in the visual system triggered by a cue. In this paper, we propose a mechanism to mimic the process of priming in the context of object detection and segmentation. We view priming as having a modulatory, cue-dependent effect on layers of features within a network. Our results show how such a process can be complementary to, and at times more effective than, simple post-processing applied to the output of the network, notably in cases where the object is hard to detect, such as under severe noise. Moreover, we find the effects of priming are sometimes stronger when early visual layers are affected. Overall, our experiments confirm that top-down signals can go a long way in improving object detection and segmentation.
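    The sketch below shows one plausible form of such cue-dependent modulation: a cue embedding rescales the output channels of a convolutional layer. The wrapper class, the sigmoid gating, and the one-hot cue are assumptions of this sketch; the paper's parameterization may differ.

```python
# A minimal sketch of cue-driven priming, assuming the cue
# multiplicatively modulates per-channel feature responses.
import torch
import torch.nn as nn

class PrimedBlock(nn.Module):
    """Wraps a conv layer so a cue embedding rescales its output channels."""
    def __init__(self, in_ch, out_ch, cue_dim):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.gate = nn.Linear(cue_dim, out_ch)  # cue -> per-channel gains

    def forward(self, x, cue):
        feat = torch.relu(self.conv(x))
        # Sigmoid keeps the gains in (0, 2) after scaling: mild modulation
        # rather than hard gating.
        gains = 2.0 * torch.sigmoid(self.gate(cue))
        return feat * gains.unsqueeze(-1).unsqueeze(-1)

# Usage: prime an early layer with a one-hot cue for the target category.
block = PrimedBlock(in_ch=3, out_ch=16, cue_dim=20)
x = torch.randn(2, 3, 64, 64)            # noisy input images
cue = torch.zeros(2, 20); cue[:, 7] = 1  # "look for class 7"
primed_features = block(x, cue)
```

    Applying the gate to an early block mirrors the paper's observation that priming effects are sometimes stronger when early visual layers are modulated.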

    Novel Multistage Probabilistic Kernel Modeling in Handwriting Recognition

    The design of handwriting recognition systems has been widely investigated in the pattern recognition and machine learning literature. Early efforts focused on improving the recognition rate toward 100%, a goal that has not yet been achieved. Despite low misclassification error rates, some test samples are still misclassified, and each such error imposes a high cost on the whole recognition system. To reduce this cost, a reject option is considered that prevents the recognition system from classifying test samples with high prediction uncertainty. The main contribution of this thesis is a novel multistage recognition system capable of producing true prediction probability outputs and rejecting test samples accordingly. We argue that principally formulated probabilistic classifiers are the most reliable candidates for implementing a reject option: rejection based on a non-probabilistic classifier's output score, or on a conversion to probability measures, is error-prone compared to an accurate prediction probability output. A Convolutional Neural Network (CNN) is used as the automatic feature extractor, harnessing the spatial correlation of the raw handwritten input images to extract feature vectors with strong discriminative properties. An SVM is used as a powerful classifier that scales to large data sets, and we propose extracting the most informative training samples via the SVM's support vector set. The Gaussian process classifier (GPC), in the Bayesian nonparametric modeling framework, is introduced as the core element of the whole recognition system, reliably providing an accurate estimate of the posterior probability of class membership. Experiments under various inference methods, likelihood functions, covariance functions, and learning approaches are conducted to find the best model configuration and parameterization. The models are evaluated on two popular handwritten numeral data sets, MNIST and CENPARMI. The best GPC model in this multistage framework reaches a 100% reliability rate on MNIST with the lowest rejection rate of 1.48%, the best result achieved in the field. Another inherently probabilistic classifier, the relevance vector machine (RVM), is also investigated. The RVM is formulated through sparse Bayesian linear modeling applied to classification problems and produces reliable prediction probability outputs. However, a comparison of the GPC with the RVM experimentally supports the argument that sparsity does not improve rejection performance on these data sets.
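    The pipeline below is a minimal sketch of the multistage idea using scikit-learn stand-ins: an SVM selects informative samples via its support vectors, a GPC supplies posterior probabilities, and a threshold implements the reject option. The CNN feature-extraction stage is assumed to have run upstream, and the kernel choices and threshold are placeholders, not the thesis's tuned configuration.

```python
# A minimal sketch of the multistage recognition pipeline with a reject
# option; kernels, threshold, and stage wiring are assumptions.
import numpy as np
from sklearn.svm import SVC
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

def fit_multistage(features, labels):
    # Stage 1: the SVM's support vector set marks the most informative
    # training samples.
    svm = SVC(kernel="rbf").fit(features, labels)
    sv_idx = svm.support_
    # Stage 2: a GPC trained on that subset yields calibrated posterior
    # class probabilities.
    gpc = GaussianProcessClassifier(kernel=RBF())
    gpc.fit(features[sv_idx], labels[sv_idx])
    return gpc

def predict_with_reject(gpc, features, threshold=0.9):
    """Classify only when the posterior is confident; otherwise reject."""
    probs = gpc.predict_proba(features)
    preds = probs.argmax(axis=1)
    confident = probs.max(axis=1) >= threshold
    return np.where(confident, preds, -1)  # -1 marks rejected samples
```

    Raising the threshold trades rejection rate for reliability, which is the trade-off the reported 100% reliability at a 1.48% rejection rate quantifies.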

    DyG2Vec: Representation Learning for Dynamic Graphs with Self-Supervision

    Temporal graph neural networks have shown promising results in learning inductive representations by automatically extracting temporal patterns. However, previous works often rely on complex memory modules or inefficient random walk methods to construct temporal representations. In addition, existing dynamic graph encoders are non-trivial to adapt to self-supervised paradigms, which prevents them from utilizing unlabeled data. To address these limitations, we present an efficient yet effective attention-based encoder that leverages temporal edge encodings and window-based subgraph sampling to generate task-agnostic embeddings. Moreover, we propose a joint-embedding architecture using non-contrastive SSL to learn rich temporal embeddings without labels. Experimental results on 7 benchmark datasets indicate that, on average, our model outperforms SoTA baselines on the future link prediction task by 4.23% in the transductive setting and 3.30% in the inductive setting, while requiring 5-10x less training/inference time. Additionally, we empirically validate the significance of SSL pre-training under two probing tasks commonly used in the language and vision modalities. Lastly, different aspects of the proposed framework are investigated through experimental analysis and ablation studies. (Proceedings of the 19th International Workshop on Mining and Learning with Graphs, MLG.)
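    Two of the named ingredients admit short sketches: window-based subgraph sampling over timestamped edges, and a non-contrastive joint-embedding loss that needs no negative samples. Both functions, and the BYOL-style stop-gradient variant of the loss, are assumptions of this sketch rather than DyG2Vec's exact formulation.

```python
# Minimal sketches of window-based sampling and a non-contrastive
# joint-embedding loss; details are assumed, not taken from the paper.
import torch
import torch.nn.functional as F

def sample_window(edges, t_end, window):
    """Keep only edges whose timestamps fall in (t_end - window, t_end].
    `edges` is a float tensor of shape (E, 3) holding (src, dst, t) rows."""
    t = edges[:, 2]
    mask = (t > t_end - window) & (t <= t_end)
    return edges[mask]

def non_contrastive_loss(z_online, z_target):
    """Pull two embeddings of the same temporal subgraph together without
    negative samples, using a stop-gradient on one branch."""
    z1 = F.normalize(z_online, dim=-1)
    z2 = F.normalize(z_target.detach(), dim=-1)  # stop-gradient branch
    return 2.0 - 2.0 * (z1 * z2).sum(dim=-1).mean()
```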

    ROOD-MRI: Benchmarking the robustness of deep learning segmentation models to out-of-distribution and corrupted data in MRI

    Deep artificial neural networks (DNNs) have moved to the forefront of medical image analysis due to their success in classification, segmentation, and detection challenges. A principal challenge in large-scale deployment of DNNs in neuroimage analysis is the potential for shifts in signal-to-noise ratio, contrast, resolution, and presence of artifacts from site to site due to variations in scanners and acquisition protocols. DNNs are famously susceptible to such distribution shifts in computer vision. Currently, there are no benchmarking platforms or frameworks to assess the robustness of new and existing models to specific distribution shifts in MRI, and accessible multi-site benchmarking datasets are still scarce or task-specific. To address these limitations, we propose ROOD-MRI: a novel platform for benchmarking the Robustness of DNNs to Out-Of-Distribution (OOD) data, corruptions, and artifacts in MRI. This flexible platform provides modules for generating benchmarking datasets using transforms that model distribution shifts in MRI, implementations of newly derived benchmarking metrics for image segmentation, and examples for applying the methodology to new models and tasks. We apply our methodology to hippocampus, ventricle, and white matter hyperintensity segmentation in several large studies, providing the hippocampus dataset as a publicly available benchmark. By evaluating modern DNNs on these datasets, we demonstrate that they are highly susceptible to distribution shifts and corruptions in MRI. We show that while data augmentation strategies can substantially improve robustness to OOD data for anatomical segmentation tasks, modern DNNs using augmentation still lack robustness on more challenging lesion-based segmentation tasks. We finally benchmark U-Nets and vision transformers, finding susceptibility to particular classes of transforms across both architectures. The presented open-source platform enables generating new benchmarking datasets and comparing models to study which design choices improve robustness to OOD data and corruptions in MRI.
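    The benchmarking loop itself is simple to picture: apply each corruption transform at increasing severity and record how the segmentation metric degrades. The toy noise and contrast transforms, the severity scaling, and the plain Dice score below are assumptions of this sketch; ROOD-MRI's actual transforms and derived metrics are richer.

```python
# A minimal sketch of a robustness benchmarking loop for segmentation,
# assuming the model maps a volume to a binary mask.
import numpy as np

def gaussian_noise(vol, severity):
    return vol + np.random.normal(0, 0.02 * severity, vol.shape)

def reduce_contrast(vol, severity):
    return vol.mean() + (vol - vol.mean()) / (1 + 0.5 * severity)

def dice(pred, truth):
    inter = np.logical_and(pred, truth).sum()
    return 2 * inter / (pred.sum() + truth.sum() + 1e-8)

def benchmark(model, volumes, masks, transforms, severities=range(1, 6)):
    """Report mean Dice per (transform, severity) to expose robustness gaps."""
    results = {}
    for name, tfm in transforms.items():
        for s in severities:
            scores = [dice(model(tfm(v, s)), m)
                      for v, m in zip(volumes, masks)]
            results[(name, s)] = float(np.mean(scores))
    return results
```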