Top-Down Selection in Convolutional Neural Networks
Feedforward information processing fills the role of hierarchical feature encoding, transformation, reduction, and abstraction in a bottom-up manner. This paradigm is sufficient for task requirements that are satisfied in the one-shot rapid traversal of sensory information through the visual hierarchy. However, some tasks demand higher-order information processing using short-term recurrence, long-range feedback, or other processes. Predictive, corrective, and modulatory information processing in top-down fashion complements the feedforward pass to fulfill many complex task requirements. Convolutional neural networks have recently been successful in addressing some aspects of feedforward processing. However, the role of top-down processing in such models is not yet fully understood. We propose a top-down selection framework for convolutional neural networks to address the selective and modulatory nature of top-down processing in vision systems. We examine various aspects of the proposed model in different experimental settings such as object localization, object segmentation, task priming, compact neural representation, and contextual interference reduction. We test the hypothesis that the proposed approach is capable of accomplishing hierarchical feature localization according to task cuing. Additionally, feature modulation using the proposed approach is tested on demanding tasks such as segmentation and iterative parameter fine-tuning. Moreover, the top-down attentional traces are harnessed to enable a more compact neural representation. The experimental results support the practical, complementary role of top-down selection mechanisms alongside the bottom-up feature encoding routines.
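As a rough illustration of the selection idea, the sketch below gates a bottom-up feature map with a hard top-down mask derived from a task cue. All names, shapes, and the top-k gating rule are hypothetical simplifications for illustration, not the paper's actual mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)

def top_down_select(feature_map, cue, k=4):
    """Keep only the k channels most aligned with a task cue.

    feature_map: (C, H, W) bottom-up activations
    cue:         (C,) task embedding (an assumed form of the cue signal)
    """
    # score each channel by cue-weighted mean activation
    scores = cue * feature_map.reshape(feature_map.shape[0], -1).mean(axis=1)
    keep = np.argsort(scores)[-k:]            # indices of the top-k channels
    gate = np.zeros(feature_map.shape[0])
    gate[keep] = 1.0                          # hard selection gate
    return feature_map * gate[:, None, None]  # zero out unselected channels

feats = rng.standard_normal((16, 8, 8))
cue = rng.standard_normal(16)
selected = top_down_select(feats, cue, k=4)
print((np.abs(selected).sum(axis=(1, 2)) > 0).sum())  # 4 active channels remain
```

A soft (graded) gate, or one propagated layer by layer down the hierarchy, would be the natural next step toward the attentional traces the abstract describes.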
Priming Neural Networks
Visual priming is known to affect the human visual system to allow detection
of scene elements, even those that may have been nearly unnoticeable before, such
as the presence of camouflaged animals. This process has been shown to be an
effect of top-down signaling in the visual system triggered by the priming cue. In
this paper, we propose a mechanism to mimic the process of priming in the
context of object detection and segmentation. We view priming as having a
modulatory, cue dependent effect on layers of features within a network. Our
results show how such a process can be complementary to, and at times more
effective than simple post-processing applied to the output of the network,
notably so in cases where the object is hard to detect such as in severe noise.
Moreover, we find the effects of priming are sometimes stronger when early
visual layers are affected. Overall, our experiments confirm that top-down
signals can go a long way in improving object detection and segmentation.
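The modulatory, cue-dependent view of priming described above can be sketched as a multiplicative gate on a feature layer. The projection `W`, the shapes, and the residual gating form are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

C = 8                                   # channels in the primed layer
D = 5                                   # cue (category) embedding size
W = rng.standard_normal((C, D)) * 0.1   # hypothetical learned cue projection

def prime(feature_map, cue_embedding):
    """Cue-dependent multiplicative modulation of one feature layer.

    A residual gate near 1 leaves the features unchanged for a zero cue,
    mirroring the view of priming as a purely modulatory effect.
    """
    gate = 1.0 + W @ cue_embedding       # per-channel modulation factors
    return feature_map * gate[:, None, None]

feats = rng.standard_normal((C, 6, 6))
no_cue = prime(feats, np.zeros(D))       # zero cue: features pass unchanged
primed = prime(feats, rng.standard_normal(D))
```

Applying such a gate at an early layer, as the abstract notes, is where the priming effect was sometimes found to be strongest.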
Novel Multistage Probabilistic Kernel Modeling in Handwriting Recognition
The design of handwriting recognition systems has been widely investigated in the pattern recognition and machine learning literature. Early work sought to enhance system performance by improving the recognition rate, a goal that has not yet been fully achieved. Despite low misclassification error rates, some test samples are still misclassified, which imposes a high cost on the whole recognition system. Reducing this cost as much as possible motivates a reject option that prevents the recognition system from classifying test samples with high prediction uncertainty.
The main contribution of this thesis is a novel multistage recognition system that produces true prediction probability outputs and rejects test samples accordingly. We argue that principally formulated probabilistic classifiers are the most reliable candidates for implementing a reject option. Implementing a reject option based on a non-probabilistic classifier's output score, or on scores converted to probability measures, is error-prone compared to using an accurate prediction probability output.
A convolutional neural network (CNN) is utilized as the automatic feature extractor, harnessing the spatial correlation of the raw input handwritten images to extract a feature vector with strong discriminative properties. A support vector machine (SVM) is used as a powerful classifier that handles large data sets accurately. We also propose extracting the most informative training samples by means of the support vector set identified by the SVM.
The Gaussian process classifier (GPC), in the Bayesian nonparametric modeling framework, is introduced as the core element of the whole recognition system, reliably providing an accurate estimate of the posterior probability of class membership. Experiments under various inference methods, likelihood functions, covariance functions, and learning approaches are conducted to find the best model configuration and parameterization. The models are evaluated on two popular handwritten numeral data sets, MNIST and CENPARMI. On MNIST, the best GPC model in this multistage framework reaches a high reliability rate at the lowest rejection rate, the best result achieved in the field.
Another inherently probabilistic classifier, the relevance vector machine (RVM), is also investigated. The RVM applies sparse Bayesian linear modeling to classification problems and produces reliable prediction probability outputs. However, comparing the GPC with the RVM, our experiments support the argument that sparsity does not improve rejection performance on these data sets.
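The reject option that runs through this abstract can be sketched in a few lines: a sample is classified only when its maximum calibrated posterior clears a threshold. The threshold value and function names here are illustrative, not the thesis's tuned settings:

```python
import numpy as np

def classify_with_reject(posterior, threshold=0.9):
    """Reject a test sample when the maximum class posterior is below threshold.

    posterior: (n_classes,) calibrated probability output, e.g. from a GPC.
    Returns the predicted class index, or None to signal rejection.
    """
    p = np.asarray(posterior)
    return int(p.argmax()) if p.max() >= threshold else None

print(classify_with_reject([0.02, 0.95, 0.03]))   # -> 1 (confident, accepted)
print(classify_with_reject([0.40, 0.35, 0.25]))   # -> None (uncertain, rejected)
```

Sweeping the threshold trades rejection rate against reliability, which is exactly the curve a well-calibrated probabilistic classifier makes trustworthy.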
DyG2Vec: Representation Learning for Dynamic Graphs with Self-Supervision
Temporal graph neural networks have shown promising results in learning
inductive representations by automatically extracting temporal patterns.
However, previous works often rely on complex memory modules or inefficient
random walk methods to construct temporal representations. In addition, the
existing dynamic graph encoders are non-trivial to adapt to self-supervised
paradigms, which prevents them from utilizing unlabeled data. To address these
limitations, we present an efficient yet effective attention-based encoder that
leverages temporal edge encodings and window-based subgraph sampling to
generate task-agnostic embeddings. Moreover, we propose a joint-embedding
architecture using non-contrastive SSL to learn rich temporal embeddings
without labels. Experimental results on 7 benchmark datasets indicate that on
average, our model outperforms SoTA baselines on the future link prediction
task by 4.23% for the transductive setting and 3.30% for the inductive setting
while requiring 5-10x less training/inference time. Additionally, we
empirically validate the significance of SSL pre-training under two probing
protocols commonly used in the language and vision modalities. Lastly, different aspects of
the proposed framework are investigated through experimental analysis and
ablation studies.
Comment: Proceedings of the 19th International Workshop on Mining and Learning with Graphs (MLG)
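The window-based subgraph sampling mentioned in the abstract might be sketched as follows; the edge-array layout and the function signature are assumptions for illustration, not DyG2Vec's actual data format:

```python
import numpy as np

def window_subgraph(edges, t_end, window):
    """Select the temporal edges inside a fixed window ending at t_end.

    edges: array of (src, dst, t) rows sorted by timestamp; the fixed window
    replaces costly random walks or memory modules when building the input
    subgraph for an attention-based encoder.
    """
    t = edges[:, 2]
    mask = (t > t_end - window) & (t <= t_end)
    return edges[mask]

edges = np.array([[0, 1, 1.0],
                  [1, 2, 2.0],
                  [2, 3, 5.0],
                  [0, 3, 6.0]])
sub = window_subgraph(edges, t_end=6.0, window=2.0)
print(sub.shape[0])   # edges with t in (4.0, 6.0] -> 2
```

The sampled subgraph would then be fed to the encoder with temporal edge encodings to produce a task-agnostic embedding.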
ROOD-MRI: Benchmarking the robustness of deep learning segmentation models to out-of-distribution and corrupted data in MRI
Deep artificial neural networks (DNNs) have moved to the forefront of medical image analysis due to their success in classification, segmentation, and detection challenges. A principal challenge in large-scale deployment of DNNs in neuroimage analysis is the potential for shifts in signal-to-noise ratio, contrast, resolution, and presence of artifacts from site to site due to variations in scanners and acquisition protocols. DNNs are famously susceptible to such distribution shifts in computer vision. Currently, there are no benchmarking platforms or frameworks to assess the robustness of new and existing models to specific distribution shifts in MRI, and accessible multi-site benchmarking datasets are still scarce or task-specific. To address these limitations, we propose ROOD-MRI: a novel platform for benchmarking the Robustness of DNNs to Out-Of-Distribution (OOD) data, corruptions, and artifacts in MRI. This flexible platform provides modules for generating benchmarking datasets using transforms that model distribution shifts in MRI, implementations of newly derived benchmarking metrics for image segmentation, and examples for using the methodology with new models and tasks. We apply our methodology to hippocampus, ventricle, and white matter hyperintensity segmentation in several large studies, providing the hippocampus dataset as a publicly available benchmark. By evaluating modern DNNs on these datasets, we demonstrate that they are highly susceptible to distribution shifts and corruptions in MRI. We show that while data augmentation strategies can substantially improve robustness to OOD data for anatomical segmentation tasks, modern DNNs using augmentation still lack robustness in more challenging lesion-based segmentation tasks. We finally benchmark U-Nets and vision transformers, finding susceptibility to particular classes of transforms across architectures.
The presented open-source platform enables generating new benchmarking datasets and comparing models to study design choices that improve robustness to OOD data and corruptions in MRI.
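A minimal sketch of the benchmarking loop such a platform automates: corrupt an input at increasing severity and track segmentation overlap against the clean prediction. The Gaussian-noise transform, the thresholding "model", and the plain Dice metric here are simplified stand-ins, not ROOD-MRI's actual modules or metrics:

```python
import numpy as np

rng = np.random.default_rng(2)

def dice(pred, target):
    """Dice overlap between two binary masks."""
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum())

def add_noise(image, severity):
    """Additive Gaussian noise, one stand-in for MRI distribution-shift transforms."""
    return image + rng.standard_normal(image.shape) * 0.1 * severity

segment = lambda img: img > 0.5   # trivial stand-in for a trained segmentation DNN

image = rng.random((32, 32))
clean_mask = segment(image)
scores = []
for severity in range(1, 4):
    # re-segment the corrupted image and measure degradation vs the clean output
    scores.append(dice(segment(add_noise(image, severity)), clean_mask))
print([round(s, 3) for s in scores])
```

Aggregating such per-severity scores across transforms and models is the kind of comparison the platform's benchmarking metrics formalize.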