1,203 research outputs found
Decoding Perception of Speech from Behavioral Responses using Spatio-Temporal CNNs
Categorical perception (CP) of speech is a complex process reflecting individuals’ ability to perceive sound and is measured using response time (RT). The cognitive processes involved in mapping neural activities to behavioral responses are stochastic and further compounded by individuality and variation. This thesis presents a data-driven approach and develops parameter-optimized models to understand the relationship between cognitive events and behavioral responses (e.g., RT). We introduce convolutional neural networks (CNNs) to learn representations from EEG recordings. In addition, we develop parameter-optimized and interpretable models for decoding CP using two representations: 1) spatial-spectral topomaps and 2) event-related potentials (ERPs). We adopt state-of-the-art class-discriminative visualization tools (Grad-CAM) to gain insight (as opposed to ’black box’ models) and to build interpretable models. We also develop a diverse set of models to account for stochasticity and individual variation, and adopt weighted saliency scores across all models to quantify the learned representations’ effectiveness and utility in decoding CP as manifested through behavioral response. Empirical analysis reveals that γ-band activity and early (∼ 0 - 200 ms) and late (∼ 300 - 500 ms) right-hemisphere IFG engagement are critical in determining individuals’ RT. Our observations are consistent with prior findings, further validating the efficacy of our data-driven approach and optimized interpretable models.
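The “weighted saliency scores of all models” idea can be sketched roughly as follows. This is a minimal illustration, not the thesis’s actual scheme: the choice of accuracy-based weights and the min-max rescaling are assumptions made here for clarity.

```python
import numpy as np

def ensemble_saliency(saliency_maps, accuracies):
    """Combine per-model saliency maps into one map, weighting each
    model's map by its accuracy (an assumed weighting scheme)."""
    w = np.asarray(accuracies, dtype=float)
    w = w / w.sum()                                # weights sum to 1
    maps = np.asarray(saliency_maps, dtype=float)  # shape (n_models, H, W)
    combined = np.tensordot(w, maps, axes=1)       # weighted sum over models
    # rescale to [0, 1] for visualization
    return (combined - combined.min()) / (np.ptp(combined) + 1e-12)
```

A model ensemble whose members agree on a region will keep that region bright after weighting, while idiosyncratic activations of low-accuracy models are suppressed.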
Color in scientific visualization: Perception and image-based data display
Visualization is the transformation of information into a visual display that enhances users’ understanding and interpretation of the data. This thesis project investigated the use of color and human-vision modeling for the visualization of image-based scientific data. Two preliminary psychophysical experiments were first conducted on uniform color patches to analyze the perception and understanding of different color attributes, providing psychophysical evidence and guidance for the choice of color space/attributes for color encoding. Perceptual color scales were then designed for univariate and bivariate image data display, and their effectiveness was evaluated through three psychophysical experiments. Some general guidelines were derived for effective color scale design. Extending to high-dimensional data, two visualization techniques were developed for hyperspectral imagery. The first approach takes advantage of the underlying relationships between PCA/ICA of hyperspectral images and the human opponent color model, and maps the first three PCs or ICs to several opponent color spaces, including CIELAB, HSV, YCbCr, and YUV. The gray-world assumption was adopted to automatically set the mapping origins. The rendered images are well color balanced and can offer a first-look capability or initial classification for a wide variety of spectral scenes. The second approach combines a true-color image and a PCA image based on a biologically inspired visual-attention model that simulates the center-surround structure of visual receptive fields as the difference between fine and coarse scales. The model was extended to take into account human contrast sensitivity and to include high-level information, such as second-order statistical structure in the form of a local variance map, in addition to low-level features such as color, luminance, and orientation.
It generates a topographic saliency map for both the true-color image and the PCA image; a difference map is then derived and used as a mask to select interesting locations where the PCA image has more salient features than are available in the visible bands. The resulting representations preserve the consistent natural appearance of the scene, while the selected attentional locations may be analyzed by more advanced algorithms.
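The center-surround stage and the difference-map mask can be sketched as below. This is a simplified illustration only: a box blur stands in for the Gaussian pyramid levels the model would actually use, and the kernel sizes are arbitrary assumptions.

```python
import numpy as np

def box_blur(img, k):
    """Box blur of odd width k (a crude stand-in for a Gaussian
    pyramid level at one scale)."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + h, dx:dx + w]
    return out / (k * k)

def center_surround(img, fine=3, coarse=9):
    """Center-surround response: fine scale minus coarse scale, rectified."""
    return np.maximum(box_blur(img, fine) - box_blur(img, coarse), 0.0)

def attention_mask(true_color_sal, pca_sal):
    """Difference-map mask: locations where the PCA image is more salient
    than the true-color image."""
    return pca_sal > true_color_sal
```

An isolated bright pixel produces a positive center-surround response at its location and zero far away, which is the behavior the difference-of-scales model relies on.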
Plant disease identification using explainable 3D deep learning on hyperspectral images
Background
Hyperspectral imaging is emerging as a promising approach for plant disease identification. The large and possibly redundant information contained in hyperspectral data cubes makes deep learning based identification of plant diseases a natural fit. Here, we deploy a novel 3D deep convolutional neural network (DCNN) that directly assimilates the hyperspectral data. Furthermore, we interrogate the learnt model to produce physiologically meaningful explanations. We focus on an economically important disease, charcoal rot, which is a soil-borne fungal disease that affects the yield of soybean crops worldwide.
Results
Based on hyperspectral imaging of inoculated and mock-inoculated stem images, our 3D DCNN has a classification accuracy of 95.73% and an infected class F1 score of 0.87. Using the concept of a saliency map, we visualize the most sensitive pixel locations, and show that the spatial regions with visible disease symptoms are overwhelmingly chosen by the model for classification. We also find that the most sensitive wavelengths used by the model for classification are in the near infrared region (NIR), which is also the commonly used spectral range for determining the vegetative health of a plant.
Conclusion
The use of an explainable deep learning model not only provides high accuracy, but also provides physiological insight into model predictions, thus generating confidence in them. These explained predictions lend themselves to eventual use in precision agriculture and research applications using automated phenotyping platforms.
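A saliency map of this kind measures how sensitive the class score is to each input element. The paper computes this with gradients backpropagated through the 3D DCNN; the model-agnostic sketch below approximates the same quantity with central finite differences, so it works for any scoring function but is far slower than backprop.

```python
import numpy as np

def finite_diff_saliency(score_fn, x, eps=1e-4):
    """Saliency as |d score / d x_i| for every input element, estimated
    by central finite differences (a stand-in for backprop gradients)."""
    x = np.asarray(x, dtype=float)
    sal = np.zeros_like(x)
    it = np.nditer(x, flags=["multi_index"])
    for _ in it:
        i = it.multi_index
        xp, xm = x.copy(), x.copy()
        xp[i] += eps          # perturb one element up ...
        xm[i] -= eps          # ... and down
        sal[i] = abs(score_fn(xp) - score_fn(xm)) / (2 * eps)
    return sal
```

For a linear score the estimate recovers the coefficient magnitudes exactly, which makes the behavior easy to verify.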
Graph learning and its applications: a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science, Massey University, Albany, Auckland, New Zealand
Since graph features consider the correlations between pairs of data points, they provide high-order information, i.e., more complex correlations than the low-order information obtained from individual data points, and they have therefore attracted much attention in real applications. The key to graph feature extraction is graph construction. Previous studies have demonstrated that the quality of the graph usually determines the effectiveness of the graph feature. However, the graph is usually constructed from the original data, which often contain noise and redundancy. To address this issue, graph learning is designed to iteratively adjust the graph and the model parameters, thereby improving the quality of the graph and outputting optimal model parameters.
As a result, graph learning has become a very popular research topic in traditional machine learning and deep learning. Although previous graph learning methods have been applied in many fields by adding a graph regularization term to the objective function, they still have some issues to be addressed.
This thesis focuses on graph learning, aiming to overcome the drawbacks of previous methods for different applications. The proposed methods are as follows.
• We propose a traditional graph learning method under supervised learning to consider the robustness and the interpretability of graph learning. Specifically, we propose utilizing self-paced learning to assign important samples with large weights, conducting feature selection to remove redundant features, and learning a graph matrix from the low dimensional data of the original data to preserve the local structure of the data. As a consequence, both important samples and useful features are used to select support vectors in the SVM framework.
• We propose a traditional graph learning method under semi-supervised learning to explore parameter-free fusion of graph learning. Specifically, we first employ the discrete wavelet transform and Pearson correlation coefficient to obtain multiple fully connected Functional Connectivity brain Networks (FCNs) for every subject, and then learn a sparsely connected FCN for every subject. Finally, the ℓ1-SVM is employed to learn the important features and conduct disease diagnosis.
• We propose a deep graph learning method to consider graph fusion in graph learning. Specifically, we first employ the Simple Linear Iterative Clustering (SLIC) method to obtain multi-scale features for every image, and then design a new graph fusion method to fine-tune the features at every scale. As a result, multi-scale feature fine-tuning, graph learning, and feature learning are embedded into a unified framework.
All proposed methods are evaluated on real-world data sets by comparing against state-of-the-art methods. Experimental results demonstrate that our methods outperform all comparison methods.
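The common starting point of the methods above, a graph built from raw data plus a Laplacian regularization term, can be sketched as follows. This is a generic illustration of graph construction and the standard smoothness regularizer, not any one of the thesis’s learned-graph formulations.

```python
import numpy as np

def knn_graph(X, k=2):
    """Symmetric k-nearest-neighbour adjacency built from raw samples
    (rows of X); the usual fixed graph that graph learning then refines."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # no self-edges
    W = np.zeros_like(d)
    nn = np.argsort(d, axis=1)[:, :k]      # k closest neighbours per row
    rows = np.repeat(np.arange(len(X)), k)
    W[rows, nn.ravel()] = 1.0
    return np.maximum(W, W.T)              # symmetrize

def smoothness(W, f):
    """Graph-regularization term f^T L f = 1/2 * sum_ij W_ij (f_i - f_j)^2,
    the quantity typically added to the objective function."""
    L = np.diag(W.sum(axis=1)) - W         # combinatorial graph Laplacian
    return float(f @ L @ f)
```

A labeling that is constant over connected samples incurs zero penalty, while a labeling that flips between neighbours is penalized, which is exactly why this term encourages predictions to respect the graph structure.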
3D time series analysis of cell shape using Laplacian approaches
Background:
Fundamental cellular processes such as cell movement, division or food uptake critically depend on cells being able to change shape. Fast acquisition of three-dimensional image time series has now become possible, but we lack efficient tools for analysing shape deformations in order to understand the real three-dimensional nature of shape changes.
Results:
We present a framework for 3D+time cell shape analysis. The main contribution is three-fold: First, we develop a fast, automatic random walker method for cell segmentation. Second, a novel topology fixing method is proposed to fix segmented binary volumes without spherical topology. Third, we show that algorithms used for each individual step of the analysis pipeline (cell segmentation, topology fixing, spherical parameterization, and shape representation) are closely related to the Laplacian operator. The framework is applied to the shape analysis of neutrophil cells.
Conclusions:
The method we propose for cell segmentation is faster than the traditional random walker method or the level-set method, and performs better on 3D time series of neutrophil cells, which are comparatively noisy because stacks must be acquired fast enough to account for cell motion. Our method for topology fixing outperforms the tools provided by SPHARM-MAT and SPHARM-PDM in terms of successful fixing rates. The different tasks in the presented pipeline for 3D+time shape analysis of cells can all be solved using Laplacian approaches, opening the possibility of eventually combining individual steps in order to speed up computations.
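The Laplacian connection in the random walker is concrete: seed probabilities for unseeded pixels solve a combinatorial Dirichlet problem, a linear system in the graph Laplacian. The sketch below shows this on a 1-D chain of pixels for clarity; real segmentations use the same linear system on the full image grid graph.

```python
import numpy as np

def random_walker_1d(weights, seeds):
    """Random-walker label probabilities on a 1-D chain graph.
    weights[i] is the edge weight between pixels i and i+1; seeds maps
    pixel index -> probability of the foreground label (0 or 1).
    Unseeded probabilities solve L_U x_U = -B x_S, the combinatorial
    Dirichlet problem at the core of the random walker."""
    n = len(weights) + 1
    L = np.zeros((n, n))                    # combinatorial graph Laplacian
    for i, w in enumerate(weights):
        L[i, i] += w; L[i + 1, i + 1] += w
        L[i, i + 1] -= w; L[i + 1, i] -= w
    seeded = sorted(seeds)
    free = [i for i in range(n) if i not in seeds]
    xs = np.array([seeds[i] for i in seeded], dtype=float)
    A = L[np.ix_(free, free)]               # Laplacian block on free nodes
    B = L[np.ix_(free, seeded)]             # coupling to seeded nodes
    probs = np.zeros(n)
    probs[seeded] = xs
    probs[free] = np.linalg.solve(A, -B @ xs)
    return probs
```

On a uniform chain the solution is the harmonic (linear) interpolation between the seeds, which is a handy sanity check.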
Eight-Channel Multispectral Image Database for Saliency Prediction
Saliency prediction is a very important and challenging task within the computer vision community. Many models exist that try to predict the salient regions of a scene from its RGB image values. New models continue to be developed, and spectral imaging techniques may potentially overcome the limitations found when using RGB images. However, the experimental study of such models based on spectral images is difficult because of the lack of available data to work with. This article presents the first eight-channel multispectral image database of outdoor urban scenes, together with gaze data recorded using an eye tracker over several observers performing different visualization tasks. In addition, the information from this database is used to study whether the complexity of the images has an impact on the saliency maps retrieved from the observers. Results show that more complex images do not correlate with higher differences in the saliency maps obtained.
Funding: Spanish Ministry of Science, Innovation, and Universities (MICINN), RTI2018-094738-B-I00; European Commission.
A Psychophysical Oriented Saliency Map Prediction Model
Visual attention is one of the most significant mechanisms for selecting
and understanding the redundant outside world. The human visual system cannot
process all information simultaneously, due to the visual information
bottleneck. In order to reduce the redundant input of visual information, the
human visual system mainly focuses on the dominant parts of scenes. This is
commonly known as visual saliency map prediction. This paper proposes a new
psychophysical saliency prediction architecture, WECSF, inspired by a
multi-channel model of visual cortex functioning in humans. The model consists
of opponent color channels, wavelet transform, wavelet energy map, and contrast
sensitivity function for extracting low-level image features and providing
maximum approximation to the human visual system. The proposed model is
evaluated using several datasets, including the MIT1003, MIT300, TORONTO,
SID4VAM, and UCF Sports datasets. We also quantitatively and qualitatively
compare the saliency prediction performance with that of other state-of-the-art
models. Our model achieves stable and superior performance under
different metrics on natural images, psychophysical synthetic images, and dynamic
videos. Additionally, we found that Fourier and spectral-inspired saliency
prediction models outperformed other state-of-the-art non-neural network and
even deep neural network models on psychophysical synthetic images, which can
be explained by, and lends support to, the Fourier Vision Hypothesis. Finally, the
proposed model could be used as a computational model of the primate visual system
and help us understand its underlying mechanisms.
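The wavelet-energy stage of an architecture like WECSF can be sketched as below. This covers only one component: the actual model also includes opponent color channels and a contrast sensitivity function, and its wavelet choice is not stated here, so a one-level Haar transform is used purely for illustration.

```python
import numpy as np

def haar_level(img):
    """One level of the 2-D Haar transform: approximation (ll) plus
    horizontal (lh), vertical (hl), and diagonal (hh) detail bands."""
    a = (img[0::2, :] + img[1::2, :]) / 2   # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2   # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2
    lh = (a[:, 0::2] - a[:, 1::2]) / 2
    hl = (d[:, 0::2] + d[:, 1::2]) / 2
    hh = (d[:, 0::2] - d[:, 1::2]) / 2
    return ll, lh, hl, hh

def wavelet_energy_map(img):
    """Sum of squared detail coefficients, upsampled back to image size:
    a crude wavelet energy map in the spirit of WECSF's feature stage."""
    _, lh, hl, hh = haar_level(np.asarray(img, dtype=float))
    e = lh**2 + hl**2 + hh**2
    return np.kron(e, np.ones((2, 2)))      # nearest-neighbour upsample
```

Flat regions contribute no energy while edges and texture do, which is why band energy serves as a low-level saliency feature.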
Explaining hyperspectral imaging based plant disease identification: 3D CNN and saliency maps
Our overarching goal is to develop an accurate and explainable model for
plant disease identification using hyperspectral data. Charcoal rot is a
soil-borne fungal disease that affects the yield of soybean crops worldwide.
Hyperspectral images were captured at 240 different wavelengths in the range of
383 - 1032 nm. We developed a 3D Convolutional Neural Network model for soybean
charcoal rot disease identification. Our model has classification accuracy of
95.73\% and an infected class F1 score of 0.87. We interrogate the trained model
using saliency maps and visualize the most sensitive pixel locations that enable
classification. The sensitivity of individual wavelengths for classification
was also determined using the saliency map visualization, identifying the most
sensitive wavelength as 733 nm. Since the most sensitive wavelength lies in the
near-infrared region (700 - 1000 nm) of the electromagnetic spectrum, which is
also the spectral region commonly used for determining plant vegetative health,
we gain additional confidence in our model's predictions.
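Identifying a most-sensitive wavelength from a saliency volume amounts to aggregating saliency over the spatial dimensions and ranking the spectral bands. The sketch below uses mean absolute saliency as the aggregation; the paper's exact aggregation rule is not stated here, so this is an assumed, illustrative choice.

```python
import numpy as np

def rank_wavelengths(saliency_cube, wavelengths):
    """Rank spectral bands of an (H, W, bands) saliency cube by mean
    absolute spatial sensitivity; returns (wavelength, score) pairs,
    most sensitive first."""
    scores = np.abs(saliency_cube).mean(axis=(0, 1))  # collapse H and W
    order = np.argsort(scores)[::-1]                  # descending
    return [(wavelengths[i], float(scores[i])) for i in order]
```

The top-ranked entry plays the role of the "most sensitive wavelength" reported in the abstract.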