582 research outputs found
Backwards is the way forward: feedback in the cortical hierarchy predicts the expected future
Clark offers a powerful description of the brain as a prediction machine, which yields progress on two distinct levels. First, on an abstract conceptual level, it provides a unifying framework for perception, action, and cognition (including subdivisions such as attention, expectation, and imagination). Second, hierarchical prediction offers progress on a concrete descriptive level for testing and constraining conceptual elements and mechanisms of predictive coding models (estimation of predictions, prediction errors, and internal models).
Single Channel auditory source separation with neural network
Although distinguishing different sounds in a noisy environment is a relatively easy task for humans, source separation has long been extremely difficult in audio signal processing. The problem is challenging for three reasons: the large variety of sound types, the abundance of mixing conditions, and the unclear mechanism for distinguishing sources, especially similar sounds.
In recent years, neural network based methods have achieved impressive success on various problems, including speech enhancement, where the task is to separate clean speech from a noisy mixture. However, current deep learning based source separators do not perform well on real recorded noisy speech and, more importantly, are not applicable to more general source separation scenarios such as overlapped speech.
In this thesis, we first propose extensions to the current mask learning network for speech enhancement that fix the scale mismatch problem, which commonly occurs in real recorded audio. We solve this problem by adding two restoration layers to the existing mask learning network. We also propose a residual learning architecture for speech enhancement, further improving the network's generalization under different recording conditions. We evaluate the proposed speech enhancement models on CHiME 3 data. Without retraining the acoustic model, the best bidirectional LSTM with residual connections yields a 25.13% relative WER reduction on real data and 34.03% WER on simulated data.
We then propose a novel neural network based model called “deep clustering” for more general source separation tasks. We train a deep network to assign contrastive embedding vectors to each time-frequency region of the spectrogram in order to implicitly predict the segmentation labels of the target spectrogram from the input mixtures. This yields a deep network-based analogue to spectral clustering, in that the embeddings form a low-rank pairwise affinity matrix that approximates the ideal affinity matrix, while enabling much faster performance. At test time, the clustering step “decodes” the segmentation implicit in the embeddings by optimizing K-means with respect to the unknown assignments. Experiments on single-channel mixtures from multiple speakers show that a speaker-independent model trained on two-speaker and three-speaker mixtures can improve signal quality for mixtures of held-out speakers by more than 10 dB on average.
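The affinity-matching idea in the abstract above can be sketched in a few lines of NumPy. This is a minimal illustration, not the thesis implementation: it assumes the objective is the squared Frobenius distance between the embedding affinity matrix V Vᵀ and the ideal affinity matrix Y Yᵀ, and it decodes with a plain K-means that uses a farthest-point initialization chosen only to keep the example deterministic. The function names `deep_clustering_loss` and `kmeans_assign` are hypothetical.

```python
import numpy as np

def deep_clustering_loss(V, Y):
    """Assumed objective: ||V V^T - Y Y^T||_F^2, expanded so that no
    N x N affinity matrix is ever formed (this is the low-rank shortcut).
    V: (N, D) embeddings, one row per time-frequency bin.
    Y: (N, C) one-hot source membership per bin."""
    VtV = V.T @ V          # (D, D)
    VtY = V.T @ Y          # (D, C)
    YtY = Y.T @ Y          # (C, C)
    return np.sum(VtV ** 2) - 2.0 * np.sum(VtY ** 2) + np.sum(YtY ** 2)

def kmeans_assign(V, n_sources, n_iter=20):
    """Test-time decoding: cluster the embeddings with K-means.
    Farthest-point initialization is an assumption made for this sketch."""
    centers = [V[0]]
    for _ in range(1, n_sources):
        # distance of every point to its nearest already-chosen center
        d = np.min([((V - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(V[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((V[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for k in range(n_sources):
            if np.any(labels == k):
                centers[k] = V[labels == k].mean(axis=0)
    return labels
```

In use, `labels == k` gives a binary mask over time-frequency bins for source `k`, which would then be applied to the mixture spectrogram.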
We then propose an extension of deep clustering, named the “deep attractor” network, that allows the system to perform efficient end-to-end training. In the proposed model, attractor points for each source are first created in the embedding space of the acoustic signals; these pull together the time-frequency bins corresponding to each source. The attractors are formed by finding the centroids of the sources in the embedding space and are subsequently used to determine the similarity of each bin in the mixture to each source. The network is then trained to minimize the reconstruction error of each source by optimizing the embeddings. We show that this framework can achieve even better results.
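The attractor step described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: attractors are taken as the mean embedding of the bins that belong to each source (known at training time), similarity is the inner product between a bin's embedding and each attractor, and masks are obtained with a softmax over sources. The softmax choice and the names `attractor_masks` and `reconstruction_error` are assumptions of this sketch, not details given in the abstract.

```python
import numpy as np

def attractor_masks(V, Y):
    """V: (N, D) embeddings, one row per time-frequency bin.
    Y: (N, C) one-hot source membership, available during training.
    Returns attractors (C, D) and soft masks (N, C)."""
    counts = np.maximum(Y.sum(axis=0), 1.0)      # guard against empty sources
    attractors = (Y.T @ V) / counts[:, None]     # centroid of each source
    logits = V @ attractors.T                    # bin-to-attractor similarity
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    masks = np.exp(logits)
    masks /= masks.sum(axis=1, keepdims=True)    # softmax over sources
    return attractors, masks

def reconstruction_error(mix_mag, masks, source_mags):
    """Mean squared error between masked mixture and the clean sources.
    mix_mag: (N,) mixture magnitudes; source_mags: (N, C) clean magnitudes."""
    estimates = masks * mix_mag[:, None]
    return np.mean((estimates - source_mags) ** 2)
```

In an end-to-end setup the error returned by `reconstruction_error` would be backpropagated through the masks and attractors into the embedding network, which is what removes the separate clustering objective.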
Lastly, we introduce two applications of the proposed models: singing voice separation and a smart hearing aid device. For the former, we propose a multi-task architecture that combines deep clustering with a classification based network. It achieves a new state-of-the-art separation result, improving the signal-to-noise ratio by 11.1 dB on music and 7.9 dB on singing voice. For the smart hearing aid device, we combine neural decoding with the separation network. The system first decodes the user’s attention, which is then used to guide the separator toward the target source. Both objective and subjective studies show that the proposed system can accurately decode attention and significantly improve the user experience.
A survey on multi-output regression
In recent years, a plethora of approaches have been proposed to deal
with the increasingly challenging task of multi-output regression. This paper
provides a survey on state-of-the-art multi-output regression methods,
that are categorized as problem transformation and algorithm adaptation
methods. In addition, we present the most commonly used performance evaluation
measures, publicly available data sets for multi-output regression
real-world problems, as well as open-source software frameworks.
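As a rough illustration of the problem transformation category, the sketch below fits one independent regressor per output and scores the result with an average relative root mean squared error (aRRMSE), one measure commonly used for multi-output regression. The choice of ridge regression as the base learner, the fact that the intercept is regularized along with the other weights, and the function names are all assumptions made for this example rather than details taken from the survey.

```python
import numpy as np

def fit_single_target(X, Y, ridge=1e-3):
    """Problem transformation: treat each output column as its own
    single-output ridge regression (intercept included as a bias column)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    G = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return [np.linalg.solve(G, Xb.T @ Y[:, j]) for j in range(Y.shape[1])]

def predict(weights, X):
    """Stack the per-output predictions back into an (N, n_outputs) array."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.column_stack([Xb @ w for w in weights])

def arrmse(Y_true, Y_pred, Y_train_mean):
    """Average over outputs of RMSE relative to predicting the training mean."""
    num = ((Y_true - Y_pred) ** 2).sum(axis=0)
    den = ((Y_true - Y_train_mean) ** 2).sum(axis=0)
    return float(np.mean(np.sqrt(num / den)))
```

An algorithm adaptation method would instead fit all outputs jointly in one model, so that dependencies between outputs can be exploited. A score of 1 corresponds to always predicting the training mean, and lower is better.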
Attention is more than prediction precision [Commentary on target article]
A cornerstone of the target article is that, in a predictive coding framework, attention can be modelled by weighting prediction error with a measure of precision. We argue that this is not a complete explanation, especially in the light of event-related potential (ERP) data showing large evoked responses for frequently presented target stimuli, which are therefore predicted.
Scalable Machine Learning Methods for Massive Biomedical Data Analysis.
Modern data acquisition techniques have enabled biomedical researchers to collect and analyze datasets of substantial size and complexity. The massive size of these datasets allows us to comprehensively study the biological system of interest at an unprecedented level of detail, which may lead to the discovery of clinically relevant biomarkers. Nonetheless, the dimensionality of these datasets presents critical computational and statistical challenges, as traditional statistical methods break down when the number of predictors dominates the number of observations, a setting frequently encountered in biomedical data analysis. This difficulty is compounded by the fact that biological data tend to be noisy and often possess complex correlation patterns among the predictors. The central goal of this dissertation is to develop a computationally tractable machine learning framework that allows us to extract scientifically meaningful information from these massive and highly complex biomedical datasets. We motivate the scope of our study by considering two important problems with clinical relevance: (1) uncertainty analysis for biomedical image registration, and (2) psychiatric disease prediction based on functional connectomes, which are high dimensional correlation maps generated from resting state functional MRI.
PhD. Electrical Engineering: Systems. University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/111354/1/takanori_1.pd
Digital Oculomotor Biomarkers in Dementia
Dementia is an umbrella term that covers a number of neurodegenerative syndromes featuring gradual disturbance of various cognitive functions that is severe enough to interfere with tasks of daily life. Dementia is frequently diagnosed only when pathological changes have been developing for years, symptoms of cognitive impairment are evident, and the patient's quality of life has already deteriorated significantly. Although brain imaging and fluid biomarkers allow the monitoring of disease progression in vivo, they are expensive, invasive and not necessarily diagnostic in isolation. Recent studies suggest that eye-tracking technology is an innovative tool that holds promise for accelerating early detection of the disease, as well as supporting the development of strategies that minimise impairment during everyday activities. However, the optimal methods for quantitative evaluation of oculomotor behaviour during complex and naturalistic tasks in dementia have yet to be determined. This thesis investigates the development of computational tools and techniques to analyse eye movements of dementia patients and healthy controls under naturalistic and less constrained scenarios to identify novel digital oculomotor biomarkers. Three key contributions are made. First, the evaluation of the role of environment during navigation in patients with typical Alzheimer's disease and Posterior Cortical Atrophy compared to a control group, using a combination of eye movement and egocentric video analysis. Second, the development of a novel method of extracting salient features directly from the raw eye-tracking data of a mixed sample of dementia patients during a novel instruction-less cognitive test to detect oculomotor biomarkers of dementia-related cognitive dysfunction. Third, the application of unsupervised anomaly detection techniques for visualisation of oculomotor anomalies during various cognitive tasks.
The work presented in this thesis furthers our understanding of dementia-related oculomotor dysfunction and gives future research direction for the development of computerised cognitive tests and ecological interventions.
A Survey on Deep Learning in Medical Image Registration: New Technologies, Uncertainty, Evaluation Metrics, and Beyond
Over the past decade, deep learning technologies have greatly advanced the
field of medical image registration. The initial developments, such as
ResNet-based and U-Net-based networks, laid the groundwork for deep
learning-driven image registration. Subsequent progress has been made in
various aspects of deep learning-based registration, including similarity
measures, deformation regularizations, and uncertainty estimation. These
advancements have not only enriched the field of deformable image registration
but have also facilitated its application in a wide range of tasks, including
atlas construction, multi-atlas segmentation, motion estimation, and 2D-3D
registration. In this paper, we present a comprehensive overview of the most
recent advancements in deep learning-based image registration. We begin with a
concise introduction to the core concepts of deep learning-based image
registration. Then, we delve into innovative network architectures, loss
functions specific to registration, and methods for estimating registration
uncertainty. Additionally, this paper explores appropriate evaluation metrics
for assessing the performance of deep learning models in registration tasks.
Finally, we highlight the practical applications of these novel techniques in
medical imaging and discuss the future prospects of deep learning-based image
registration.