49 research outputs found

Single-Channel Speech Enhancement with Deep Complex U-Networks and Probabilistic Latent Space Models

Author: Anemüller Jörn
Nustede Eike J.
Publication date: 04/09/2023

In this paper, we propose to extend the deep, complex U-Network architecture for speech enhancement by incorporating a probabilistic (i.e., variational) latent space model. The proposed model is evaluated against several ablated versions of itself in order to study the effects of the variational latent space model, complex-value processing, and self-attention. Evaluation on the MS-DNS 2020 and Voicebank+Demand datasets yields consistently high performance. For example, the proposed model achieves an SI-SDR of up to 20.2 dB, about 0.5 to 1.4 dB higher than its ablated version without probabilistic latent space, 2 to 2.4 dB higher than WaveUNet, and 6.7 dB above PHASEN. Compared to real-valued magnitude spectrogram processing with a variational U-Net, the complex U-Net achieves an improvement of up to 4.5 dB SI-SDR. Complex spectrum encoding as magnitude and phase yields the best performance in anechoic conditions, whereas real and imaginary part representation results in better generalization to (novel) reverberation conditions, possibly due to the underlying physics of sound.

arXiv.org e-Print Archive

Complex Independent Component Analysis of Frequency-Domain Electroencephalographic Data

Author: Amari
Anemüller
Arieli
Bell
Berger
Bloomfield
Courchesne
Enghoff
Freeman
Hyvärinen
Jung
Jörn Anemüller
Kuhn
Lopez da Silva
Makeig
Nikias
Scott Makeig
Stein
Terrence J. Sejnowski
Torkkola
Zibulevsky
Publication venue: Elsevier BV
Publication date: 01/01/2003

Independent component analysis (ICA) has proven useful for modeling brain and electroencephalographic (EEG) data. Here, we present a new, generalized method to better capture the dynamics of brain signals than previous ICA algorithms. We regard EEG sources as eliciting spatio-temporal activity patterns, corresponding to, e.g., trajectories of activation propagating across the cortex. This leads to a model of convolutive signal superposition, in contrast with the commonly used instantaneous mixing model. In the frequency domain, convolutive mixing is equivalent to multiplicative mixing of complex signal sources within distinct spectral bands. We decompose the recorded spectral-domain signals into independent components by a complex infomax ICA algorithm. First results from a visual attention EEG experiment exhibit (1) sources of spatio-temporal dynamics in the data, (2) links to subject behavior, (3) sources with a limited spectral extent, and (4) a higher degree of independence compared to sources derived by standard ICA.

Comment: 21 pages, 11 figures. Added final journal reference, fixed minor typo.
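The equivalence between convolutive mixing in the time domain and multiplicative mixing per frequency band can be checked numerically. The sketch below is an illustration only and is not taken from the paper: the array names, sizes, and random filters are assumptions, and it uses circular convolution with a full-length DFT so that the identity holds exactly. With short-time spectra, as used for real recordings, the relation is generally only an approximation.

```python
import numpy as np

# Hypothetical sizes chosen for illustration only.
n_sources, n_sensors, n_samples, n_taps = 2, 3, 64, 4
rng = np.random.default_rng(0)

s = rng.standard_normal((n_sources, n_samples))          # source signals
a = rng.standard_normal((n_sensors, n_sources, n_taps))  # mixing filters

# Time domain: circular convolutive mixing,
# x_i(t) = sum_j sum_tau a_ij(tau) * s_j(t - tau), indices taken modulo n_samples.
x = np.zeros((n_sensors, n_samples))
for i in range(n_sensors):
    for j in range(n_sources):
        for tau in range(n_taps):
            x[i] += a[i, j, tau] * np.roll(s[j], tau)

# Frequency domain: one complex mixing matrix A[:, :, f] per frequency bin,
# X(f) = A(f) @ S(f).
S = np.fft.fft(s, axis=-1)
A = np.fft.fft(a, n=n_samples, axis=-1)  # filters zero-padded to n_samples
X_mult = np.einsum("ijf,jf->if", A, S)

# Both routes should agree up to floating-point error.
X = np.fft.fft(x, axis=-1)
max_err = np.max(np.abs(X - X_mult))
print(f"max deviation: {max_err:.2e}")
```

Each frequency bin then carries its own instantaneous complex mixing problem, which is the setting in which a complex-valued ICA algorithm can be applied band by band.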

arXiv.org e-Print Archive

CiteSeerX

Crossref

Model structure selection in convolutive mixtures

Author: A. Hyvarinen
A.J. Bell
B.A. Pearlmutter
G. Schwarz
H. Attias
J. Anemüller
M. Dyrholm
S. Makeig
Publication date: 01/01/2006

Crossref

Online Research Database In Technology

Neurophysiologic Markers of Abnormal Brain Activity in Schizophrenia

Author: A Delorme
Anthony J. Rissling
B Turetsky
C Escera
D Braff
D Friedman
D Hermens
D Mathalon
D Umbricht
David L. Braff
DD Stettler
DE Broadbent
E Callaway
G Light
Gregory A. Light
H Tiitinen
J Anemüller
J Horváth
J Onton
J Wynn
M Cortiñas
M Green
M Kiang
M Kisley
R Cooper
R Ehrlichman
R Näätänen
S Makeig
S Siegel
Scott Makeig
TW Picton
Y Kawakubo
Publication venue: Current Science Inc.
Publication date: 01/01/2010

Cortical electrophysiologic event-related potentials are multidimensional measures of information processing that are well-suited for efficiently parsing automatic and controlled components of cognition that span the range of deficits evidenced in schizophrenia patients. These information processes are key cognitive measures that are recognized as informative and valid targets for understanding the neurobiology of schizophrenia. These measures may be used in concert with the Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) neurocognitive measures in the development of novel treatments for schizophrenia and related neuropsychiatric disorders. The employment of novel event-related potential paradigms designed to carefully characterize the early spectrum of perceptual and cognitive information processing allows investigators to identify the neurophysiologic basis of cognitive dysfunction in schizophrenia and to examine the associated clinical and functional impairments.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Spatio-temporal dynamics in fMRI recordings revealed with complex independent component analysis

Author: Amari
Anemüller
Bell
Bingham
Buchner
Calhound
Cardoso
Duann
Fiori
Jeng-Ren Duann
Jörn Anemüller
McKeown
Parra
Sawada
Scott Makeig
Stone
Terrence J. Sejnowski
Publication venue: Elsevier BV

Crossref

Classifier architectures for acoustic scenes and events : implications for DNNs, TDNNs, and perceptual features from DCASE 2016

Author: Anemüller J.
Goetze S.
Kollmeier B.
Moritz N.
Schröder J.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/06/2017

This paper evaluates neural network (NN) based systems and compares them to Gaussian mixture model (GMM) and hidden Markov model (HMM) approaches for acoustic scene classification (SC) and polyphonic acoustic event detection (AED) that are applied to data of the “Detection and Classification of Acoustic Scenes and Events 2016” (DCASE'16) challenge, task 1 and task 3, respectively. For both tasks, the use of deep neural networks (DNNs) and features based on an amplitude modulation filterbank and a Gabor filterbank (GFB) are evaluated and compared to standard approaches. For SC, a time-delay NN approach is additionally proposed that enables analysis of long contextual information similar to recurrent NNs but with training effort comparable to conventional DNNs. The SC system proposed for task 1 of the DCASE'16 challenge attains a recognition accuracy of 77.5%, which is 5.6% higher than that of the DCASE'16 baseline system. For the AED task, DNNs are adopted in tandem and hybrid approaches, i.e., as part of HMM-based systems. These systems are evaluated for the polyphonic data of task 3 from the DCASE'16 challenge. Several strategies to address the issue of polyphony are considered. It is shown that DNN-based systems perform less accurately than the traditional systems for this task. Best results are achieved using GFB features in combination with a multiclass GMM-HMM back end.

Crossref

White Rose Research Online