57 research outputs found
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
Diffusion models have shown a great ability to bridge the performance gap
between predictive and generative approaches for speech enhancement. We have
shown that they may even outperform their predictive counterparts for
non-additive corruption types or when they are evaluated on mismatched
conditions. However, diffusion models suffer from a high computational burden,
mainly because they require running a neural network for each reverse diffusion step,
whereas predictive approaches require only one pass. Being generative
approaches, diffusion models may also produce vocalizing and breathing artifacts
in adverse conditions. In comparison, in such difficult scenarios, predictive
models typically do not produce such artifacts but tend to distort the target
speech instead, thereby degrading the speech quality. In this work, we present
a stochastic regeneration approach where an estimate given by a predictive
model is provided as a guide for further diffusion. We show that the proposed
approach uses the predictive model to remove the vocalizing and breathing
artifacts while producing very high quality samples thanks to the diffusion
model, even in adverse conditions. We further show that this approach enables
the use of lighter sampling schemes with fewer diffusion steps without sacrificing
quality, thus lifting the computational burden by an order of magnitude. Source
code and audio examples are available online (https://uhh.de/inf-sp-storm).
Comment: Published in IEEE/ACM Transactions on Audio, Speech and Language Processing, 202
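The regeneration scheme described above can be sketched in a few lines. This is a toy illustration: the predictor, the score model, the intermediate time `tau` and the Euler sampler are all hypothetical stand-ins for the trained components, not the paper's implementation.

```python
import random

def predictive_model(y):
    # Hypothetical one-pass predictive stage: a crude attenuation standing in
    # for a trained denoising network.
    return [0.5 * v for v in y]

def score_model(x, y, t):
    # Hypothetical score estimator (y and t are unused in this toy stand-in):
    # pulls the state toward zero in place of a trained score network.
    return [-xi for xi in x]

def stochastic_regeneration(y, n_steps=5, tau=0.4, seed=0):
    rng = random.Random(seed)
    # Stage 1: cheap one-pass predictive estimate removes most of the corruption.
    x = predictive_model(y)
    # Stage 2: restart the reverse diffusion from that estimate plus a small
    # amount of noise at intermediate time tau (instead of from pure noise),
    # so only a few reverse steps are needed.
    x = [xi + tau * rng.gauss(0.0, 1.0) for xi in x]
    dt = tau / n_steps
    for k in range(n_steps):
        t = tau - k * dt
        s = score_model(x, y, t)
        x = [xi + dt * si for xi, si in zip(x, s)]  # simple Euler reverse step
    return x
```

Because the generative stage starts near the predictive estimate rather than from pure noise, `tau` and `n_steps` directly trade off compute against how much the diffusion model refines the estimate.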
Customizable End-to-end Optimization of Online Neural Network-supported Dereverberation for Hearing Devices
This work focuses on online dereverberation for hearing devices using the
weighted prediction error (WPE) algorithm. WPE filtering requires an estimate
of the target speech power spectral density (PSD). Recently, deep neural
networks (DNNs) have been used for this task. However, these approaches
optimize the PSD estimate, which only indirectly affects the WPE output, thus
potentially resulting in limited dereverberation. In this paper, we propose an
end-to-end approach specialized for online processing that directly optimizes
the dereverberated output signal. In addition, we propose to adapt it to the
needs of different types of hearing-device users by modifying the optimization
target as well as the WPE algorithm characteristics used in training. We show
that the proposed end-to-end approach outperforms both the traditional WPE and the conventional DNN-supported WPE on a noise-free version of the WHAMR! dataset.
Comment: \copyright 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
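To make the role of the PSD estimate concrete, here is a deliberately simplified one-tap, real-valued WPE filter for a single frequency bin. The actual algorithm uses multi-tap complex-valued filters over STFT frames; the `psd` input below stands in for the DNN-provided estimate.

```python
def wpe_one_tap(x, psd, delay=2):
    """Toy real-valued, single-channel, one-tap weighted prediction error filter.

    x:   one frequency bin of the observed (reverberant) signal, one value per frame
    psd: estimate of the target-speech PSD per frame (in DNN-supported WPE this
         would come from the neural network)
    """
    num = den = 0.0
    for n in range(delay, len(x)):
        w = 1.0 / max(psd[n], 1e-8)      # frames with little target energy dominate
        num += w * x[n] * x[n - delay]
        den += w * x[n - delay] ** 2
    g = num / den if den else 0.0        # linear-prediction coefficient
    # Subtract the predicted late reverberation from each frame.
    return [x[n] - g * x[n - delay] if n >= delay else x[n] for n in range(len(x))]
```

The weighting `1/psd[n]` is why an inaccurate PSD estimate only indirectly degrades the output: it skews the weights of the normal equations rather than the output itself, which motivates optimizing the dereverberated signal end-to-end.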
Analysing Diffusion-based Generative Approaches versus Discriminative Approaches for Speech Restoration
Diffusion-based generative models have had a high impact on the computer
vision and speech processing communities in recent years. Besides data
generation tasks, they have also been employed for data restoration tasks like
speech enhancement and dereverberation. While discriminative models have
traditionally been argued to be more powerful, e.g., for speech enhancement,
generative diffusion approaches have recently been shown to narrow this
performance gap considerably. In this paper, we systematically compare the
performance of generative diffusion models and discriminative approaches on
different speech restoration tasks. For this, we extend our prior contributions
on diffusion-based speech enhancement in the complex time-frequency domain to
the task of bandwidth extension. We then compare it to a discriminatively
trained neural network with the same network architecture on three restoration
tasks, namely speech denoising, dereverberation and bandwidth extension. We
observe that the generative approach performs globally better than its
discriminative counterpart on all tasks, with the strongest benefit for
non-additive distortion models, like in dereverberation and bandwidth
extension. Code and audio examples can be found online at
https://uhh.de/inf-sp-sgmsemultitask
Comment: Submitted to ICASSP 202
Single and Few-step Diffusion for Generative Speech Enhancement
Diffusion models have shown promising results in speech enhancement, using a
task-adapted diffusion process for the conditional generation of clean speech
given a noisy mixture. However, at test time, the neural network used for score
estimation is called multiple times to solve the iterative reverse process.
This results in a slow inference process and causes discretization errors that
accumulate over the sampling trajectory. In this paper, we address these
limitations through a two-stage training approach. In the first stage, we train
the diffusion model in the usual way, using the generative denoising score matching
loss. In the second stage, we compute the enhanced signal by solving the
reverse process and compare the resulting estimate to the clean speech target
using a predictive loss. We show that this second training stage enables the
model to achieve the same performance as the baseline with only 5 function
evaluations instead of 60. While the performance of usual
generative diffusion algorithms drops dramatically when lowering the number of
function evaluations (NFEs) to obtain single-step diffusion, we show that our
proposed method maintains steady performance and therefore largely outperforms
the diffusion baseline in this setting and also generalizes better than its
predictive counterpart.
Comment: \copyright 2023 IEEE.
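The two training stages can be sketched as follows. The scalar-parameter score network, the solver and the loss below are hypothetical toy stand-ins, meant only to show how a predictive loss is placed after a few-step reverse solve (in practice the loss would be backpropagated through the solver to update the network).

```python
import random

def score_net(x, y, t, theta):
    # Hypothetical score network with a single scalar parameter theta,
    # standing in for the model trained in the first (score-matching) stage.
    return [theta * (yi - xi) for xi, yi in zip(x, y)]

def solve_reverse(y, theta, n_steps=5, seed=0):
    # Few-step Euler solver for the reverse process, starting from the mixture.
    rng = random.Random(seed)
    x = [yi + 0.1 * rng.gauss(0.0, 1.0) for yi in y]
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = 1.0 - k * dt
        x = [xi + dt * si for xi, si in zip(x, score_net(x, y, t, theta))]
    return x

def second_stage_loss(theta, y, clean):
    # Stage 2: run the full few-step sampler and apply a predictive (MSE) loss
    # between its output and the clean target.
    est = solve_reverse(y, theta)
    return sum((e - c) ** 2 for e, c in zip(est, clean)) / len(clean)
```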
Wind Noise Reduction with a Diffusion-based Stochastic Regeneration Model
In this paper we present a method for single-channel wind noise reduction
using our previously proposed diffusion-based stochastic regeneration model
combining predictive and generative modelling. We introduce a non-additive
speech-in-noise model to account for the non-linear deformation of the membrane
caused by the wind flow and possible clipping. We show that our stochastic
regeneration model outperforms other neural-network-based wind noise reduction
methods as well as purely predictive and generative models, on a dataset using
simulated and real-recorded wind noise. We further show that the proposed
method generalizes well by testing on an unseen dataset with real-recorded wind
noise. Audio samples, data generation scripts and code for the proposed methods
can be found online (https://uhh.de/inf-sp-storm-wind).
Comment: Submitted to VDE 15th ITG conference on Speech Communication
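An assumed toy form of such a non-additive corruption model (not the paper's exact formulation) combines mixing with hard clipping to emulate membrane saturation under strong wind flow:

```python
def wind_corruption(speech, wind, limit=0.9):
    # Illustrative non-additive corruption: the wind pressure is added to the
    # speech signal and large excursions are hard-clipped, emulating the
    # saturation of the microphone membrane. The clipping threshold `limit`
    # is an assumed parameter.
    mixture = [s + w for s, w in zip(speech, wind)]
    return [max(-limit, min(limit, v)) for v in mixture]
```

The clipping makes the corruption non-invertible and signal-dependent, which is precisely the regime where purely additive noise models break down.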
BUDDy: Single-Channel Blind Unsupervised Dereverberation with Diffusion Models
In this paper, we present an unsupervised single-channel method for joint
blind dereverberation and room impulse response estimation, based on posterior
sampling with diffusion models. We parameterize the reverberation operator
using a filter with exponential decay for each frequency subband, and
iteratively estimate the corresponding parameters as the speech utterance gets
refined along the reverse diffusion trajectory. A measurement consistency
criterion enforces the fidelity of the generated speech with the reverberant
measurement, while an unconditional diffusion model implements a strong prior
for clean speech generation. Without any knowledge of the room impulse response
or any coupled reverberant-anechoic data, we can successfully perform
dereverberation in various acoustic scenarios. Our method significantly
outperforms previous blind unsupervised baselines, and we demonstrate its
increased robustness to unseen acoustic conditions in comparison to blind
supervised methods. Audio samples and code are available online.
Comment: Submitted to IWAENC 202
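The subband reverberation parameterization can be illustrated as follows; the T60-to-decay conversion and the constants are illustrative assumptions, not the paper's exact operator.

```python
import math

def subband_decay_filters(t60s, n_samples, fs=16000):
    """One exponentially decaying envelope per frequency subband.

    t60s: assumed reverberation time (seconds) per subband; these decay
          parameters are the kind of quantity estimated iteratively along
          the reverse diffusion trajectory.
    """
    filters = []
    for t60 in t60s:
        # Amplitude decays by 60 dB (a factor of 10**-3) after t60 seconds.
        alpha = 3.0 * math.log(10.0) / (t60 * fs)
        filters.append([math.exp(-alpha * n) for n in range(n_samples)])
    return filters
```

Parameterizing the operator this compactly (one decay rate per subband) is what makes blind estimation tractable: only a few scalars need to be fit against the measurement-consistency criterion.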
Speech Enhancement and Dereverberation with Diffusion-based Generative Models
In this work, we build upon our previous publication and use diffusion-based
generative models for speech enhancement. We present a detailed overview of the
diffusion process that is based on a stochastic differential equation and delve
into an extensive theoretical examination of its implications. In contrast to usual
conditional generation tasks, we do not start the reverse process from pure
Gaussian noise but from a mixture of noisy speech and Gaussian noise. This
matches our forward process which moves from clean speech to noisy speech by
including a drift term. We show that this procedure enables using only 30
diffusion steps to generate high-quality clean speech estimates. By adapting
the network architecture, we are able to significantly improve the speech
enhancement performance, indicating that the network, rather than the
formalism, was the main limitation of our original approach. In an extensive
cross-dataset evaluation, we show that the improved method can compete with
recent discriminative models and achieves better generalization when evaluating
on a different corpus than used for training. We complement the results with an
instrumental evaluation using real-world noisy recordings and a listening
experiment, in which our proposed method is rated best. Examining different
sampler configurations for solving the reverse process allows us to balance the
performance and computational speed of the proposed method. Moreover, we show
that the proposed method is also suitable for dereverberation and thus not
limited to additive background noise removal. Code and audio examples are
available online, see https://github.com/sp-uhh/sgmse
Comment: Accepted version
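A forward process of this kind, with a drift pulling clean speech toward the noisy mixture, can be simulated with a simple Euler-Maruyama loop. The exponential diffusion schedule and all constants below are illustrative choices, not the paper's exact ones.

```python
import math
import random

def forward_sde(clean, noisy, n_steps=30, gamma=1.5,
                sigma_min=0.05, sigma_max=0.5, seed=0):
    # Euler-Maruyama simulation of a forward SDE of the form
    #   dx = gamma * (y - x) dt + g(t) dw,
    # whose drift term moves the state from clean speech toward the noisy
    # mixture y while Gaussian noise is injected along the way.
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    x = list(clean)
    for k in range(n_steps):
        t = k * dt
        g = sigma_min * (sigma_max / sigma_min) ** t  # diffusion schedule g(t)
        x = [xi + gamma * (yi - xi) * dt + g * math.sqrt(dt) * rng.gauss(0.0, 1.0)
             for xi, yi in zip(x, noisy)]
    return x
```

Because the terminal state is centered on the noisy mixture rather than on pure noise, the matching reverse process can likewise start from a mixture of noisy speech and Gaussian noise, which is what allows few-step sampling.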
The SIB Swiss Institute of Bioinformatics' resources: focus on curated databases
The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) provides world-class bioinformatics databases, software tools, services and training to the international life science community in academia and industry. These solutions allow life scientists to turn the exponentially growing amount of data into knowledge. Here, we provide an overview of SIB's resources and competence areas, with a strong focus on curated databases and SIB's most popular and widely used resources. In particular, SIB's Bioinformatics resource portal ExPASy features over 150 resources, including UniProtKB/Swiss-Prot, ENZYME, PROSITE, neXtProt, STRING, UniCarbKB, SugarBindDB, SwissRegulon, EPD, arrayMap, Bgee, SWISS-MODEL Repository, OMA, OrthoDB and other databases, which are briefly described in this article.
A neural network-supported two-stage algorithm for lightweight dereverberation on hearing devices
A two-stage lightweight online dereverberation algorithm for hearing devices is presented in this paper. The approach combines a multi-channel multi-frame linear filter with a single-channel single-frame post-filter. Both components rely on power spectral density (PSD) estimates provided by deep neural networks (DNNs). By deriving new metrics analyzing the dereverberation performance in various time ranges, we confirm that directly optimizing for a criterion at the output of the multi-channel linear filtering stage results in more efficient dereverberation compared to placing the criterion at the output of the DNN to optimize the PSD estimation. More concretely, we show that training this stage end-to-end helps further remove the reverberation in the range accessible to the filter, thus increasing the early-to-moderate reverberation ratio. We argue and demonstrate that it can then be well combined with a post-filtering stage to efficiently suppress the residual late reverberation, thereby increasing the early-to-final reverberation ratio. The proposed two-stage procedure is shown to be both very effective in terms of dereverberation performance and computational demands, as compared to, e.g., recent state-of-the-art DNN approaches. Furthermore, the proposed two-stage system can be adapted to the needs of different types of hearing-device users by controlling the amount of reduction of early reflections.
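As a rough illustration of such time-range energy metrics, one can compute an early-to-final energy ratio directly on an impulse response. This is an assumed simplification for illustration, not the metric definition from the paper.

```python
import math

def early_to_final_ratio(rir, fs=16000, split_ms=50):
    # Ratio (in dB) of impulse-response energy before vs. after a split point;
    # the 50 ms split and the dB floor are assumed illustrative choices.
    k = int(fs * split_ms / 1000)
    early = sum(h * h for h in rir[:k])
    final = sum(h * h for h in rir[k:])
    return 10.0 * math.log10(max(early, 1e-12) / max(final, 1e-12))
```

A linear filtering stage mainly raises the energy in the early range (improving this ratio with a moderate split point), while a post-filter targets the residual tail, which is why the two stages complement each other.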