339 research outputs found

Voice pathology detection using interlaced derivative pattern on glottal source excitation

Author: Al-nasheri Ahmed
Ali Zulfiqar
Alsulaiman Mansour
Bencherif Mohamed A.
Farahat Mohamed
Malki Khalid H.
Mesallam Tamer A.
Muhammad Ghulam
Publication venue: 'Elsevier BV'
Publication date: 31/01/2017

Digital Signal Processing Research Program

Author: Baggeroer Arthur B.
Beckmann Paul E.
Bhatta Saurav Dev
Buck John R.
Cuomo Kevin M.
Isabelle Steven H.
Jachner Jacek
Lam Warren M.
Musicus Bruce R.
Njeru James M.
Oppenheim Alan V.
Papadopoulos Haralabos C.
Preisig James C.
Rajan S. D.
Richard Michael D.
Scherock Stephen F.
Singer Andrew C.
Spiesberger John L.
Weinstein Ehud
Wornell Gregory W.
Zangi Kambiz C.
Publication venue: Research Laboratory of Electronics (RLE) at the Massachusetts Institute of Technology (MIT)

Contains table of contents for Section 2, an introduction and reports on seventeen research projects.

U.S. Navy - Office of Naval Research Grant N00014-91-J-1628
Vertical Arrays for the Heard Island Experiment Award No. SC 48548
Charles S. Draper Laboratories, Inc. Contract DL-H-418472
Defense Advanced Research Projects Agency/U.S. Navy - Office of Naval Research Grant N00014-89-J-1489
Rockwell Corporation Doctoral Fellowship
MIT - Woods Hole Oceanographic Institution Joint Program
Defense Advanced Research Projects Agency/U.S. Navy - Office of Naval Research Grant N00014-90-J-1109
Lockheed Sanders, Inc./U.S. Navy - Office of Naval Research Contract N00014-91-C-0125
U.S. Air Force - Office of Scientific Research Grant AFOSR-91-0034
AT&T Laboratories Doctoral Program
U.S. Navy - Office of Naval Research Grant N00014-91-J-1628
General Electric Foundation Graduate Fellowship in Electrical Engineering
National Science Foundation Grant MIP 87-14969
National Science Foundation Graduate Fellowship
Canada Natural Sciences and Engineering Research Council
Lockheed Sanders, Inc

DSpace@MIT

Single channel overlapped-speech detection and separation of spontaneous conversations

Author: Kadhim Hasan Mohammad-Ali
Publication date: 01/01/2018

PhD Thesis

This thesis considers spontaneous conversation containing both speech mixture and speech dialogue. Speech mixture refers to speakers talking simultaneously (i.e. overlapped speech). Speech dialogue refers to segments in which only one speaker is active and the other is silent. The input conversation is first processed by overlapped-speech detection, which segregates it into two output signals, one in dialogue format and one in mixture format. The dialogue is processed by speaker diarization, whose outputs are the individual speech of each speaker. The mixture is processed by speech separation, whose outputs are independent separated speech signals of the speakers. When the separation input contains only the mixture, a blind speech separation approach is used. When the separation is assisted by the outputs of the speaker diarization, it is informed speech separation.

The research presents a novel overlapped-speech detection algorithm and two speech separation algorithms. The proposed overlapped-speech detection algorithm estimates the switching instants of the input. An optimization loop is adapted to select the best encapsulated audio features and to discard the worst. The optimization relies on principles of pattern recognition and on k-means clustering. Across 300 simulated conversations, the average False-Alarm Error is 1.9%, the average Missed-Speech Error is 0.4%, and the average Overlap-Speaker Error is 1%. These errors are approximately equal to the best recent results on reliable speaker diarization corpora.

The proposed blind speech separation algorithm consists of four sequential techniques: filter-bank analysis, Non-negative Matrix Factorization (NMF), speaker clustering and filter-bank synthesis. Instead of the usually required speaker segmentation, an effective standard framing is contributed. The average objective test scores (SAR, SDR and SIR) over 51 simulated conversations are 5.06 dB, 4.87 dB and 12.47 dB respectively.

For the proposed informed speech separation algorithm, the outputs of the speaker diarization form a generated database. The database assists the speech separation by creating virtual target-speech and mixture signals. The contributed virtual signals are trained to facilitate the separation by homogenising them with the NMF-matrix elements of the real mixture. A contributed masking step optimizes the resulting speech. The average SAR, SDR and SIR over 341 simulated conversations are 9.55 dB, 1.12 dB and 2.97 dB respectively. According to the objective tests, the two speech separation algorithms fall in the mid-range of well-known NMF-based audio and speech separation methods.
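The NMF step named in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes the standard Lee-Seung multiplicative updates for the Euclidean cost, a toy non-negative matrix standing in for a magnitude spectrogram, and a hand-picked grouping of components per speaker in place of the speaker clustering stage. The names `nmf`, `soft_masks` and `groups` are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (bins x frames) as W @ H.

    Uses Lee-Seung multiplicative updates for the Euclidean cost;
    this choice of update rule is an assumption, not taken from the thesis.
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = rng.random((n_bins, rank)) + eps    # spectral basis vectors
    H = rng.random((rank, n_frames)) + eps  # per-frame activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def soft_masks(W, H, groups, eps=1e-9):
    """Build one ratio mask per speaker from groups of NMF components."""
    full = W @ H + eps
    return [(W[:, g] @ H[g, :]) / full for g in groups]

# Toy stand-in for a mixture spectrogram: 16 bins, 40 frames, rank 4.
rng = np.random.default_rng(1)
V = rng.random((16, 4)) @ rng.random((4, 40))

W, H = nmf(V, rank=4)
# Hypothetical assignment: components 0-1 to speaker A, 2-3 to speaker B.
masks = soft_masks(W, H, groups=[[0, 1], [2, 3]])
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Multiplying each mask element-wise with the mixture spectrogram gives one estimated source per speaker. In a full pipeline the component grouping would come from a clustering stage rather than being fixed by hand.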

Newcastle University eTheses

Models and analysis of vocal emissions for biomedical applications

Publication venue: 'Firenze University Press'
Publication date: 31/05/2022

This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies

Directory of Open Access Books (DOAB)