25 research outputs found

Intelligibility model optimisation approaches for speech pre-enhancement

Author: Al Dabel Maryam
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 13/12/2016

The goal of improving the intelligibility of broadcast speech is being met by a recent direction in speech enhancement: near-end intelligibility enhancement. In contrast to the conventional speech enhancement approach, which processes the corrupted speech at the receiver side of the communication chain, the near-end intelligibility enhancement approach pre-processes the clean speech at the transmitter side, i.e. before it is played into the environmental noise. In this work, we describe an optimisation-based approach to near-end intelligibility enhancement that uses models of speech intelligibility to improve the intelligibility of speech in noise. This thesis first presents a survey of speech intelligibility models and of how adverse acoustic conditions affect the intelligibility of speech. The purpose of this survey is to identify models that we can adopt in the design of the pre-enhancement system. Then, we investigate the strategies humans use to increase speech intelligibility in noise, and relate these strategies to existing algorithms for near-end intelligibility enhancement. A closed-loop feedback approach to near-end intelligibility enhancement is then introduced. In this framework, speech modifications are guided by a model of intelligibility. For the closed-loop system to work, we develop a simple spectral modification strategy that modifies the first few coefficients of an auditory cepstral representation so as to maximise an intelligibility measure. We experiment with two contrasting measures of objective intelligibility. The first, used as a baseline, is an audibility measure named 'glimpse proportion', computed as the proportion of the spectro-temporal representation of the speech signal that is free from masking. We then propose a discriminative intelligibility model, building on the principles of missing data speech recognition, to model the likelihood of specific phonetic confusions that may occur when speech is presented in noise.
The discriminative intelligibility measure is computed using a statistical model of speech from the speaker that is to be enhanced. Interim results showed that, unlike the glimpse-proportion-based system, the discriminative system did not improve intelligibility. We investigated the reason and found that the discriminative system was not able to target the phonetic confusions with fixed spectral shaping. To address this, we introduce a time-varying spectral modification. We also propose to perform the optimisation on a segment-by-segment basis, which makes the solution robust to fluctuating noise. We further combine our system with a noise-independent enhancement technique, namely dynamic range compression. We found a significant improvement in the non-stationary noise condition, but no significant differences from the state-of-the-art system (spectral shaping and dynamic range compression) were found in the stationary noise condition.

White Rose E-theses Online

Detecting autism, emotions and social signals using AdaBoost

Author: Busa-Fekete Róbert
Gosztolya Gábor
Tóth László
Publication venue: Interspeech
Publication date: 01/01/2013

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Transducer models in the ultrasound simulation program FIELD II and their accuracy

Author: Bæk David
Jensen Jørgen Arendt
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2010

Crossref

Online Research Database In Technology

Simultaneous measurements of room-acoustic parameters using different measuring equipment?

Author: Gade Anders Christian
Halmrast Tor
Winsvold Bjorn
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/1998

Crossref

Online Research Database In Technology

Compromises in orchestra pit design: A ten-year trench war in The Royal Theatre, Copenhagen

Author: Gade Anders Christian
Mortensen Bo
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/1998

Crossref

Online Research Database In Technology

Historical Acoustics: Relationships between People and Sound over Time

Author: Aletta Francesco
Kang Jian
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

This book is a collection of contributions to the Special Issue “Historical Acoustics: Relationships between People and Sound over Time”. The research presented here aims to explore the origins of acoustics and examine the relationships that have evolved over the centuries between people and auditory phenomena. Sounds have indeed accompanied human civilizations since the beginning of time, helping them to make sense of the world and to shape their cultures. Several key topics emerged, such as the acoustics of historical worship buildings, the acoustics of sites of archaeological interest, the acoustics of historical opera houses, and the topic of soundscapes as cultural intangible heritage. The book, as a whole, reflects the vibrant research activity around the “acoustics of the past”, which will hopefully serve as a foundation for inspiring the future path of this discipline.

UCL Discovery

Directory of Open Access Books (DOAB)

Design and Optimisation Of Voice Alarm Systems for Underground Stations

Author: Gomez-Agustina L
Publication venue: London South Bank University
Publication date: 01/01/2012

Voice Alarm (VA) systems are an essential part of subsurface underground station emergency and evacuation systems. Their main purpose is to assist in the management of emergency situations and evacuation procedures by providing key verbal instructions to the occupants. However, these life-critical systems will be ineffective if the messages broadcast are unintelligible. Unfortunately, in most London underground subsurface areas the announcements broadcast by the VA system are not adequately intelligible and often do not reach a minimum specified performance target. The performance of VA relating to its electro-acoustic characteristics is relatively complex and depends on multiple interrelated factors and operational constraints. Underground stations present complex geometrical and architectural features which severely challenge the achievement of satisfactory performance. Despite the importance of VA systems, there are few works in the literature providing practical and applicable design knowledge in the context of real-world underground spaces. Moreover, contractual performance requirements are not suitably laid out, and this can lead to ineffective designs. This research aims to provide practical design knowledge and understanding for the improvement of VA speech intelligibility performance in underground spaces. Research results were derived from measurements and designs undertaken for real scenarios. A specific knowledge base is provided on the acoustics of underground spaces, speech intelligibility and VA systems. A critical review of relevant research and performance specifications and standards is undertaken and a new performance design parameter is proposed. An empirical prediction model tool based on a large pool of measured survey data is developed for the prediction of the Speech Transmission Index of VA on platforms.
A validation and comparison study is undertaken for two widely used commercial acoustic simulation programs, CATT-Acoustic and Odeon, to assess their suitability as design tools for VA systems on platforms. The impact of design variables on VA performance is investigated using a computer simulation of a representative platform. A novel acoustic treatment design concept is proposed. The Yang quasi-diffuse sound field theory for platforms is verified and the derived knowledge expanded. Practical design recommendations are provided, as well as suggestions for further work.

LSBU Research Open