
1,770 research outputs found

Performance Analysis of Advanced Front Ends on the Aurora Large Vocabulary Evaluation

Author: Parihar Naveen
Publication venue: Scholars Junction
Publication date: 10/11/2003

Over the past few years, the performance of speech recognition technology on tasks ranging from isolated digit recognition to conversational speech has improved dramatically. Performance on limited recognition tasks in noise-free environments is comparable to that achieved by human transcribers. This advancement in automatic speech recognition technology, along with an increase in the compute power of mobile devices, the standardization of communication protocols, and the explosion in the popularity of mobile devices, has created an interest in flexible voice interfaces for mobile devices. However, speech recognition performance degrades dramatically in mobile environments, which are inherently noisy. In the recent past, a great amount of effort has been spent on the development of front ends based on advanced noise-robust approaches. The primary objective of this thesis was to analyze the performance of two advanced front ends, referred to as the QIO and MFA front ends, on a speech recognition task based on the Wall Street Journal database. Though the advanced front ends are shown to achieve a significant improvement over an industry-standard baseline front end, this improvement is not operationally significant. Further, we show that the results of this evaluation were not significantly impacted by suboptimal recognition system parameter settings. Without any front end-specific tuning, the MFA front end outperforms the QIO front end by 9.6% relative. With tuning, the relative performance gap increases to 15.8%. Finally, we also show that mismatched microphone and additive noise evaluation conditions resulted in a significant degradation in performance for both front ends.

Mississippi State University Libraries ETD database

Scholars Junction - Mississippi State University Institutional Repository

Exploration and Optimization of Noise Reduction Algorithms for Speech Recognition in Embedded Devices

Author: Setiawan Panji
Publication venue: Universität der Bundeswehr München, Fakultät für Elektrotechnik und Informationstechnik
Publication date: 01/01/2009

Environmental noise present in real-life applications substantially degrades the performance of speech recognition systems. An example is an in-car scenario where a speech recognition system has to support the man-machine interface. Several sources of noise, coming from the engine, wipers, wheels, etc., interact with speech. An open-window scenario poses a particular challenge, since traffic noise, park noise, etc., must also be taken into account. The main goal of this thesis is to improve the performance of a speech recognition system based on a state-of-the-art hidden Markov model (HMM) using noise reduction methods. The performance is measured with respect to word error rate and with the method of mutual information. The noise reduction methods are based on weighting rules. Least-squares weighting rules in the frequency domain have been developed to enable continuous development based on the existing system and to guarantee low complexity and a small footprint for applications in embedded devices. The weighting rule parameters are optimized with a multidimensional Monte Carlo optimization followed by a compass search. Root compression and cepstral smoothing methods have also been implemented to boost the recognition performance. The additional complexity and memory requirements of the proposed system are minimal. The performance of the proposed system was compared to the European Telecommunications Standards Institute (ETSI) standardized system. The proposed system outperforms the ETSI system by up to 8.6% relative increase in word accuracy and achieves up to 35.1% relative increase in word accuracy compared to the existing baseline system on the ETSI Aurora 3 German task. A relative increase of up to 18% in word accuracy over the existing baseline system is also obtained from the proposed weighting rules on large vocabulary databases.
An entropy-based feature vector analysis method has also been developed to assess the quality of feature vectors. The entropy estimation is based on the histogram approach. The method has the advantage of objectively assessing the feature vector quality regardless of the acoustic modeling assumption used in the speech recognition system.
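
The histogram approach to entropy estimation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual estimator, so everything below is an assumption made for illustration: the function name `histogram_entropy`, the equal-width binning, the default of 16 bins, and the treatment of a single feature dimension at a time. It computes the discrete Shannon entropy of the bin occupancies, not a differential entropy.

```python
import math
from collections import Counter


def histogram_entropy(values, num_bins=16):
    """Estimate Shannon entropy (in bits) of a 1-D sample.

    Hypothetical helper: bins the sample into `num_bins` equal-width
    bins and returns the entropy of the resulting bin frequencies.
    """
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    if hi == lo:
        # All samples fall into a single bin, so there is no uncertainty.
        return 0.0
    width = (hi - lo) / num_bins
    # Map each value to a bin index; the maximum value goes into the last bin.
    counts = Counter(min(int((v - lo) / width), num_bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Applied to one component of a feature vector across many frames, a constant sequence yields zero entropy, while a sample spread evenly over all bins yields the maximum of `log2(num_bins)` bits. The estimate depends on the bin count, so values are only comparable when the binning is held fixed.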

Universität der Bundeswehr München: AtheneForschung

Self-Calibration Methods for Uncontrolled Environments in Sensor Networks: A Reference Survey

Author: Badache Nadjib
Barcelo-Ordinas Jose M.
Doudou Messaud
Garcia-Vidal Jorge
Publication venue: Elsevier BV
Publication date: 01/01/2019

Steady progress in sensor technology has expanded the number and range of low-cost, small, and portable sensors on the market, increasing the number and type of physical phenomena that can be measured with wirelessly connected sensors. Large-scale deployments of wireless sensor networks (WSN) involving hundreds or thousands of devices and limited budgets often constrain the choice of sensing hardware, which generally has reduced accuracy, precision, and reliability. Therefore, it is challenging to achieve good data quality and maintain error-free measurements during the whole system lifetime. Self-calibration or recalibration in ad hoc sensor networks to preserve data quality is essential, yet challenging, for several reasons, such as the existence of random noise and the absence of suitable general models. Calibration performed in the field, without accurate and controlled instrumentation, is said to be in an uncontrolled environment. This paper provides current and fundamental self-calibration approaches and models for wireless sensor networks in uncontrolled environments.

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Developing a Conversation Assistant for the Hearing Impaired Using Automatic Speech Recognition (Keskusteluavustimen kehittäminen kuulovammaisia varten automaattista puheentunnistusta käyttäen)

Author: Lukkarila Juri
Publication date: 11/12/2017

Understanding and participating in conversations has been reported as one of the biggest challenges hearing impaired people face in their daily lives. These communication problems have been shown to have wide-ranging negative consequences, affecting their quality of life and the opportunities available to them in education and employment. A conversational assistance application was investigated to alleviate these problems. The application uses automatic speech recognition technology to provide real-time speech-to-text transcriptions to the user, with the goal of helping deaf and hard of hearing persons in conversational situations. To validate the method and investigate its usefulness, a prototype application was developed for testing purposes using open-source software. A user test was designed and performed with test participants representing the target user group. The results indicate that the Conversation Assistant method is valid, meaning it can help the hearing impaired to follow and participate in conversational situations. Speech recognition accuracy, especially in noisy environments, was identified as the primary target for further development to increase the usefulness of the application. Recognition speed, by contrast, was deemed sufficient and already clearly surpasses the transcription speed of human transcribers.

Aaltodoc Publication Archive

Hybrid wheelchair controller for handicapped and quadriplegic patients

Author: Al-Okby Mohammed Faeik Ruzaij (gnd: 1151235431)
Publication venue: Universität Rostock

In this dissertation, a hybrid wheelchair controller for handicapped and quadriplegic patients is proposed. The system has two sub-controllers: the voice controller and the head tilt controller. The system aims to help quadriplegic, handicapped, elderly, and paralyzed patients control a robotic wheelchair using voice commands and head movements instead of a traditional joystick controller. The multi-input design makes the system flexible enough to adapt to the body signals available from each patient. A low-cost design is also taken into consideration, as it allows more patients to use this system.

Rostocker Dokumentenserver

Speech Recognition

Publication venue: IntechOpen
Publication date: 20/04/2021

Chapters in the first part of the book cover the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and the methods for speech feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications that are able to operate in real-world environments, such as mobile communication services and smart homes.

Directory of Open Access Books (DOAB)

In Car Audio

Author: A. Lattanzi
A. Primavera
C. Levy
C. Pilato
D. Sciuto
E. Capucci
E. Ciavattini
F. Bettarelli
F. Capman
F. Ferrandi
F. Piazza
J.F. Bonastre
J.G.F. Coutinho
L. Palestrini
M. Lattuada
P. Peretti
R. Toppi
S. Cecchi
S. Thabuteau
W. Luk
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

This chapter presents implementations of advanced in-car audio applications. The system is composed of three main applications concerning the in-car listening and communication experience. Starting from a high-level description of the algorithms, several implementations at different levels of hardware abstraction are presented, along with empirical results on both the design process and the performance achieved.

Archivio istituzionale della ricerca - Politecnico di Milano

Wireless sensor systems in indoor situation modeling II (WISM II)

Publication venue: Vaasan yliopisto
Publication date: 01/01/2013

Not peer-reviewed

Osuva

Augmented Reality

Publication venue: IntechOpen
Publication date: 20/04/2021

Augmented Reality (AR) is a natural development from virtual reality (VR), which was developed several decades earlier. AR complements VR in many ways. Because the user can see both real and virtual objects simultaneously, AR is far more intuitive, but it is not completely free of human factors and other restrictions. AR applications also take less time and effort to build, because they do not require constructing the entire virtual scene and environment. In this book, several new and emerging application areas of AR are presented and divided into three sections. The first section contains applications in outdoor and mobile AR, such as construction, restoration, security, and surveillance. The second section deals with AR in medicine, biology, and the human body. The third and final section contains a number of new and useful applications in daily living and learning.

Directory of Open Access Books (DOAB)