Search CORE

1,325 research outputs found

A voice activity detection algorithm with sub-band detection based on time-frequency characteristics of mandarin

Author: Huang Shaoguang
Wang Yinfeng
Wei Ying
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

Voice activity detection algorithms are widely used in voice compression, speech synthesis, speech recognition, and speech enhancement, among other areas. In this paper, an efficient voice activity detection algorithm with sub-band detection based on the time-frequency characteristics of Mandarin is proposed. The proposed sub-band detection consists of two parts: crosswise detection and lengthwise detection. Both energy detection and pitch detection are taken into consideration. For better performance, a double-threshold criterion is used to reduce the misjudgment rate of the detection. Performance evaluation is based on six noise environments with different SNRs. Experimental results indicate that the proposed algorithm can detect voice regions effectively in non-stationary and low-SNR environments and has the potential to progress
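The double-threshold idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain short-time energy on non-overlapping frames and a simple hysteresis rule; the function names, the `low`/`high` parameters, and the frame layout are hypothetical and are not taken from the paper, whose sub-band and pitch detection are not reproduced here.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame (trailing partial frame dropped)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def double_threshold_vad(energies, low, high):
    """Label each frame as speech (True) or non-speech (False) with hysteresis.

    Assumed rule: a speech segment starts when the energy rises above `high`
    and ends only when it falls below `low`, so brief dips between the two
    thresholds do not split a segment.
    """
    labels = []
    active = False
    for e in energies:
        if not active and e > high:
            active = True
        elif active and e < low:
            active = False
        labels.append(active)
    return labels


if __name__ == "__main__":
    energies = [0.0, 0.3, 0.9, 0.4, 0.3, 0.05, 0.3]
    print(double_threshold_vad(energies, low=0.1, high=0.5))
```

Using two thresholds instead of one is what keeps frames with intermediate energy from toggling the decision back and forth, which is one plausible reading of how the criterion lowers the misjudgment rate.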

Ghent University Academic Bibliography

An efficient voice activity detection algorithm by combining statistical model and energy detection

Author: A Benyassine
A Davis
B Schölkopf
B-F Wu
D Kim
E Nemer
G Evangelopoulos
G Ying
ITU-T Rec P.48
ITU-T Rec P.56
J Garofolo
J Ramírez
J Shen
J Sohn
JG Wilpon
JH Chang
JW Shin
K Li
L Huang
LR Rabiner
Q Jo
Q Li
R Chengalvarayan
R Le Bouquin-Jeannès
R Tahmasbi
S Gazor
S Kang
S Kay
S Kuroiwa
T Yu
TV Pham
Y Ephraim
Publication venue: 'Springer Science and Business Media LLC'

Crossref

A non-linear VAD for noisy environments

Author: A Hyvärinen
AJ Stam
CE Shannon
E-K Kim
G Altmann
J Solé-Casals
J-F Cardoso
JM Górriz
Jordi Solé-Casals
K Ozeki
Vladimir Zaiats
VV Buldygin
W Härdle
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2010

This paper deals with non-linear transformations for improving the performance of an entropy-based voice activity detector (VAD). The idea of using a non-linear transformation has already been applied in speech linear prediction, or linear predictive coding (LPC), based on source separation techniques, where a score function is added to the classical equations in order to take into account the true distribution of the signal. We explore the possibility of estimating the entropy of frames after calculating their score function, instead of using the original frames. We observe that if the signal is clean, the estimated entropy is essentially the same; if the signal is noisy, however, the frames transformed using the score function may give an entropy that differs between voiced and nonvoiced frames. Experimental evidence is given to show that this fact enables voice activity detection under high noise, where the simple entropy method fails
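As a rough illustration of the per-frame entropy estimate that an entropy-based VAD relies on, the sketch below computes a histogram-based Shannon entropy of a frame's sample amplitudes. The histogram estimator, the function name `frame_entropy`, and the `num_bins` parameter are assumptions made for illustration; the abstract does not say which estimator the authors use, and the score-function transformation is not implemented here.

```python
import math


def frame_entropy(frame, num_bins=8):
    """Shannon entropy (in bits) of the amplitude histogram of a non-empty frame.

    Assumed estimator: equal-width bins between the frame's minimum and
    maximum; a constant frame has zero entropy.
    """
    lo, hi = min(frame), max(frame)
    if hi == lo:
        return 0.0
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for x in frame:
        # The maximum value would land in bin `num_bins`, so clamp it to the last bin.
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    n = len(frame)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


if __name__ == "__main__":
    print(frame_entropy([0, 1, 2, 3], num_bins=4))
    print(frame_entropy([1, 1, 1, 1]))
```

A detector built on this would compare each frame's entropy with a threshold; in the approach the abstract describes, the same estimate would be applied to the transformed frames rather than the original ones.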

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

RIUVic

Features for voice activity detection: a comparative analysis

Author: Gerhard Schmidt
Markus Buck
Simon Graf
Tobias Herbig
Publication venue: Springer Nature
Publication date: 01/01/2015

Springer - Publisher Connector

Speech Endpoint Detection: An Image Segmentation Approach

Author: Faris Nesma
Publication venue: 'University of Waterloo'
Publication date: 01/01/2013

Speech Endpoint Detection, also known as Speech Segmentation, is an unsolved problem in speech processing that affects numerous applications, including robust speech recognition. The task is not as trivial as it appears, and most existing algorithms degrade at low signal-to-noise ratios (SNRs). Most previous research has focused on developing robust algorithms, with special attention paid to the derivation and study of noise-robust features and decision rules. This research tackles the endpoint detection problem in a different way and proposes a novel speech endpoint detection algorithm derived from the Chan-Vese algorithm for image segmentation. The proposed algorithm can fuse multiple features extracted from the speech signal to enhance detection accuracy. Its performance has been evaluated and compared with two widely used speech detection algorithms under various noise environments with SNR levels ranging from 0 dB to 30 dB. Furthermore, the proposed algorithm has also been applied to different types of American English phonemes. The experiments show that, even under conditions of severe noise contamination, the proposed algorithm is more efficient than the reference algorithms

University of Waterloo's Institutional Repository

Detecting emotions from speech using machine learning techniques

Author: Roy Tanmoy
Publication date: 01/01/2019

D.Phil. (Electronic Engineering)

University of Johannesburg Institutional Repository

Characterization of damage evolution on metallic components using ultrasonic non-destructive methods

Author: Piñal Moctezuma Juan Fernando
Publication venue: Universitat Politècnica de Catalunya
Publication date: 27/09/2019

When fatigue is considered, structures and machinery are expected to fail eventually. When such damage is unexpected, however, it has a negative economic impact and can also put lives at risk. It is therefore imperative that infrastructure managers schedule regular inspection and maintenance of their assets, and that designers and materials manufacturers have access to appropriate diagnostic tools in order to build superior and more reliable materials. In this regard, and for a number of applications, non-destructive evaluation techniques have proven to be an efficient and helpful alternative to traditional destructive assays of materials. In materials design in particular, researchers have recently exploited the Acoustic Emission (AE) phenomenon as an additional assessment tool with which to characterize the mechanical properties of specimens. Nevertheless, several challenges arise when treating this phenomenon, since its intensity, duration and arrival behavior are essentially stochastic for traditional signal processing means, leading to inaccuracies in the resulting assessment. This dissertation focuses on assisting in the characterization of the mechanical properties of advanced high strength steels during uniaxial tensile tests. Of particular interest is detecting the nucleation and growth of a crack throughout the test. Therefore, the AE waves generated by the specimen during the test are assessed with the aim of characterizing their evolution. To this end, the introduction gives a brief review of non-destructive methods, with emphasis on the AE phenomenon. Next, an exhaustive analysis is presented of the challenges and deficiencies of detecting and segmenting each AE event in a continuous data stream with the traditional threshold detection method and with current state-of-the-art methods.
A novel AE event detection method is then proposed with the aim of overcoming these limitations. Evidence showed that the proposed method, which is based on the short-time features of the waveform of the AE signal, exceeds the detection capabilities of current state-of-the-art methods in onset and end-time precision, and also in quality of detection and computational speed. Finally, a methodology for analyzing the evolution of the frequency spectrum of the AE phenomenon during the tensile test is proposed. Results indicate that it is feasible to correlate the nucleation and growth of a crack with the evolution of the frequency content of AE events.
Postprint (published version)
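The traditional threshold detection that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration under stated assumptions: a fixed amplitude threshold and a `hold` parameter (the number of consecutive below-threshold samples after which an event is closed). Both parameters and the function name are hypothetical, and the dissertation's own detection method is not reproduced.

```python
def detect_events(samples, threshold, hold):
    """Segment a sample stream into events by fixed-threshold crossing.

    Returns a list of (onset, end) index pairs, both inclusive. An event
    starts at the first sample whose magnitude reaches `threshold` and is
    closed once `hold` samples have passed since the last crossing.
    """
    events = []
    onset = None
    last_above = None
    for i, x in enumerate(samples):
        if abs(x) >= threshold:
            if onset is None:
                onset = i
            last_above = i
        elif onset is not None and i - last_above >= hold:
            events.append((onset, last_above))
            onset = None
    if onset is not None:
        # The stream ended while an event was still open.
        events.append((onset, last_above))
    return events


if __name__ == "__main__":
    stream = [0, 0, 5, 1, 6, 0, 0, 0, 7, 0]
    print(detect_events(stream, threshold=4, hold=2))
```

The sketch also hints at why a fixed threshold is fragile: the reported onset and end depend directly on the chosen `threshold` and `hold`, so weak or overlapping events are easily missed or merged.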

UPCommons. Portal del coneixement obert de la UPC

AN ADAPTIVE VOICE ACTIVITY DETECTION ALGORITHM

Author: Junqin Huang
Zhigang Zhang
Publication venue: 'Exeley, Inc.'

Exeley Inc.