On the Correlation between Energy and Pitch Accent in Read English Speech
In this paper, we describe a set of experiments that examine the correlation between energy and pitch accent. We tested the discriminative power of the energy component of frequency sub-bands with a variety of frequencies and bandwidths on read speech spoken by four native speakers of Standard American English, using an analysis-by-classification approach. We found that the frequency region most robust to speaker differences is between 2 and 20 bark. Across all speakers, using only energy features, we were able to predict pitch accent in read speech with an accuracy of 81.9%.
Perceptually Motivated Wavelet Packet Transform for Bioacoustic Signal Enhancement
A significant and often unavoidable problem in bioacoustic signal processing is the presence of background noise due to an adverse recording environment. This paper proposes a new bioacoustic signal enhancement technique which can be used on a wide range of species. The technique is based on a perceptually scaled wavelet packet decomposition using a species-specific Greenwood scale function. Spectral estimation techniques, similar to those used for human speech enhancement, are used for estimation of clean signal wavelet coefficients under an additive noise model. The new approach is compared to several other techniques, including basic bandpass filtering as well as classical speech enhancement methods such as spectral subtraction, Wiener filtering, and Ephraim–Malah filtering. Vocalizations recorded from several species are used for evaluation, including the ortolan bunting (Emberiza hortulana), rhesus monkey (Macaca mulatta), and humpback whale (Megaptera novaeangliae), with both additive white Gaussian noise and environment recording noise added across a range of signal-to-noise ratios (SNRs). Results, measured by both SNR and segmental SNR of the enhanced waveforms, indicate that the proposed method outperforms other approaches for a wide range of noise conditions.
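The two evaluation measures named in the abstract above, SNR and segmental SNR, can be illustrated with a minimal sketch. This is not the paper's evaluation code: the frame length of 256 samples, the clamping range of -10 to 35 dB, and the function names `snr_db` and `segmental_snr_db` are all assumptions chosen for illustration.

```python
import math


def snr_db(clean, enhanced, eps=1e-12):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    signal = sum(s * s for s in clean)
    noise = sum((s - e) ** 2 for s, e in zip(clean, enhanced))
    return 10.0 * math.log10((signal + eps) / (noise + eps))


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0, eps=1e-12):
    """Mean of per-frame SNRs in dB over non-overlapping frames.

    frame_len, lo and hi are illustrative assumptions, not values
    taken from the paper. Each frame SNR is clamped to [lo, hi] so
    that silent or perfectly reconstructed frames do not dominate.
    """
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal = sum(s * s for s in c)
        noise = sum((s - t) ** 2 for s, t in zip(c, e))
        frame_snr = 10.0 * math.log10((signal + eps) / (noise + eps))
        values.append(min(max(frame_snr, lo), hi))
    if not values:
        raise ValueError("signals are shorter than one frame")
    return sum(values) / len(values)
```

Because segmental SNR averages in the log domain frame by frame, a short burst of residual noise lowers it less than it lowers the global SNR, which is one reason the two measures are often reported together.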
Teager energy based feature parameters for speech recognition in car noise
In this letter, a new set of speech feature parameters based on multirate signal processing and the Teager energy operator is introduced. The speech signal is first divided into nonuniform subbands in mel-scale using a multirate filterbank, then the Teager energies of the subsignals are estimated. Finally, the feature vector is constructed by log-compression and inverse discrete cosine transform (DCT) computation. The new feature parameters have robust speech recognition performance in the presence of car engine noise.
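As a rough illustration of the Teager energy step in the abstract above, the sketch below applies the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], to one subband signal and log-compresses its mean magnitude. The mel-scale multirate filterbank and the inverse DCT are not shown, and the function names `teager_energy` and `log_mean_teager` as well as the `floor` parameter are assumptions, not the letter's own formulation.

```python
import math


def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1].

    Returned for interior samples only, so the output has
    len(x) - 2 values.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def log_mean_teager(x, floor=1e-12):
    """Log-compressed mean absolute Teager energy of one subband signal.

    floor is a hypothetical guard against log(0) on silent input.
    """
    psi = teager_energy(x)
    if not psi:
        raise ValueError("need at least three samples")
    mean_abs = sum(abs(v) for v in psi) / len(psi)
    return math.log(max(mean_abs, floor))
```

For a pure sinusoid A*cos(w*n), the operator returns the constant A^2 * sin(w)^2, so it tracks both amplitude and frequency, unlike the plain squared-sample energy.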
Improved compactly computable objective measures for predicting the acceptability of speech communications systems
Issued as Monthly status reports [1-7], and Final report, Project no. E-21-61
Falling person detection using multisensor signal processing
Falls are one of the most important problems for frail and elderly people living independently. Early detection of falls is vital to provide a safe and active lifestyle for the elderly. Sound, passive infrared (PIR), and vibration sensors can be placed in a supportive home environment to provide information about the daily activities of an elderly person. In this paper, signals produced by sound, PIR, and vibration sensors are simultaneously analyzed to detect falls. Hidden Markov Models (HMMs) are trained for regular and unusual activities of an elderly person and a pet for each sensor signal. The decisions of the HMMs are fused together to reach a final decision.
Breathing Signature as Vitality Score Index Created by Exercises of Qigong: Implications of Artificial Intelligence Tools Used in Traditional Chinese Medicine.
Rising concerns about the short- and long-term detrimental consequences of administration of conventional pharmacopeia are fueling the search for alternative, complementary, personalized, and comprehensive approaches to human healthcare. Qigong, a form of Traditional Chinese Medicine, represents a viable alternative approach. Here, we start with the practical, philosophical, and psychological background of Ki (in Japanese) or Qi (in Chinese) and their relationship to Qigong theory and clinical application. Noting the drawbacks of the current state of Qigong clinic, we propose that, to manage the unique aspects of the Eastern "non-linearity" and "holistic" approach, it needs to be integrated with the Western "linearity" and "one-direction" approach. This is done by developing the concept of "Qigong breathing signatures," which can define our life breathing patterns associated with diseases using machine learning technology. We predict that this can be achieved by establishing an artificial intelligence (AI)-Medicine training camp of databases, which will integrate Qigong-like breathing patterns with different pathologies unique to individuals. Such an integrated connection will allow the AI-Medicine algorithm to identify breathing patterns and guide medical intervention. This unique view of potentially connecting Eastern Medicine and Western Technology can add novel insight to our current understanding of both Western and Eastern medicine, thereby establishing a vitality score index (VSI) that can predict the outcomes of lifestyle behaviors and medical conditions.
Analysis and detection of human emotion and stress from speech signals
Ph.D., Doctor of Philosophy
Bimodal Emotion Recognition using Speech and Physiological Changes
With exponentially evolving technology it is no exaggeration to say that any interface fo