211 research outputs found

Limited-data automatic speaker verification algorithm using band-limited phase-only correlation function

Author: Becerra Sánchez Aldonso
De la Rosa Vargas José Ismael
Pedroza Ramírez Angel David
Villa Hernández José de Jesús
Publication venue: 'The Scientific and Technological Research Council of Turkey'
Publication date: 01/07/2019

In this paper, a new method for automatic speaker verification based on band-limited phase-only correlation (BLPOC) is proposed. The aim of this study is to validate the use of the BLPOC function as a new limited-data automatic speaker verification technique. Although some speaker verification techniques have high accuracy, their efficiency usually depends on the extraction of complex theoretical information from speech signals and on the amount of data available for training the algorithms. The BLPOC function is a high-accuracy biometric technique traditionally implemented in human identification by fingerprints (through image-matching)

Caxcan Repositorio Institucional de la Universidad Autónoma de Zacatecas

System level modelling with open source tools

Author: Hansen Jan
Jakobsen Mikkel Koefoed
Madsen Jan
Niaki Seyed Hosein Attarzadeh
Sander Ingo
Publication date: 01/01/2012

Online Research Database In Technology

A large-scale analysis of the acoustic-phonetic markers of speaker sex.

Author: Dempster Gavin John
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 01/01/1996

The research for this thesis lies within the field of speaker characterisation through the acoustic-phonetic analysis of speech. The thesis consists of two parts: 1. An investigation of the acoustic-phonetic differences between the speech of women and men; 2. An examination of the practicalities of automating the investigation to analyse a large speech database. The acoustic-phonetic markers of speaker sex examined here are the fundamental frequency, the formant frequencies, and the relative amplitude of the first harmonic. The aims of the investigation were, firstly, to establish to what extent these markers differentiate between the sexes, and secondly, to examine the extent of between- and within-speaker deviation from the female and male norms, or average values for each sex. These points were investigated by an automated acoustic-phonetic analysis of the TIMIT database, involving a data set of almost 16,000 segments of speech. An automated method was developed to enable the signal processing and statistical analysis of a data set of this size. The problems to be encountered in the analysis of a highly variable data source (i.e. the acoustic speech waveform) are addressed

White Rose E-theses Online

OpenGrey Repository

Acoustic Approaches to Gender and Accent Identification

Author: DeMarco Andrea
Publication date: 01/06/2015

There has been considerable research on the problems of speaker and language recognition from samples of speech. A less researched problem is that of accent recognition. Although this is a similar problem to language identification, different accents of a language exhibit more fine-grained differences between classes than languages. This presents a tougher problem for traditional classification techniques. In this thesis, we propose and evaluate a number of techniques for gender and accent classification. These techniques are novel modifications and extensions to state of the art algorithms, and they result in enhanced performance on gender and accent recognition. The first part of the thesis focuses on the problem of gender identification, and presents a technique that gives improved performance in situations where training and test conditions are mismatched. The bulk of this thesis is concerned with the application of the i-Vector technique to accent identification, which is the most successful approach to acoustic classification to have emerged in recent years. We show that it is possible to achieve high accuracy accent identification without reliance on transcriptions and without utilising phoneme recognition algorithms. The thesis describes various stages in the development of i-Vector based accent classification that improve the standard approaches usually applied for speaker or language identification, which are insufficient. We demonstrate that very good accent identification performance is possible with acoustic methods by considering different i-Vector projections, frontend parameters, i-Vector configuration parameters, and an optimised fusion of the resulting i-Vector classifiers we can obtain from the same data. We claim to have achieved the best accent identification performance on the test corpus for acoustic methods, with up to 90% identification rate.
This performance is even better than previously reported acoustic-phonotactic based systems on the same corpus, and is very close to performance obtained via transcription based accent identification. Finally, we demonstrate that the utilization of our techniques for speech recognition purposes leads to considerably lower word error rates. Keywords: Accent Identification, Gender Identification, Speaker Identification, Gaussian Mixture Model, Support Vector Machine, i-Vector, Factor Analysis, Feature Extraction, British English, Prosody, Speech Recognition

University of East Anglia digital repository

7th Annual Undergraduate Research Conference Abstract Book

Author: Missouri University of Science and Technology
Publication venue: Scholars' Mine
Publication date: 06/04/2011

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Effects of Attention on Multisensory Integration

Author: Barrera Steven
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

The world presents information via a variety of sensory channels. To make sense of this information, we must determine what is relevant and ignore unhelpful noise. We then integrate congruent information within and across modalities to build coherent perceptions. Importantly, immediate goals and prevailing environmental factors may interact to affect our perceptual decisions. This dynamic process of multisensory integration is essential to successful perception in the real world, but can also lead to errors. The current project exploits some of these perceptual errors to explore how endogenous (task-directed) and exogenous (stimulus intensity) factors may influence multisensory integration. In a series of four experiments, we use the sound-induced flash illusion (SFI; Shams et al., 2000; 2002) and related audiovisual effects as indices of multisensory integration. Endogenous attention was manipulated using a focused attention visual task and a novel bimodal conditional attention task. In our first two experiments, we found that participants reported more illusions when attending to both sensory modalities. This effect was larger when the auditory stimuli were presented at near-threshold levels. Perceptual sensitivity (d′) was also found to decrease in the bimodal condition. We then manipulated auditory intensity in each of these tasks independently. Reports of the SFI were found to increase with the higher intensity auditory stimuli. However, differences in reporting these illusions within the same task were attributable to both changes in bias (c) and d′. Event-related potentials recorded in our first experiment revealed that the SFI was associated with smaller P3 potentials than found in valid targets. We also noted differences in the response-locked error positivity (Pe), with illusory stimuli having more positive amplitudes than real targets. However, the earlier occurring error-related negativity (ERN) was indistinguishable in real and illusory targets. 
This suggests that participants were less confident of the illusion during stimulus evaluation and one stage of response monitoring. We evaluate these results in terms of the directed attention and information reliability hypotheses (Andersen et al., 2004, 2005) and discuss how these and similar experiments may deepen our understanding of how multisensory perception is impacted at multiple stages of stimulus and response evaluation

eScholarship - University of California

COGNITIVE RADIO SOLUTION FOR IEEE 802.22

Author: Tachwali Yahia
Publication date: 01/01/2010

Current wireless systems suffer severe radio spectrum underutilization due to a number of problematic issues, including wasteful static spectrum allocations; fixed radio functionalities and architectures; and limited cooperation between network nodes. A significant number of research efforts aim to find alternative solutions to improve spectrum utilization. Cognitive radio based on software radio technology is one such novel approach, and the impending IEEE 802.22 air interface standard is the first based on such an approach. This standard aims to provide wireless services in wireless regional area networks using TV spectrum white spaces. The cognitive radio devices employed feature two fundamental capabilities, namely supporting multiple modulations and data rates based on wireless channel conditions and sensing a wireless spectrum. Spectrum sensing is a critical functionality with high computational complexity. Although the standard does not specify a spectrum sensing method, the sensing operation has inherent timing and accuracy constraints. This work proposes a framework for developing a cognitive radio system based on a small form factor software radio platform with limited memory resources and processing capabilities. The cognitive radio systems feature adaptive behavior based on wireless channel conditions and are compliant with the IEEE 802.22 sensing constraints. The resource limitations on implementation platforms pose a variety of challenges to transceiver configurability and spectrum sensing. Overcoming these fundamental features on small form factors paves the way for portable cognitive radio devices and extends the range of cognitive radio applications. Several techniques are proposed to overcome resource limitation on a small form factor software radio platform based on a hybrid processing architecture comprised of a digital signal processor and a field programmable gate array.
Hardware reuse and task partitioning over a number of processing devices are among the techniques used to realize a configurable radio transceiver that supports several communication modes, including modulations and data rates. In particular, these techniques are applied to build a configurable modulation architecture and a configurable synchronization. A mode-switching architecture based on circular buffers is proposed to facilitate reliable transitioning between different communication modes. The feasibility of efficient spectrum sensing based on a compressive sampling technique called "Fast Fourier Sampling" is examined. The configuration parameters are analyzed mathematically, and performance is evaluated using computer simulations for local spectrum sensing applications. The work proposed herein features a cooperative Fast Fourier sampling scheme to extend the narrowband and wideband sensing performance of this compressive sensing technique. The précis of this dissertation establishes the foundation of efficient cognitive radio implementation on small form factor software radio of hybrid processing architecture

SHAREOK repository

The effects of singing exercises and melodic intonation therapy (MIT) on the male-to-female transgender voice

Author: Hershberger Ioanna Georgiadou
NC DOCKS at The University of North Carolina at Greensboro
Publication date: 01/01/2005

The purpose of this study was to test the efficacy of traditional voice therapy approaches in combination with singing exercises and Melodic Intonation Therapy (MIT) in helping male-to-female transgender individuals gain a more feminine-sounding voice. Participants in this study were recruited from a transgender support group in Greensboro, North Carolina. Six male-to-female individuals ranging in age from 37 to 63 years volunteered to participate in the study. Participants were randomly divided into two groups: three individuals received traditional voice therapy plus feminine language structures/vocabulary and nonverbal communication (Group 1), while the remaining three received traditional voice therapy plus singing exercises and MIT (Group 2). All participants received traditional voice therapy techniques. Quantitative results suggested increased Speaking Fundamental Frequencies (SFFs) for participants in both groups; however, a slightly higher SFF was present in Group 2. Descriptive analysis of the results showed that by the study's end, all participants presented with self-voice ratings (1-7 scale) that were higher than the ratings given by the participants at the beginning of the study. Also, at the end of the study, all four judges (two first-year speech-language pathology graduate students and two random volunteers) rated the participants with voice ratings that were above the ratings at the beginning of the study. (Abstract from author-supplied metadata.)

The University of North Carolina at Greensboro

Report on IOCCG workshop

Author: Bernard S.
Boss E.
Bracher A.
Brewin R.
Bricaud A.
Brotas V.
Chase A.
Choi J.K.
Ciotti A.
Clementson L.
Devred E.
DiGiacomo P.
Dupouy Cécile
Hardman-Mountford N.
Hirata T.
Hirawake T.
Kim W.
Kostadinov T.
Kwiatkowska E.
Lavender S.
Moisan T.
Mouw C.
Son S.
Sosik H.
Uitz J.
Werdell J.
Zheng G.
Publication venue: Goddard Space Flight Center
Publication date: 01/01/2015

Horizon / Pleins textes