723,393 research outputs found

Individual differences in the discrimination of novel speech sounds: effects of sex, temporal processing, musical and cognitive abilities

Author: Brooks Patricia J.
Kempe Vera
Kirk Neil W.
Schaeffler Felix
Thoresen John C.
Publication date: 01/01/2012

This study examined whether rapid temporal auditory processing, verbal working memory capacity, non-verbal intelligence, executive functioning, musical ability and prior foreign language experience predicted how well native English speakers (N = 120) discriminated Norwegian tonal and vowel contrasts, as well as a non-speech analogue of the tonal contrast and a native vowel contrast presented over noise. Results confirmed a male advantage for temporal and tonal processing, and also revealed that temporal processing was associated with both non-verbal intelligence and speech processing. In contrast, effects of musical ability on non-native speech-sound processing and of inhibitory control on vowel discrimination were not mediated by temporal processing. These results suggest that individual differences in non-native speech-sound processing are to some extent determined by temporal auditory processing ability, in which males perform better, but are also determined by a host of other abilities that are deployed flexibly depending on the characteristics of the target sounds.

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

City University of New York

Abertay Research Portal

Crossref

Directory of Open Access Journals

PubMed Central

Queen Margaret University eResearch

The Francis Crick Institute

Does anticipation help or hinder performance in a subsequent speech?

Author: Brown Mike
Stopa Lusia
Publication venue: Cambridge University Press (CUP)

This study examined the effects of anticipatory processing on a subsequent speech in high and low socially anxious participants. Forty participants (n = 20 in each group) gave two speeches, one after no anticipatory processing and one after 10 minutes of anticipatory processing. In anticipatory processing, high socially anxious participants were more anxious, and experienced more negative and unhelpful self-images, than low socially anxious participants did. However, both groups rated memories of past speeches as having a somewhat helpful effect on their speech preparation. High socially anxious participants tended to use the observer perspective more in the anticipated speech, while in the unanticipated speech they might have been switching between observer and field perspectives. Low socially anxious participants tended to use the field perspective in both speeches. High and low socially anxious participants reported better speech performances after the anticipated speech than after the unanticipated speech. Results suggest that anticipatory processing may have both positive and negative effects on socially anxious individuals' cognitive processing and performance before and during a speech.

Southampton (e-Prints Soton)

SKOPE: A connectionist/symbolic architecture of spoken Korean processing

Author: Lee Geunbae
Lee Jong-Hyeok
Publication date: 24/04/1995

Spoken language processing requires speech and natural language integration. Moreover, spoken Korean calls for a unique processing methodology due to its linguistic characteristics. This paper presents SKOPE, a connectionist/symbolic spoken Korean processing engine, which emphasizes that: 1) connectionist and symbolic techniques must be selectively applied according to their relative strengths and weaknesses, and 2) the linguistic characteristics of Korean must be fully considered for phoneme recognition, speech and language integration, and morphological/syntactic processing. The design and implementation of SKOPE demonstrate how connectionist/symbolic hybrid architectures can be constructed for spoken agglutinative language processing. SKOPE also presents many novel ideas for speech and language processing. The phoneme recognition, morphological analysis, and syntactic analysis experiments show that SKOPE is a viable approach for spoken Korean processing.

Comment: 8 pages, LaTeX; to be presented at the IJCAI95 workshops on new approaches to learning for natural language processing.

arXiv.org e-Print Archive

Crossref

Pohang University of Science and Technology (포항공과대학교)

Speech vocoding for laboratory phonology

Author: Benus Stefan
Cernak Milos
Lazaridis Alexandros
Publication venue: Elsevier BV
Publication date: 19/05/2015

Using phonological speech vocoding, we propose a platform for exploring relations between phonology and speech processing, and in broader terms, for exploring relations between the abstract and physical structures of a speech signal. Our goal is to take a step towards bridging phonology and speech processing and to contribute to the program of Laboratory Phonology. We show three application examples for laboratory phonology: compositional phonological speech modelling, a comparison of phonological systems and an experimental phonological parametric text-to-speech (TTS) system. The featural representations of the following three phonological systems are considered in this work: (i) Government Phonology (GP), (ii) the Sound Pattern of English (SPE), and (iii) the extended SPE (eSPE). Comparing GP- and eSPE-based vocoded speech, we conclude that the latter achieves slightly better results than the former. However, GP, the most compact phonological speech representation, performs comparably to the systems with a higher number of phonological features. The parametric TTS based on phonological speech representation, and trained from an unlabelled audiobook in an unsupervised manner, achieves 85% of the intelligibility of state-of-the-art parametric speech synthesis. We envision that the presented approach paves the way for researchers in both fields to form meaningful hypotheses that are explicitly testable using the concepts developed and exemplified in this paper. On the one hand, laboratory phonologists might test the applied concepts of their theoretical models, and on the other hand, the speech processing community may utilize the concepts developed for the theoretical phonological models to improve current state-of-the-art applications.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Speech systems research at Texas Instruments

Author: Doddington George R.

An assessment of automatic speech processing technology is presented. Fundamental problems in the development and deployment of automatic speech processing systems are defined, and a technology forecast for speech systems is presented.

NASA Technical Reports Server

Speech Processing in Computer Vision Applications

Author: Waterworth Nicholas
Publication venue: ScholarWorks@UARK
Publication date: 01/05/2020

Deep learning has recently proven to be a viable asset in determining features in the field of speech analysis. Deep learning methods like Convolutional Neural Networks facilitate the expansion of specific feature information in waveforms, allowing networks to create more feature-dense representations of data. Our work attempts to address the problems of re-creating a face given a speaker's voice and of speaker identification using deep learning methods. In this work, we first review the fundamental background in speech processing and its related applications. Then we introduce novel deep learning-based methods for speech feature analysis. Finally, we present our deep learning approaches to speaker identification and speech-to-face synthesis. The presented method can convert a speaker audio sample to an image of their predicted face. This framework is composed of several chained networks, each performing an essential step in the conversion process. These include audio embedding, encoding, and face generation networks, respectively. Our experiments show that certain voice features can map to the face, that DNNs can generate a speaker's face from their voice, and that a GUI could be used in conjunction to display a speaker recognition network's data.

ScholarWorks@UARK

UARK (University of Arkansas)