
1,378 research outputs found

Saudi Accented Arabic Voice Bank

Author: Alenazi Ammar
Alghamdi Mansour
Alhargan Fayez
Alkanhal Mohammed
Alkhairy Ashraf
Eldesouki Munir
Publication venue: King Saud University. Production and hosting by Elsevier B.V.
Publication date: 31/12/2008

The aim of this paper is to present an Arabic speech database that represents native Arabic speakers from all the cities of Saudi Arabia. The database is called the Saudi Accented Arabic Voice Bank (SAAVB). Preparing the prompt sheets, selecting the right speakers, and transcribing their speech were some of the challenges the project team faced. The procedures used to meet these challenges are highlighted. SAAVB comprises 1033 speakers who speak Modern Standard Arabic with a Saudi accent. The SAAVB content is analyzed and the results are illustrated. The content was verified internally and externally by IBM Cairo and can be used to train speech engines such as automatic speech recognition and speaker verification systems.

Elsevier - Publisher Connector

Proceedings: Voice Technology for Interactive Real-Time Command/Control Systems Application

Author: Breaux Robert
Curran P. Mike
Huff Edward M.

Speech understanding among researchers and managers, current developments in voice technology, and an exchange of information concerning government voice technology efforts are discussed

NASA Technical Reports Server

Data Balancing for Efficient Training of Hybrid ANN/HMM Automatic Speech Recognition Systems

Author: Díaz de María Fernando
García-Moral Ana I.
Peláez Moreno Carmen
Solera Ureña R.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

Hybrid speech recognizers, in which the emission pdfs of the Hidden Markov Model (HMM) states are estimated by Artificial Neural Networks (ANNs) instead of the usual Gaussian Mixture Models (GMMs), have several advantages over the classical systems. However, obtaining these performance improvements greatly increases the computational requirements, because the ANN must be trained. Starting from the observation that speech data are remarkably skewed, this paper proposes sifting the training set and balancing the number of samples per class. With this method the training time has been reduced by a factor of 18 while obtaining performance similar to or even better than that achieved with the whole database, especially in noisy environments. However, the application of these reduced sets is not straightforward. To avoid the mismatch between training and testing conditions created by modifying the distribution of the training data, the a posteriori probabilities obtained must be properly scaled and the context window resized, as demonstrated in the paper. This work was supported in part by the regional grant (Comunidad Autónoma de Madrid-UC3M) CCG06-UC3M/TIC-0812 and in part by a project funded by the Spanish Ministry of Science and Innovation (TEC 2008-06382).
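The balancing and scaling ideas in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the sifting rule or the scaling formula. The sketch assumes a simple random cap per class (`max_per_class`, a hypothetical parameter) and the common hybrid ANN/HMM convention of dividing each posterior by the class prior of the training set the network actually saw. All function names are illustrative.

```python
import random
from collections import Counter, defaultdict


def balance_by_class(samples, labels, max_per_class, seed=0):
    """Keep at most max_per_class randomly chosen samples per label (assumed rule)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    balanced = []
    for label, items in by_class.items():
        if len(items) > max_per_class:
            items = rng.sample(items, max_per_class)
        balanced.extend((sample, label) for sample in items)
    return balanced


def class_priors(labels):
    """Relative frequency of each label in the given label sequence."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def scaled_likelihoods(posteriors, priors):
    """Divide P(class | frame) by P(class) to get a likelihood up to a constant."""
    return {label: p / priors[label] for label, p in posteriors.items()}


# Toy data: a skewed set with 8 "sil" frames and 2 "a" frames.
labels = ["sil"] * 8 + ["a"] * 2
samples = list(range(len(labels)))

balanced = balance_by_class(samples, labels, max_per_class=2)
balanced_priors = class_priors([label for _, label in balanced])

# The priors used for scaling must match the balanced set, not the original one.
scaled = scaled_likelihoods({"sil": 0.9, "a": 0.1}, balanced_priors)
```

The point of the sketch is the mismatch the abstract mentions: once the training distribution is balanced, the priors implied by the network's outputs are those of the balanced set, so any scaling has to use them rather than the priors of the original skewed data.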

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Speech recognition through physical reservoir computing with neuromorphic nanowire networks

Author: Agliuzza M
De Leo
Milano G
Ricciardi C
Publication venue: place:345 E 47TH ST, NEW YORK, NY 10017 USA
Publication date: 01/01/2022

The hardware implementation of the reservoir computing paradigm is key to taking advantage of neuromorphic data processing. In this context, self-organised nanonetworks represent a versatile and scalable computational substrate for multiple tasks, exploiting the emergent collective behaviour that arises from the complexity of the system. This emergent behaviour allows spatio-temporal processing of multiple input signals and relies on the nonlinear interaction between a multitude of nanoscale memristive elements. By means of physics-based grid-graph modeling, we report on the implementation of reservoir computing for a speech recognition task in a memristive nanonetwork based on nanowires (NWs) acting as a physical reservoir. Besides analysing the pre-processing step that transduces the audio samples into electrical stimuli to be applied to the physical reservoir, we analyse the effect of the network size and of the adoption of virtual nodes on computing performance. Results show that memristive nanonetworks allow in materia implementation of reservoir computing for the realisation of brain-inspired neuromorphic systems with reduced training cost.

Archivio istituzionale della ricerca - INRIM

Max-Planck-Institute for Psycholinguistics: Annual Report 2001

Author: Kelly A.
Melinger A.
Publication venue: MPI for Psycholinguistics
Publication date: 01/01/2001

MPG.PuRe

Prosody and speech perception

Author: Kirakowski Jerzy Zdzislaw Jozef
Publication venue: The University of Edinburgh
Publication date: 01/01/1978

The major concern of this thesis is with models of speech perception. Following Gibson's (1966) work on visual perception, it seeks to establish whether there are sources of information in the speech signal which can be responded to directly and which specify the units of information of speech. The treatment of intonation follows that of Halliday (1967) and the treatment of rhythm that of Abercrombie (1967). "Prosody" is taken to mean both the intonational and the rhythmic aspects of speech. Experiments one to four show the interdependence of prosody and grammar in the perception of speech, although they leave open the question of which sort of information is responded to first. Experiments five and six, employing a short-term memory paradigm and Morton's (1970) "suffix effect" explanation, demonstrate that prosody could well be responded to before grammar. Since the previous experiments suggested a close connection between the two, these results suggest that information about grammatical structures may well be given directly by prosody. In the final two experiments, the amount of prosodic information in fluent speech that can be perceived independently of grammar and meaning is investigated. Although tone-group division seems to be given clearly enough by acoustic cues, there are problems of interpretation with the data on syllable stress assignments. In the concluding chapter, a three-stage model of speech perception is proposed, following Bever (1970), but incorporating prosodic analysis as an integral part of the processing. The obtained experimental results are integrated within this model.

Edinburgh Research Archive

Quality and use of phonological representation in poor and normal readers

Author: Irausquin R.S.
Publication venue: [n.n.]
Publication date: 01/01/1997

Tilburg University Repository

Neurocognitive Implications of Tangential Speech in Patients with Focal Brain Damage

Author: Vigliecca Nora Silvana
Publication venue: 'IntechOpen'
Publication date: 20/12/2017

There are no studies on the neurocognitive implications of tangential speech (TS). This research aims to take a step forward in the study of narrative processing, by evaluating TS in a sample that helps to detect this deficit when it is neurogenic and recently manifested. The relationship between TS, secondary to focal brain injury, and neuropsychological and neuroanatomical variables was explored. A comprehensive neuropsychological battery was administered to 175 volunteers: 95 alert inpatients, without aphasia, without psychiatric history and without TS history, and 80 healthy participants, without TS. Results: TS (prevalence 16%) was independent of type or site of injury. An adverse effect of TS on global neuropsychological performance was observed. This effect was significantly related to attentional errors along with prolonged processing times but not to correct responses. Reliability and validity indices for the present TS screening scale were provided. Conclusion: Present results support the hypothesis that this neurogenic inability to spontaneously find, organize and communicate verbal information, beyond single words, depends on extended brain networks involving processes such as sustained attention, complex-syntax comprehension, the (implicit) interpretation and spontaneous recall of a narrative, and emotional and behavioral alterations. Early TS detection is advisable for prevention and treatment at any age

IntechOpen

Crossref

Transfer of self-instructional & metacognitive training of communication skills for people who have learning difficulties

Author: Williams W. Huw
Publication date: 01/01/1991

Bangor University Research Portal

The role of explicit memory in syntactic persistence: effects of lexical cueing and load on sentence memory and sentence production

Author: Bernolet Sarah
Hartsuiker Robert
Zhang Chi
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2020

Speakers' memory of sentence structure can persist and modulate the syntactic choices of subsequent utterances (i.e., structural priming). Much research on structural priming posited a multifactorial account by which an implicit learning process and a process related to explicit memory jointly contribute to the priming effect. Here, we tested two predictions from that account: (1) that lexical repetition facilitates the retrieval of sentence structures from memory; (2) that priming is partly driven by a short-term explicit memory mechanism with limited resources. In two pairs of structural priming and sentence structure memory experiments, we examined the effects of structural priming and its modulation by lexical repetition as a function of cognitive load in native Dutch speakers. Cognitive load was manipulated by interspersing the prime and target trials with easy or difficult mathematical problems. Lexical repetition boosted both structural priming (Experiments 1a-2a) and memory for sentence structure (Experiments 1b-2b) and did so with a comparable magnitude. In Experiment 1, there were no load effects, but in Experiment 2, with a stronger manipulation of load, both the priming and memory effects were reduced with a larger cognitive load. The findings support an explicit memory mechanism in structural priming that is cue-dependent and attention-demanding, consistent with a multifactorial account of structural priming

Ghent University Academic Bibliography

Directory of Open Access Journals

Institutional Repository Universiteit Antwerpen