20,795 research outputs found

Prosodic-Enhanced Siamese Convolutional Neural Networks for Cross-Device Text-Independent Speaker Verification

Author: Dabouei Ali; Dawson Jeremy; Iranmanesh Seyed Mehdi; Kazemi Hadi; Nasrabadi Nasser M.; Soleymani Sobhan
Publication date: 31/07/2018

In this paper, a novel cross-device text-independent speaker verification architecture is proposed. The majority of state-of-the-art deep architectures used for speaker verification tasks rely on Mel-frequency cepstral coefficients. In contrast, our proposed Siamese convolutional neural network architecture uses Mel-frequency spectrogram coefficients to benefit from the dependency between adjacent spectro-temporal features. Although spectro-temporal features have proved to be highly reliable in speaker verification models, they represent only some aspects of the short-term, acoustic-level traits of the speaker's voice. The human voice, however, spans several linguistic levels, such as acoustic, lexical, prosodic, and phonetic, that can be utilized in speaker verification models. To compensate for these inherent shortcomings of spectro-temporal features, we propose to enhance the Siamese convolutional neural network architecture by deploying a multilayer perceptron network that incorporates prosodic, jitter, and shimmer features. The proposed end-to-end verification architecture performs feature extraction and verification simultaneously. It displays significant improvement over classical signal processing approaches and deep algorithms for forensic cross-device speaker verification.

Comment: Accepted in 9th IEEE International Conference on Biometrics: Theory, Applications, and Systems (BTAS 2018)
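The abstract names jitter and shimmer as inputs to the multilayer perceptron but does not say how they are computed. As a rough illustration only, the sketch below uses one common definition ("local" jitter and shimmer: the mean absolute difference between consecutive cycles, divided by the mean). The function name `local_perturbation` and the sample values are hypothetical and are not taken from the paper.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, divided by the mean value.

    Applied to glottal cycle lengths this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer. This is an assumed
    definition, not necessarily the one used by the authors.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Hypothetical measurements for four consecutive voiced cycles.
periods = [0.0100, 0.0102, 0.0099, 0.0101]  # cycle lengths in seconds
amplitudes = [0.80, 0.78, 0.82, 0.79]       # peak amplitude per cycle

jitter = local_perturbation(periods)
shimmer = local_perturbation(amplitudes)
```

Both quantities are dimensionless ratios, so in a setup like the one described they could be concatenated with other prosodic descriptors into a single feature vector.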

arXiv.org e-Print Archive

Crossref

Goldilocks Forgetting in Cross-Situational Learning

Author: Akhtar; Allport; Anderson; Bauer; Baxter; Behrend; Blythe; Brainerd; Brooks; Bruner; Bybee; Crain; Croft; Cunillera; Dabrowska; Delaney; Ebbinghaus; Ellis; Elman; Fazly; Frank; Givón; Gleitman; Goldberg; Golinkoff; Gopnik; Harris; Hebb; Huttenlocher; Ibbotson; Jurafsky; Kachergis; Kamhi; Langacker; MacDonald; Markman; McMurray; Medina; Munakata; Murdock; Nazzi; Nelson; Nowak; Pavlik; Pinker; Quine; Reznick; Rovee-Collier; Roy; Scott; Seidenberg; Shiffrin; Siskind; Smith; Street; Suanda; Taylor; Tilles; Tomasello; Toppino; Trueswell; Vlach; Vouloumanos; Wittgenstein; Xu; Yu; Yurovsky
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

Given that there is referential uncertainty (noise) when learning words, to what extent can forgetting filter out some of that noise and aid learning? Using a cross-situational learning model, we find a U-shaped function of errors indicative of a "Goldilocks" zone of forgetting: an optimum store-loss ratio that is neither too aggressive nor too weak, but just the right amount to produce better learning outcomes. Forgetting acts as a high-pass filter that actively deletes (part of) the referential ambiguity noise, retains intended referents, and effectively amplifies the signal. The model achieves this performance without incorporating any specific cognitive biases of the type proposed in the constraints and principles account, and without any prescribed developmental changes in the underlying learning mechanism. Instead, we interpret the model's performance as a by-product of exposure to input, where the associative strengths in the lexicon grow as a function of linguistic experience in combination with memory limitations. The result adds a mechanistic explanation for the experimental evidence on spaced learning and, more generally, advocates integrating domain-general aspects of cognition, such as memory, into the language acquisition process.
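The store-and-forget trade-off described in this abstract can be illustrated with a toy associative learner. The sketch below is not the authors' model: the function `simulate`, the multiplicative decay rule, the retention `threshold`, and all parameter values are assumptions chosen only to show how forgetting can delete spurious word-referent pairings and, when too aggressive, intended ones as well.

```python
import random


def simulate(decay, n_words=20, n_distractors=3, n_trials=400,
             threshold=0.5, seed=0):
    """Toy cross-situational learner; returns (spurious_kept, intended_lost).

    Each trial presents one word together with its intended referent (same
    index as the word) and a few randomly chosen distractor referents.
    """
    rng = random.Random(seed)
    strength = {}  # (word, referent) -> associative strength
    for _ in range(n_trials):
        word = rng.randrange(n_words)
        others = [r for r in range(n_words) if r != word]
        distractors = rng.sample(others, n_distractors)
        # Forgetting: every stored association loses a fixed fraction per trial.
        for pair in strength:
            strength[pair] *= 1.0 - decay
        # Storing: all referents in the scene are strengthened for this word.
        for referent in [word] + distractors:
            strength[(word, referent)] = strength.get((word, referent), 0.0) + 1.0
    # Associations below the threshold count as forgotten.
    kept = {pair for pair, s in strength.items() if s >= threshold}
    spurious_kept = sum(1 for w, r in kept if w != r)
    intended_lost = sum(1 for w in range(n_words) if (w, w) not in kept)
    return spurious_kept, intended_lost


no_forgetting = simulate(decay=0.0)
heavy_forgetting = simulate(decay=0.9)
```

With `decay=0.0` no intended referent is lost, but every spurious pairing ever observed is kept as well. With `decay=0.9` only the associations from the most recent scene stay above the threshold, so almost every intended referent is lost. Whether intermediate decay values give a U-shaped total error in this toy setup depends on the assumed parameters.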

Crossref

Directory of Open Access Journals

Open Research Online (The Open University)

The University of Manchester - Institutional Repository

Spatial evolution of human dialects

Author: Burridge James
Publication venue: 'American Physical Society (APS)'
Publication date: 01/07/2017

The geographical pattern of human dialects is a result of history. Here, we formulate a simple spatial model of language change which shows that the final result of this historical evolution may, to some extent, be predictable. The model shows that the boundaries of dialect regions are controlled by a length-minimizing effect analogous to surface tension, mediated by variations in population density, which can induce curvature, and by the shape of coastlines or similar borders. The predictability of dialect regions arises because these effects will drive many complex, randomized early states toward one of a smaller number of stable final configurations. The model is able to reproduce observations and predictions of dialectologists. These include dialect continua, isogloss bundling, fanning, the wave-like spread of dialect features from cities, and the impact of human movement on the number of dialects that an area can support. The model also provides an analytical form for Séguy's Curve, giving the relationship between geographical and linguistic distance, and a generalisation of the curve to account for the presence of a population centre. A simple modification allows us to analytically characterize the variation of language use by age in an area undergoing linguistic change.

arXiv.org e-Print Archive

Directory of Open Access Journals

Portsmouth University Research Portal (Pure)