VoxCeleb2: Deep Speaker Recognition
The objective of this paper is speaker recognition under noisy and
unconstrained conditions.
We make two key contributions. First, we introduce a very large-scale
audio-visual speaker recognition dataset collected from open-source media.
Using a fully automated pipeline, we curate VoxCeleb2 which contains over a
million utterances from over 6,000 speakers. This is several times larger than
any publicly available speaker recognition dataset.
Second, we develop and compare Convolutional Neural Network (CNN) models and
training strategies that can effectively recognise identities from voice under
various conditions. The models trained on the VoxCeleb2 dataset surpass the
performance of previous works on a benchmark dataset by a significant margin.
Comment: To appear in Interspeech 2018. The audio-visual dataset can be
downloaded from http://www.robots.ox.ac.uk/~vgg/data/voxceleb2 .
1806.05622v2: minor fixes; 5 pages
Syllable frequency effects in immediate but not delayed syllable naming in English
Syllable frequency effects in production tasks are interpreted as evidence that speakers retrieve precompiled articulatory programs for high-frequency syllables from a mental syllabary. They have not been found reliably in English, nor isolated to the phonetic encoding processes during which the syllabary is thought to be accessed. In this experiment, 48 participants produced matched high- and novel/low-frequency syllables in a near-replication of Laganaro and Alario’s [(2006) On the locus of the syllable frequency effect in speech production. Journal of Memory and Language, 55(2), 178–196, http://dx.doi.org/10.1016/j.jml.2006.05.001] production conditions: immediate naming, naming following an unfilled delay, and naming after a delay filled by concurrent articulation. Immediate naming was faster for high-frequency syllables, demonstrating a robust syllable frequency effect in English. There was no high-frequency advantage in either delayed naming condition, leaving open the question of whether syllable frequency effects arise during phonological or phonetic encoding.
The Automatic Neuroscientist: automated experimental design with real-time fMRI
A standard approach in functional neuroimaging explores how a particular
cognitive task activates a set of brain regions (one task-to-many regions
mapping). Importantly though, the same neural system can be activated by
inherently different tasks. To date, there is no approach available that
systematically explores whether and how distinct tasks probe the same neural
system (many tasks-to-region mapping). In the work presented here, we propose
an alternative framework, the Automatic Neuroscientist, which turns the typical
fMRI approach on its head. We use real-time fMRI in combination with
state-of-the-art optimisation techniques to automatically design the optimal
experiment to evoke a desired target brain state. Here, we present two
proof-of-principle studies involving visual and auditory stimuli. The data
demonstrate this closed-loop approach to be very powerful, hugely speeding up
fMRI and providing an accurate estimation of the underlying relationship
between stimuli and neural responses across an extensive experimental parameter
space. Finally, we detail four scenarios where our approach can be applied,
suggesting how it provides a novel description of how cognition and the brain
interrelate.
Comment: 22 pages, 7 figures, work presented at OHBM 201
Effects of Gender and Regional Dialect on Uptalk in the American Midwest
This study compares the distribution of uptalk contours across male and female speakers of two Midwestern dialects of American English. Sixteen speakers, evenly divided by dialect and gender, were recorded reading ten passages in plain lab speech. The contours defined as uptalk in this study were H* H-H%, H* L-H%, L* H-H%, and L* L-H%. The results indicate that neither gender nor dialect had an effect on overall uptalk frequency, which could reflect prosodic similarities between the two dialects. The null results for gender are particularly surprising because they run contrary to many previous studies on uptalk, which found that women use uptalk more than men. Gender and dialect also had no significant effect on the types of uptalk contours used: speakers of both dialects used primarily three of the four uptalk contours that were examined (H* L-H%, L* H-H%, and L* L-H%). These uptalk contours differed from those identified in other North American varieties of English, suggesting that there are regional differences in uptalk realization.
Undergraduate Research Scholarship
National Science Foundation (BCS-1056409)
No embargo
Academic Major: Linguistics
Academic Major: Spanish