Dynamics of Vocalization-Induced Modulation of Auditory Cortical Activity at Mid-utterance
Background: Recent research has addressed the suppression of cortical sensory responses to altered auditory feedback at the onset of a speech utterance. However, there is reason to assume that the mechanisms underlying sensorimotor processing at mid-utterance differ from those involved in sensorimotor control at utterance onset. The present study examined the dynamics of event-related potentials (ERPs) to different acoustic versions of auditory feedback at mid-utterance.
Methodology/Principal findings: Subjects produced a vowel sound while hearing, via headphones, one of three feedback changes at mid-utterance: their pitch-shifted voice (100 cents), a sum of their vocalization and pure tones, or a sum of their vocalization and white noise. Subjects also passively listened to playback of what they heard during active vocalization. Cortical ERPs were recorded in response to the different acoustic versions of feedback changes during both active vocalization and passive listening. The results showed that, relative to passive listening, active vocalization yielded enhanced P2 responses to the 100-cent pitch shifts, whereas P2 responses were suppressed when voice auditory feedback was distorted by pure tones or white noise.
Conclusion/Significance: The present findings, for the first time, demonstrate a dynamic modulation of cortical activity as a function of the quality of acoustic feedback at mid-utterance, suggesting that auditory cortical responses can be enhanced or suppressed to distinguish self-produced speech from externally produced sounds.
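For readers unfamiliar with the unit, the 100-cent pitch shift mentioned above can be related to frequency with the standard definition of the cent (1200 cents per octave). The following is a minimal sketch of that conversion only; the function names and the 200 Hz example voice are illustrative assumptions, not details from the study.

```python
def cents_to_ratio(cents: float) -> float:
    """Frequency ratio for a pitch interval in cents (1200 cents = one octave)."""
    return 2.0 ** (cents / 1200.0)


def shift_frequency(f0_hz: float, cents: float) -> float:
    """Fundamental frequency after shifting f0_hz by the given number of cents."""
    return f0_hz * cents_to_ratio(cents)


if __name__ == "__main__":
    # Hypothetical 200 Hz voice shifted upward by 100 cents (one semitone).
    print(round(shift_frequency(200.0, 100.0), 2))
```

A 100-cent shift corresponds to a ratio of about 1.0595, i.e. roughly a 6% change in fundamental frequency.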
CompILE: Compositional Imitation Learning and Execution
We introduce Compositional Imitation Learning and Execution (CompILE): a
framework for learning reusable, variable-length segments of
hierarchically-structured behavior from demonstration data. CompILE uses a
novel unsupervised, fully-differentiable sequence segmentation module to learn
latent encodings of sequential data that can be re-composed and executed to
perform new tasks. Once trained, our model generalizes to sequences of longer
length and from environment instances not seen during training. We evaluate
CompILE in a challenging 2D multi-task environment and a continuous control
task, and show that it can find correct task boundaries and event encodings in
an unsupervised manner. Latent codes and associated behavior policies
discovered by CompILE can be used by a hierarchical agent, where the high-level
policy selects actions in the latent code space, and the low-level,
task-specific policies are simply the learned decoders. We found that our
CompILE-based agent could learn given only sparse rewards, where agents without
task-specific policies struggle. Comment: ICML (2019)
Training of Working Memory Impacts Neural Processing of Vocal Pitch Regulation
Working memory training can improve performance on tasks that were not trained. Whether auditory-motor integration for voice control can benefit from working memory training, however, remains unclear. The present event-related potential (ERP) study examined the impact of working memory training on the auditory-motor processing of vocal pitch. Trained participants underwent adaptive working memory training using a backward digit span paradigm, while control participants did not receive any training. Before and after training, both trained and control participants were exposed to frequency-altered auditory feedback while producing vocalizations. After training, trained participants exhibited significantly decreased N1 amplitudes and increased P2 amplitudes in response to pitch errors in voice auditory feedback. In addition, there was a significant positive correlation between the degree of improvement in working memory capacity and the post-pre difference in P2 amplitudes. Training-related changes in vocal compensation, however, were not observed. There was no systematic change in either vocal or cortical responses for control participants. These findings provide evidence that working memory training impacts the cortical processing of feedback errors in vocal pitch regulation. This enhanced cortical processing may be the result of increased neural efficiency in the detection of pitch errors between the intended and actual feedback.
Transfer Effect of Speech-sound Learning on Auditory-motor Processing of Perceived Vocal Pitch Errors
Speech perception and production are intimately linked. There is evidence that speech motor learning results in changes to auditory processing of speech. Whether speech motor control benefits from perceptual learning in speech, however, remains unclear. This event-related potential study investigated whether speech-sound learning can modulate the processing of feedback errors during vocal pitch regulation. Mandarin speakers were trained to perceive five Thai lexical tones while learning to associate pictures with spoken words over 5 days. Before and after training, participants produced sustained vowel sounds while they heard their vocal pitch feedback unexpectedly perturbed. As compared to the pre-training session, the magnitude of vocal compensation significantly decreased for the control group, but remained consistent for the trained group at the post-training session. However, the trained group had smaller and faster N1 responses to pitch perturbations and exhibited enhanced P2 responses that correlated significantly with their learning performance. These findings indicate that the cortical processing of vocal pitch regulation can be shaped by learning new speech-sound associations, suggesting that perceptual learning in speech can produce transfer effects that facilitate the neural mechanisms underlying the online monitoring of auditory feedback during vocal production.
Temporal Lobe Epilepsy Alters Auditory-motor Integration For Voice Control
Temporal lobe epilepsy (TLE) is the most common drug-refractory focal epilepsy in adults. Previous research has shown that patients with TLE exhibit decreased performance in listening to speech sounds and deficits in the cortical processing of auditory information. Whether TLE compromises auditory-motor integration for voice control, however, remains largely unknown. To address this question, event-related potentials (ERPs) and vocal responses to vocal pitch errors (1/2 or 2 semitones upward) heard in auditory feedback were compared across 28 patients with TLE and 28 healthy controls. Patients with TLE produced significantly larger vocal responses but smaller P2 responses than healthy controls. Moreover, patients with TLE exhibited a positive correlation between vocal response magnitude and baseline voice variability and a negative correlation between P2 amplitude and disease duration. Graphical network analyses revealed a disrupted neuronal network for patients with TLE, with a significant increase in clustering coefficients and path lengths as compared to healthy controls. These findings provide strong evidence that TLE is associated with an atypical integration of the auditory and motor systems for vocal pitch regulation, and that the functional networks that support the auditory-motor processing of pitch feedback errors differ between patients with TLE and healthy controls.
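The two network measures named in this abstract, clustering coefficient and path length, have standard graph-theoretic definitions. The sketch below computes them on a small hypothetical undirected graph; it assumes the textbook unweighted definitions, and the abstract does not state which variant or which network construction the authors used.

```python
from collections import deque


def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbours that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of reachable nodes."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy graph: a triangle (0, 1, 2) with a tail node 3 attached to 2.
TOY_GRAPH = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

if __name__ == "__main__":
    print(clustering_coefficient(TOY_GRAPH, 2))
    print(average_path_length(TOY_GRAPH))
```

Higher clustering together with longer paths, as reported for the patient group, indicates a network that is more locally connected but less efficiently integrated globally.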
Learning Edge Representations via Low-Rank Asymmetric Projections
We propose a new method for embedding graphs while preserving directed edge
information. Learning such continuous-space vector representations (or
embeddings) of nodes in a graph is an important first step for using network
information (from social networks, user-item graphs, knowledge bases, etc.) in
many machine learning tasks.
Unlike previous work, we (1) explicitly model an edge as a function of node
embeddings, and we (2) propose a novel objective, the "graph likelihood", which
contrasts information from sampled random walks with non-existent edges.
Individually, both of these contributions improve the learned representations,
especially when there are memory constraints on the total size of the
embeddings. When combined, our contributions enable us to significantly improve
the state-of-the-art by learning more concise representations that better
preserve the graph structure.
We evaluate our method on a variety of link-prediction tasks including social
networks, collaboration networks, and protein interactions, showing that our
proposed method learns representations with error reductions of up to 76% and
55% on directed and undirected graphs, respectively. In addition, we show that
the representations learned by our method are quite space efficient, producing
embeddings which have higher structure-preserving accuracy but are 10 times
smaller.
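To make the idea of modeling an edge as an asymmetric function of node embeddings concrete, here is a minimal sketch of one possible reading: each node embedding is projected by a separate "source" and "target" matrix into a small space, and the edge score is the dot product of the two projections. The names `left`, `right`, and `edge_score`, and all numeric values, are illustrative assumptions; the abstract does not specify the exact edge function or the training objective.

```python
def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix, given as nested lists, by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def edge_score(left, right, y_u, y_v):
    """Score of the directed edge u -> v.

    left and right are (b x d) projections with b << d, so the implied
    d x d bilinear form left^T @ right has rank at most b. Because the two
    projections differ, score(u, v) generally differs from score(v, u).
    """
    src = matvec(left, y_u)   # project the source node embedding
    dst = matvec(right, y_v)  # project the target node embedding
    return sum(s * t for s, t in zip(src, dst))


# Hypothetical values: 3-dimensional embeddings, rank-1 projections.
LEFT = [[1.0, 0.0, 0.0]]
RIGHT = [[0.0, 1.0, 0.0]]
Y_U = [1.0, 2.0, 3.0]
Y_V = [4.0, 5.0, 6.0]

if __name__ == "__main__":
    print(edge_score(LEFT, RIGHT, Y_U, Y_V))  # score for u -> v
    print(edge_score(LEFT, RIGHT, Y_V, Y_U))  # score for v -> u
```

Swapping the arguments changes the score, which is what lets a model of this shape preserve edge direction while storing only the small projection matrices.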
