A Talk on the Wild Side: The Direct and Indirect Impact of Speech Recognition on Learning Gains
Research in the learning sciences and mathematics education has suggested that ‘thinking aloud’ (verbalization) can be important for learning. In a technology-mediated learning environment, speech might also help to promote learning by enabling the system to infer the students’ cognitive and affective states so that they can be provided with a sequence of tasks and formative feedback, both of which are adapted to their needs. For these and associated reasons, we developed the iTalk2Learn platform that includes speech production and speech recognition for children learning about fractions. We investigated the impact of iTalk2Learn’s speech functionality in classrooms in the UK and Germany, with our results indicating that a speech-enabled learning environment has the potential to enhance student learning gains and engagement, both directly and indirectly.
Learning Fault-tolerant Speech Parsing with SCREEN
This paper describes a new approach and a system SCREEN for fault-tolerant
speech parsing. SCREEN stands for Symbolic Connectionist Robust EnterprisE for
Natural language. Speech parsing describes the syntactic and semantic analysis
of spontaneous spoken language. The general approach is based on incremental
immediate flat analysis, learning of syntactic and semantic speech parsing,
parallel integration of current hypotheses, and the consideration of various
forms of speech related errors. The goal for this approach is to explore the
parallel interactions between various knowledge sources for learning
incremental fault-tolerant speech parsing. This approach is examined in a
system SCREEN using various hybrid connectionist techniques. Hybrid
connectionist techniques are examined because of their promising properties of
inherent fault tolerance, learning, gradedness and parallel constraint
integration. The input for SCREEN is hypotheses about recognized words of a
spoken utterance potentially analyzed by a speech system; the output is
hypotheses about the flat syntactic and semantic analysis of the utterance. In
this paper we focus on the general approach, the overall architecture, and
examples for learning flat syntactic speech parsing. Unlike most other
speech language architectures, SCREEN emphasizes an interactive rather than an
autonomous position, learning rather than encoding, flat analysis rather than
in-depth analysis, and fault-tolerant processing of phonetic, syntactic and
semantic knowledge. Comment: 6 pages, postscript, compressed, uuencoded; to appear in Proceedings
of AAAI 9
The neural correlates of speech motor sequence learning
Speech is perhaps the most sophisticated example of a species-wide movement capability in the animal kingdom, requiring split-second sequencing of approximately 100 muscles in the respiratory, laryngeal, and oral movement systems. Despite the unique role speech plays in human interaction and the debilitating impact of its disruption, little is known about the neural mechanisms underlying speech motor learning. Here, we studied the behavioral and neural correlates of learning new speech motor sequences. Participants repeatedly produced novel, meaningless syllables comprising illegal consonant clusters (e.g., GVAZF) over 2 days of practice. Following practice, participants produced the sequences with fewer errors and shorter durations, indicative of motor learning. Using fMRI, we compared brain activity during production of the learned illegal sequences and novel illegal sequences. Greater activity was noted during production of novel sequences in brain regions linked to non-speech motor sequence learning, including the basal ganglia (BG) and pre-supplementary motor area (pre-SMA). Activity during novel sequence production was also greater in brain regions associated with learning and maintaining speech motor programs, including lateral premotor cortex, frontal operculum, and posterior superior temporal cortex. Measures of learning success correlated positively with activity in left frontal operculum and white matter integrity under left posterior superior temporal sulcus. These findings indicate that speech motor sequence learning relies not only on brain areas involved generally in motor sequence learning but also on those associated with feedback-based speech motor learning. Furthermore, learning success is modulated by the integrity of structural connectivity between these motor and sensory brain regions. R01 DC007683 - NIDCD NIH HHS; R01DC007683 - NIDCD NIH HH
A role for the developing lexicon in phonetic category acquisition
Infants segment words from fluent speech during the same period when they are learning phonetic categories, yet accounts of phonetic category acquisition typically ignore information about the words in which sounds appear. We use a Bayesian model to illustrate how feedback from segmented words might constrain phonetic category learning by providing information about which sounds occur together in words. Simulations demonstrate that word-level information can successfully disambiguate overlapping English vowel categories. Learning patterns in the model are shown to parallel human behavior from artificial language learning tasks. These findings point to a central role for the developing lexicon in phonetic category acquisition and provide a framework for incorporating top-down constraints into models of category learning.
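As a rough illustration of how word-level information could disambiguate overlapping categories, the sketch below applies Bayes' rule to a single acoustic value. It is not the model from the paper: the one-dimensional Gaussian categories, the category names, and the numeric values are all assumptions made for illustration, and the word context is reduced to a prior over categories.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def category_posterior(x, categories, prior):
    """Posterior over categories for acoustic value x.

    categories: dict mapping name -> (mean, sd) of a hypothetical Gaussian category.
    prior: dict mapping name -> prior probability, standing in for word-level context.
    """
    scores = {
        name: gaussian_pdf(x, mean, sd) * prior[name]
        for name, (mean, sd) in categories.items()
    }
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# Two hypothetical, heavily overlapping vowel categories (values are made up).
vowels = {"A": (500.0, 80.0), "B": (600.0, 80.0)}

# An ambiguous token halfway between the two category means.
no_context = category_posterior(550.0, vowels, {"A": 0.5, "B": 0.5})
with_word = category_posterior(550.0, vowels, {"A": 0.9, "B": 0.1})
```

With a flat prior the ambiguous token is split evenly between the two categories; once the (assumed) word context favours one category, the posterior follows it, which is the intuition behind using the lexicon as a top-down constraint.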
Simulating dysarthric speech for training data augmentation in clinical speech applications
Training machine learning algorithms for speech applications requires large,
labeled training data sets. This is problematic for clinical applications where
obtaining such data is prohibitively expensive because of privacy concerns or
lack of access. As a result, clinical speech applications are typically
developed using small data sets with only tens of speakers. In this paper, we
propose a method for simulating training data for clinical applications by
transforming healthy speech to dysarthric speech using adversarial training. We
evaluate the efficacy of our approach using both objective and subjective
criteria. We present the transformed samples to five experienced
speech-language pathologists (SLPs) and ask them to identify the samples as
healthy or dysarthric. The results reveal that the SLPs identify the
transformed speech as dysarthric 65% of the time. In a pilot classification
experiment, we show that by using the simulated speech samples to balance an
existing dataset, the classification accuracy improves by about 10% after data
augmentation. Comment: Will appear in Proc. of ICASSP 201
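The balancing step mentioned in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `balance_with_simulated`, the tuple-based data layout, and the label strings are all hypothetical, and the simulated samples are simply taken as given rather than produced by adversarial training.

```python
import random


def balance_with_simulated(real, simulated, minority_label, seed=0):
    """Top up the minority class with simulated samples until classes are even.

    real: list of (features, label) pairs from the existing dataset.
    simulated: list of feature objects assumed to belong to the minority class.
    Returns a new list; the inputs are left unchanged.
    """
    rng = random.Random(seed)
    counts = {}
    for _, label in real:
        counts[label] = counts.get(label, 0) + 1
    majority_size = max(counts.values())
    deficit = majority_size - counts.get(minority_label, 0)
    # Never draw more simulated samples than are available.
    extra = rng.sample(simulated, min(deficit, len(simulated)))
    return list(real) + [(features, minority_label) for features in extra]


# Toy data: four healthy recordings, one dysarthric, five simulated dysarthric.
real_data = [(f"healthy_{i}", "healthy") for i in range(4)] + [("dys_0", "dysarthric")]
simulated_data = [f"sim_{i}" for i in range(5)]
balanced = balance_with_simulated(real_data, simulated_data, "dysarthric")
```

Sampling without replacement keeps each simulated utterance from appearing twice, which matters when the simulated pool is small.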
POS Tagging and its Applications for Mathematics
Content analysis of scientific publications is a nontrivial task, but a
useful and important one for scientific information services. In the Gutenberg
era it was a domain of human experts; in the digital age many machine-based
methods, e.g., graph analysis tools and machine-learning techniques, have been
developed for it. Natural Language Processing (NLP) is a powerful
machine-learning approach to semiautomatic speech and language processing,
which is also applicable to mathematics. The well established methods of NLP
have to be adjusted for the special needs of mathematics, in particular for
handling mathematical formulae. We demonstrate a mathematics-aware
part-of-speech tagger and give a short overview of our adaptation of NLP methods for
mathematical publications. We show the use of the tools developed for key
phrase extraction and classification in the database zbMATH.
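One way to picture what "mathematics-aware" tagging might involve is to keep each inline formula as a single token with its own tag, so the tagger does not split it apart. The sketch below is an assumption-laden toy, not the tagger described in the abstract: the `$...$` delimiter convention, the `FORMULA` and `PUNCT` tags, and the tiny lexicon are all hypothetical.

```python
import re

# A formula between dollar signs is matched first, then words, then single symbols.
TOKEN_RE = re.compile(r"\$[^$]+\$|\w+|[^\w\s]")

# Toy lexicon for illustration only; a real tagger would use a trained model.
TOY_LEXICON = {
    "let": "VB",
    "be": "VB",
    "is": "VBZ",
    "a": "DT",
    "the": "DT",
    "abelian": "JJ",
    "group": "NN",
}


def tag(sentence):
    """Return (token, tag) pairs, keeping each inline formula as one token."""
    tagged = []
    for token in TOKEN_RE.findall(sentence):
        if len(token) > 2 and token.startswith("$") and token.endswith("$"):
            tagged.append((token, "FORMULA"))
        elif not (token[0].isalnum() or token[0] == "_"):
            tagged.append((token, "PUNCT"))
        else:
            # Unknown words default to noun, a common fallback in simple taggers.
            tagged.append((token, TOY_LEXICON.get(token.lower(), "NN")))
    return tagged


example = tag("Let $G$ be a group.")
```

Treating the formula as one opaque token keeps the surrounding sentence structure intact for downstream steps such as key phrase extraction.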