    Understanding Spoken Language Development of Children with ASD Using Pre-trained Speech Embeddings

    Speech processing techniques are useful for analyzing speech and language development in children with Autism Spectrum Disorder (ASD), whose acquisition of these skills is often varied and delayed. Early identification and intervention are crucial, but traditional assessment methodologies such as caregiver reports are not adequate for the requisite behavioral phenotyping. Natural Language Sample (NLS) analysis has gained attention as a promising complement. Researchers have developed benchmarks for spoken language capabilities in children with ASD, obtainable through the analysis of NLS. This paper proposes applications of speech processing technologies in support of automated assessment of children's spoken language development by classification between child and adult speech and between speech and nonverbal vocalization in NLS, with respective F1 macro scores of 82.6% and 67.8%, underscoring the potential for accurate and scalable tools for ASD research and clinical use.
    Comment: Accepted to Interspeech 2023, 5 page
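    For readers unfamiliar with the metric quoted above, a macro F1 score is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a frequent one. The sketch below is illustrative only: the function names and the toy labels are assumptions, not taken from the paper, and it does not reproduce the reported 82.6% or 67.8%.

```python
from typing import Hashable, Sequence


def f1_for_label(y_true: Sequence[Hashable], y_pred: Sequence[Hashable], label: Hashable) -> float:
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # By convention here, a class with no true or predicted instances scores 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over every label seen in either sequence."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred), key=str)
    if not labels:
        raise ValueError("at least one labelled example is required")
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical toy labels for a child-versus-adult speech classifier.
reference = ["child", "child", "child", "adult"]
predicted = ["child", "child", "adult", "adult"]
score = macro_f1(reference, predicted)
```

    In this toy case the "child" class scores 0.8 and the "adult" class scores 2/3, so the macro average is their plain mean regardless of how many utterances each class has.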

    Effective Spoken Language Labeling with Deep Recurrent Neural Networks

    Understanding spoken language is a highly complex problem, which can be decomposed into several simpler tasks. In this paper, we focus on Spoken Language Understanding (SLU), the module of spoken dialog systems responsible for extracting a semantic interpretation from the user utterance. The task is treated as a labeling problem. In the past, SLU has been performed with a wide variety of probabilistic models. The rise of neural networks in the last couple of years has opened interesting new research directions in this domain. Recurrent Neural Networks (RNNs) in particular are able not only to represent several pieces of information as embeddings but also, thanks to their recurrent architecture, to encode relatively long contexts as embeddings. Such long contexts are in general out of reach for models previously used for SLU. In this paper we propose novel RNN architectures for SLU which outperform previous ones. Starting from a published idea as a base block, we design new deep RNNs achieving state-of-the-art results on two widely used corpora for SLU: ATIS (Air Travel Information System), in English, and MEDIA (hotel information and reservation in France), in French.
    Comment: 8 pages. Rejected from IJCAI 2017, good remarks overall, but slightly off-topic as from global meta-reviews. Recommendations: 8, 6, 6, 4. arXiv admin note: text overlap with arXiv:1706.0174

    Deafness: Disability or Culture? Best Practices Regarding Controversial Interventions for Deaf and Hard of Hearing Students

    Background: Many people in the deaf community view deafness as a distinct culture, with its own unique language and history. They reject the use of assistive technologies which can restore hearing for themselves and their children. However, some members of the medical and legal communities consider it unethical to deprive a child of these interventions. Learn more about this emerging conflict, as well as best practices for working with deaf and hard of hearing students in a school environment. Methods: Peer-reviewed journals and popular publications were consulted to gather information about attitudes towards interventions such as the cochlear implant from members of the deaf community, as well as the legal and medical communities. Education journals were consulted to gather information about best practices when working with deaf and hard of hearing students. Results: There are strong opinions on both sides of this issue, with various arguments being made both for and against the use of interventions like the cochlear implant. From the perspective of K-12 educators and school counselors, the priority is making sure that students feel safe and supported at school. Conclusions: It is not necessary for K-12 educators and school counselors to have opinions on specific assistive technologies. It is important for them to be aware of best practices for working with deaf and hard of hearing students, and to support and respect the decisions of deaf families with regard to their culture.

    A hierarchy of linguistic predictions during natural language comprehension

    Understanding spoken language requires transforming ambiguous acoustic streams into a hierarchy of representations, from phonemes to meaning. It has been suggested that the brain uses prediction to guide the interpretation of incoming input. However, the role of prediction in language processing remains disputed, with disagreement about both the ubiquity and representational nature of predictions. Here, we address both issues by analyzing brain recordings of participants listening to audiobooks, and using a deep neural network (GPT-2) to precisely quantify contextual predictions. First, we establish that brain responses to words are modulated by ubiquitous predictions. Next, we disentangle model-based predictions into distinct dimensions, revealing dissociable neural signatures of predictions about syntactic category (parts of speech), phonemes, and semantics. Finally, we show that high-level (word) predictions inform low-level (phoneme) predictions, supporting hierarchical predictive processing. Together, these results underscore the ubiquity of prediction in language processing, showing that the brain spontaneously predicts upcoming language at multiple levels of abstraction.

    Symbolic inductive bias for visually grounded learning of spoken language

    A widespread approach to processing spoken language is to first automatically transcribe it into text. An alternative is to use an end-to-end approach: recent works have proposed to learn semantic embeddings of spoken language from images with spoken captions, without an intermediate transcription step. We propose to use multitask learning to exploit existing transcribed speech within the end-to-end setting. We describe a three-task architecture which combines the objectives of matching spoken captions with corresponding images, speech with text, and text with images. We show that the addition of the speech/text task leads to substantial performance improvements on image retrieval when compared to training the speech/image task in isolation. We conjecture that this is due to a strong inductive bias transcribed speech provides to the model, and offer supporting evidence for this.
    Comment: ACL 201

    Anxiety Factors of Students’ Emotional Disposition to Professional Communication in Foreign Languages

    Abstract. The article deals with the development of a person's psychological disposition to professional communication in foreign languages. The research aims to define the concept of a person's emotional and volitional disposition, reveal its essence, and study how this disposition forms in students of different specialties. The study focuses on the causes and dynamics of students' emotional and volitional disposition to professional communication in foreign languages. Emotional disposition is viewed as a person's ability to adjust his or her behavior and activity in any professional situation by means of foreign-language communication. The results revealed certain difficulties, and consequently negative experiences, in students' foreign-language communication, which are determined by a high degree of speech fluency, difficulty in understanding spoken language, and difficulty in grasping the meaning of an utterance. The article sets out the reasons for developing a special program to form students' emotional and volitional disposition to professional communication in foreign languages. The program is to contain a set of lessons, assignments, and training sessions aimed at developing the appropriate volitional qualities and mental self-regulation skills.

    GLR-Parsing of Word Lattices Using a Beam Search Method

    This paper presents an approach that allows the efficient integration of speech recognition and language understanding using Tomita's generalized LR-parsing algorithm. For this purpose the GLRP-algorithm is revised so that an agenda mechanism can be used to control the flow of computation of the parsing process. This new approach is used to integrate speech recognition and speech understanding incrementally with a beam search method. These considerations have been implemented and tested on ten word lattices.
    Comment: 4 pages, 61K postscript, compressed, uuencoded, Eurospeech 9/95, Madri

    Heritability of non-speech auditory processing skills

    Recent insight into the genetic bases for autism spectrum disorder, dyslexia, stuttering, and language disorders suggests that neurogenetic approaches may also reveal at least one etiology of auditory processing disorder (APD). A person with an APD typically has difficulty understanding speech in background noise despite having normal pure-tone hearing sensitivity. The estimated prevalence of APD may be as high as 10% in the pediatric population, yet the causes are unknown and have not been explored by molecular or genetic approaches. The aim of our study was to determine the heritability of frequency and temporal resolution for auditory signals and speech recognition in noise in 96 identical or fraternal twin pairs, aged 6–11 years. Measures of auditory processing (AP) of non-speech sounds included backward masking (temporal resolution), notched noise masking (spectral resolution), pure-tone frequency discrimination (temporal fine structure sensitivity), and nonsense syllable recognition in noise. We provide evidence of significant heritability, ranging from 0.32 to 0.74, for individual measures of these non-speech-based AP skills that are crucial for understanding spoken language. Identification of specific heritable AP traits such as these serves as a basis to pursue the genetic underpinnings of APD by identifying genetic variants associated with common AP disorders in children and adults.