
    Decoding speech comprehension from continuous EEG recordings

    Human language is a remarkable manifestation of our cognitive abilities and is unique to our species. It is key to communication, but also to our faculty of generating complex thoughts. We organise, conceptualise, and share ideas through language. Neuroscience has shed considerable light on how language is processed by the brain, although the exact neural organisation, structural or functional, underpinning this processing remains poorly understood. This project employs new methodology to understand speech comprehension under naturalistic listening conditions. One achievement of this thesis lies in bringing evidence towards putative predictive processing mechanisms for language comprehension and contrasting them with rule-based grammar processing. Namely, we looked on the one hand at cortical responses to information-theoretic measures that are relevant for predictive coding in the context of language processing, and on the other hand at responses to syntactic tree structures. We successfully recorded responses to linguistic features from continuous EEG recordings during naturalistic speech listening. The use of ecologically valid stimuli allowed us to embed neural responses in the context in which they naturally occur when hearing speech. This fostered the development of new analysis tools adapted to such experimental designs. Finally, we demonstrate the ability to decode comprehension from the EEG signals of participants with above-chance accuracy. This could be used as a better indicator of the severity and specificity of language disorders, and also to assess whether a patient in a vegetative state understands speech without the need for any behavioural response. Hence a primary outcome is our contribution to the neurobiology of language comprehension. Furthermore, our results pave the way for the development of a new range of diagnostic tools to measure the speech comprehension of patients with language impairment.
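The decoding step described above can be illustrated, in outline, as a cross-validated classifier applied to trial-wise EEG features. The sketch below is illustrative only: the simulated features, trial counts, and the nearest-centroid decoder are assumptions for the example, not the thesis's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 100 trials x 16 EEG features (e.g. band power per channel).
# "Comprehended" trials carry a small additive signal on every feature.
n_trials, n_feat = 100, 16
labels = rng.integers(0, 2, n_trials)            # 1 = comprehended, 0 = not
X = rng.normal(size=(n_trials, n_feat))
X[labels == 1] += 0.8                            # simulated comprehension effect

def centroid_classifier(X_train, y_train, X_test):
    """Assign each test trial to the nearest class mean (a minimal decoder)."""
    m0 = X_train[y_train == 0].mean(axis=0)
    m1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_test - m0, axis=1)
    d1 = np.linalg.norm(X_test - m1, axis=1)
    return (d1 < d0).astype(int)

# 5-fold cross-validation: decoding accuracy is compared to the 50% chance level
folds = np.array_split(rng.permutation(n_trials), 5)
accs = []
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(n_trials), test_idx)
    pred = centroid_classifier(X[train_idx], labels[train_idx], X[test_idx])
    accs.append((pred == labels[test_idx]).mean())

print(f"mean decoding accuracy: {np.mean(accs):.2f}")
```

Cross-validation matters here: reporting accuracy on held-out trials is what justifies the "above-chance" claim, since a decoder evaluated on its training trials would be trivially optimistic.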

    Recent Advances in Signal Processing

    Signal processing is a critical component of most new technological inventions and challenges across a variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian, and have favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories address, in order, image processing, speech processing, communication systems, time-series analysis, and educational packages. The book has the advantage of providing a collection of applications that are completely independent and self-contained, so the interested reader can choose any chapter and skip to another without losing continuity.

    Audio source separation for music in low-latency and high-latency scenarios

    This thesis proposes specific methods to address the limitations of current music source separation methods in low-latency and high-latency scenarios. First, we focus on methods with low computational cost and low latency. We propose the use of Tikhonov regularization as a method for spectrum decomposition in the low-latency context. We compare it to existing techniques in pitch estimation and tracking tasks, which are crucial steps in many separation methods. We then use the proposed spectrum decomposition method in low-latency separation tasks targeting singing voice, bass and drums. Second, we propose several high-latency methods that improve the separation of singing voice by modeling components that are often not accounted for, such as breathiness and consonants. Finally, we explore using temporal correlations and human annotations to enhance the separation of drums and complex polyphonic music signals.
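Tikhonov-regularized spectrum decomposition is attractive at low latency because it admits a closed-form per-frame solution, unlike iterative decompositions such as NMF. The following is a minimal sketch under stated assumptions: the random dictionary of basis spectra, the regularization weight, and the synthetic mixture are all illustrative, not the thesis's actual models or parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dictionary: 4 basis spectra (columns) over 64 frequency bins,
# standing in for template spectra (e.g. of pitched notes) learned offline.
n_bins, n_bases = 64, 4
W = np.abs(rng.normal(size=(n_bins, n_bases)))

def tikhonov_decompose(x, W, lam=0.1):
    """Closed-form Tikhonov solution h = (W^T W + lam*I)^{-1} W^T x.

    One small linear solve per spectral frame, which is why this kind of
    decomposition can run within a low-latency budget.
    """
    A = W.T @ W + lam * np.eye(W.shape[1])
    return np.linalg.solve(A, W.T @ x)

# Synthetic spectrum frame: a known mixture of two basis spectra plus noise.
h_true = np.array([1.0, 0.0, 0.5, 0.0])
x = W @ h_true + 0.01 * rng.normal(size=n_bins)

h = tikhonov_decompose(x, W)
print(np.round(h, 2))  # activations, dominated by the two mixed components
```

The regularization term `lam` trades a small bias in the activations for stability when the basis spectra overlap, which is the usual situation with real instrument spectra.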

    Multimodal imaging of language perception

    This Thesis draws together several lines of research by examining language perception in the same individuals using three neuroimaging methods: magnetoencephalography (MEG), functional magnetic resonance imaging (fMRI), and electroencephalography (EEG). The MEG experiments conducted in this Thesis demonstrated that neural processing of written and spoken words converges to the superior temporal cortex following initial modality-specific analysis. In both reading and speech perception, the superior temporal cortex is involved in processing word meaning at ∼250-450 ms in the left hemisphere and after ∼450 ms bilaterally. The data thus support a view of a largely shared semantic system in auditory and visual language perception, in line with the assumption that reading acquisition makes use of the neural systems originally developed for speech perception during evolution and in individual language development. The MEG experiments on reading morphologically complex words showed that the left superior temporal activation was enhanced for the morphologically complex words at ∼200-700 ms. The results suggest that the majority of inflected words in the highly inflected Finnish language are represented in a decomposed form and that the decomposition process requires additional neural resources. Only very high-frequency inflected words may acquire full-form representations. The MEG results on parafoveal preview in reading indicated that neural processing of written words in the left hemisphere is affected by a preview of words in the right visual field. The underlying neural mechanism may facilitate reading of connected text in natural conditions. In a direct comparison, MEG and fMRI showed diverging activation patterns in a reading task although the same individuals were performing the same task. Based on the similarity of the EEG responses recorded simultaneously with both MEG and fMRI, the participants were performing the task similarly during the two recordings. 
The divergent MEG and fMRI results cannot be attributed to differences in the experimental procedures or language, since these factors were controlled. Rather, they are likely to reflect actual dissimilarities in the way neural activity in a high-level cognitive task is picked up by MEG evoked responses and fMRI signals.

    Effects of age and stimulation strategies on cochlear implantation and a clinically feasible method for sound localization latency

    Treating prelingual deafness with cochlear implants paves the way for spoken language development. Previous studies have shown that intervention at six to 11 months of age gives better outcomes than at 12 to 17 months. However, intervention at even earlier ages has not been researched to the same extent, for example by comparing five to eight months with nine to 11 months. We therefore retrospectively assessed the surgical risks and analyzed the longitudinal spoken language tests of 103 children who received their first cochlear implant between five and 30 months of age, with a particular focus on surgery before 12 months of age (Paper I). Apart from language development, we expected that early implants would provide access to the interaural time differences that are crucial for localizing low-frequency sounds. We were interested in examining this in combination with novel sound processing strategies whose stimulation patterns convey the fine structure of sounds. Therefore, in addition to the retrospective analysis, we studied the relationships between stimulation strategies, lateralization of interaural time differences and horizontal sound localization in 30 children (Paper II). We then developed a method to objectively assess sound localization latency, to complement localization accuracy. This method needed to be validated in adults with normal hearing, and under hampered conditions, so that the relationship between accuracy and latency could be clarified. In our study, the gaze patterns from the localization recordings were modelled by optimizing a sigmoid function (Paper III). Furthermore, we addressed the lack of studies on the normal development of the sound localization latency of gaze responses in infancy and early childhood (Paper IV). Our study of spoken language development showed the benefit of cochlear implantation before nine months of age, compared with nine to 11 months of age, without increased surgical risks. This finding was strongest for the age at which the child's language could be understood (Paper I). When our group of 30 subjects underwent tests for interaural time differences, 10 were able to discriminate within the range of naturally occurring differences. Interestingly, the choice of stimulation strategy was a prerequisite for lateralizing natural interaural time differences. However, no relationship was found between this ability to lateralize and the ability to localize low-frequency sounds (Paper II). The localization setup allowed detailed investigations of gaze behavior to be carried out. Eight normal-hearing adults demonstrated a mean sound localization latency of 280 ± 40 milliseconds (ms), with distinct prolongation under unilateral earplugging. It is interesting to observe the similarity in latency, dynamic behavior, and overlap of anatomical structures between the acoustic middle-ear reflex and sound localization latency (Paper III). In addition, normal-hearing infants showed a decreasing sound localization latency, from 1000 ms at six months of age down to 500 ms at three years of age (Paper IV). Latency in children with early cochlear implants still needs to be studied. The findings in this thesis have important clinical implications for counseling parents, and they provide valuable data to guide clinical choices about the age at which cochlear implants are provided and processor programming takes place. The fast, objective and non-invasive method of sound localization latency assessment may further enhance the clinical processes of diagnosing and monitoring interventions in children with hearing impairment.
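The sigmoid modelling of gaze patterns (Paper III) can be illustrated with synthetic data. The sketch below fits a sigmoid to a simulated gaze trace by brute-force least squares and reads the transition midpoint off the fitted curve; the model parameterization, units, signal values, and the grid-search optimizer are assumptions made for illustration, not the thesis's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(t, amp, t0, tau):
    """Gaze-azimuth model: a smooth transition of size `amp` (degrees),
    centred at time t0 (ms), with time constant tau (ms)."""
    return amp / (1.0 + np.exp(-(t - t0) / tau))

# Simulated gaze recording: azimuth turning toward a source at 40 degrees,
# with the transition centred at 300 ms, plus measurement noise.
t = np.arange(0, 1000, 5.0)                      # time axis in ms
gaze = sigmoid(t, amp=40.0, t0=300.0, tau=40.0) + rng.normal(0, 1.0, t.size)

# Brute-force least squares over (t0, tau); the amplitude has a closed form
# for each candidate because the model is linear in amp.
best = None
for t0 in np.arange(100, 600, 5.0):
    for tau in np.arange(10, 100, 5.0):
        basis = 1.0 / (1.0 + np.exp(-(t - t0) / tau))
        amp = (basis @ gaze) / (basis @ basis)   # closed-form amplitude
        err = np.sum((gaze - amp * basis) ** 2)
        if best is None or err < best[0]:
            best = (err, t0, tau, amp)

_, t0_hat, tau_hat, amp_hat = best
print(f"estimated transition midpoint: {t0_hat:.0f} ms")
```

Reducing each gaze trace to a few fitted parameters is what makes such a latency measure fast and objective: no behavioural response from the child is needed beyond looking toward the sound.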