295 research outputs found

    Word-level embeddings for cross-task transfer learning in speech processing

    Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. In recent years, unsupervised and self-supervised techniques for learning speech representations have been developed to advance automatic speech recognition. To date, most of these approaches are task-specific and designed for within-task transfer learning between different datasets or setups of a particular task. In turn, learning task-independent representations of speech and cross-task applications of transfer learning remain less common. Here, we introduce an encoder capturing word-level representations of speech for cross-task transfer learning. We demonstrate the application of the pre-trained encoder in four distinct speech and audio processing tasks: (i) speech enhancement, (ii) language identification, (iii) speech, noise, and music classification, and (iv) speaker identification. In each task, we compare the performance of our cross-task transfer learning approach to task-specific baselines. Our results show that the speech representation captured by the encoder through pre-training is transferable across distinct speech processing tasks and datasets. Notably, even simple applications of our pre-trained encoder outperformed task-specific methods, or were comparable, depending on the task.

    Electronic properties of very thin native SiO2/a-Si:H interfaces and their comparison with those prepared by both dielectric barrier discharge oxidation at atmospheric pressure and by chemical oxidation

    The contribution deals with the electronic properties of thin oxide/amorphous hydrogenated silicon (a-Si:H) interfaces measured by capacitance-voltage (C-V) and the charge version of deep level transient spectroscopy (Q-DLTS). The study focused on the interface properties of very thin dielectrics formed by dielectric barrier discharge (DBD) or natively on the a-Si:H layer. These properties were compared with those of oxide layers prepared by chemical oxidation in HNO3. To our knowledge, DBD was used for the first time to prepare a very thin SiO2 layer on a-Si:H. Preliminary electrical measurements confirmed a very low density of interface states for the native oxide/a-Si:H and DBD oxide/a-Si:H interfaces.

    Links between traumatic brain injury and ballistic pressure waves originating in the thoracic cavity and extremities

    Identifying patients at risk of traumatic brain injury (TBI) is important because research suggests prophylactic treatments can reduce the risk of long-term sequelae. Blast pressure waves can cause TBI without penetrating wounds or blunt force trauma. Similarly, bullet impacts distant from the brain can produce pressure waves sufficient to cause mild to moderate TBI. The fluid percussion model of TBI shows that pressure impulses of 15-30 psi cause mild to moderate TBI in laboratory animals. In pigs and dogs, bullet impacts to the thigh produce pressure waves in the brain of 18-45 psi and measurable injury to neurons and neuroglia. Analyses of research in goats and epidemiological data from shooting events involving humans show high correlations (r > 0.9) between rapid incapacitation and pressure wave magnitude in the thoracic cavity. A case study has documented epilepsy resulting from a pressure wave without the bullet directly hitting the brain. Taken together, these results support the hypothesis that bullet impacts distant from the brain produce pressure waves that travel to the brain and can retain sufficient magnitude to induce brain injury. The link to long-term sequelae could be investigated via epidemiological studies of patients with gunshot wounds to the chest to determine whether they experience elevated rates of epilepsy and other neurological sequelae.

    Voice Analysis to Differentiate the Dopaminergic Response in People With Parkinson's Disease

    The human voice offers the widest variety of motor phenomena of any human activity. However, its clinical evaluation in people with movement disorders such as Parkinson's disease (PD) lags behind current knowledge of advanced automatic speech processing methods. Here, we use deep learning-based speech processing to differentially analyze voice recordings in 14 people with PD before and after dopaminergic medication using personalized Convolutional Recurrent Neural Networks (p-CRNN) and Phone Attribute Codebooks (PAC). p-CRNN yields an accuracy of 82.35% in the binary classification of ON and OFF motor states at a sensitivity/specificity of 0.86/0.78. The accuracy of the PAC-based approach was slightly lower, at 73.08% with a sensitivity/specificity of 0.69/0.77, but this method offers easier interpretation and understanding of the computational biomarkers. Both p-CRNN and PAC provide a differentiated view and novel insights into the distinctive components of the speech of persons with PD. Both methods detect voice qualities that are amenable to dopaminergic treatment, including active phonetic and prosodic features. Our findings may pave the way for quantitative measurements of speech in persons with PD.
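
The accuracy, sensitivity, and specificity figures quoted above are standard binary-classification metrics. The following is a minimal sketch of how such figures are derived from confusion-matrix counts. The counts are hypothetical and are not taken from the study, and treating one motor state as the "positive" class is an assumption made only for illustration.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion counts.

    tp/fn: positive-class samples classified correctly/incorrectly.
    tn/fp: negative-class samples classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts for illustration only (not data from the abstract).
metrics = binary_metrics(tp=40, fn=10, tn=45, fp=5)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy shows whether a classifier favors one of the two states, which accuracy alone can hide.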

    Characterisation of voice quality of Parkinson’s disease using differential phonological posterior features

    Change in voice quality (VQ) is one of the first precursors of Parkinson’s disease (PD). Specifically, impacted phonation and articulation cause the patient to have a breathy, husky-semiwhisper and hoarse voice. A goal of this paper is to characterise a VQ spectrum – the composition of non-modal phonations – of voice in PD. The paper relates non-modal healthy phonations (breathy, creaky, tense, falsetto and harsh) to disordered phonation in PD. First, statistics are learned to differentiate the modal and non-modal phonations. Statistics are computed using phonological posteriors, the probabilities of phonological features inferred from the speech signal using a deep learning approach. Second, statistics of disordered speech are learned from PD speech data comprising 50 patients and 50 healthy controls. Third, Euclidean distance is used to calculate the similarity of non-modal and disordered statistics, and the inverse of the distances is used to obtain the composition of non-modal phonation in PD. Thus, pathological voice quality is characterised using a healthy non-modal voice quality “base/eigenspace”. The results indicate that the voice of an average patient with PD can be characterised by a voice quality spectrum composed of 30% breathy voice, 23% creaky voice, 20% tense voice, 15% falsetto voice and 12% harsh voice. In addition, the proposed features were applied to predict the dysarthria level according to the Frenchay assessment score related to the larynx, and a significant improvement is obtained for the reading speech task. The proposed characterisation of VQ might also be applied to other kinds of pathological speech.
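
The third step above (Euclidean distances turned into a composition via their inverses) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation. Normalizing the inverse distances so they sum to 1 is an assumption, and the function name, the 2-D vectors, and their values are hypothetical.

```python
import math


def composition_from_distances(disordered, references):
    """Map distances to reference phonation statistics onto weights summing to 1.

    disordered: statistics vector of the disordered voice.
    references: dict of phonation name -> statistics vector of equal length.
    Closer references receive larger weights (inverse-distance weighting).
    """
    inverse = {}
    for name, stats in references.items():
        distance = math.dist(disordered, stats)  # Euclidean distance
        if distance == 0.0:
            # Exact match: assign all weight to that phonation type.
            return {k: (1.0 if k == name else 0.0) for k in references}
        inverse[name] = 1.0 / distance
    total = sum(inverse.values())
    # Assumed normalization so the weights form a percentage-like spectrum.
    return {name: weight / total for name, weight in inverse.items()}


# Hypothetical 2-D statistics, chosen only to make the arithmetic easy to follow.
references = {"breathy": (3.0, 4.0), "creaky": (6.0, 8.0), "tense": (0.0, 10.0)}
spectrum = composition_from_distances((0.0, 0.0), references)
print(spectrum)
```

In this toy example the distances are 5, 10 and 10, so the nearest phonation type receives half of the total weight and the other two a quarter each.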

    NeuroSpeech

    NeuroSpeech is software for modeling pathological speech signals along different speech dimensions: phonation, articulation, prosody, and intelligibility. Although it was developed to model dysarthric speech signals from Parkinson's patients, its structure allows other computer scientists or developers to include other pathologies and/or measures. Different tasks can be performed: (1) modeling of the signals considering the aforementioned speech dimensions, (2) automatic discrimination of Parkinson's vs. non-Parkinson's, and (3) prediction of the neurological state according to the Unified Parkinson's Disease Rating Scale (UPDRS) score. The prediction of the dysarthria level according to the Frenchay Dysarthria Assessment scale is also provided.

    The next-generation ARC middleware

    The Advanced Resource Connector (ARC) is a lightweight, non-intrusive, simple yet powerful Grid middleware capable of connecting highly heterogeneous computing and storage resources. ARC aims to provide general-purpose, flexible, collaborative computing environments suitable for a range of uses, both in science and business. The server side offers the fundamental job execution management, information and data capabilities required for a Grid. Users are provided with a client that is easy to install and use and that offers a basic toolbox for job and data management. The KnowARC project developed the next-generation ARC middleware, implemented as Web Services with the aim of standard-compliant interoperability.

    Joule-heating Effects in the Amorphous Fe40Ni40B20 Alloy

    The effects of Joule heating on the amorphous Fe40Ni40B20 alloy are investigated by measuring the time behavior of the electrical resistance of ribbon strips during such a treatment. The structural transformations occurring in subsequent stages of the process are studied by means of x-ray-diffraction, differential-scanning-calorimetry, and magnetic-permeability measurements. A continuous evolution from a fully amorphous to a fully crystalline structure may be followed. The crystallization mechanisms observed in Joule-heated samples differ from those occurring under conventional heating conditions. The electrical resistance displays a bump in the course of Joule heating. A quantitative model relating this bump to the extra heat released to the sample by fast crystallization is proposed and discussed.

    Shear Forces during Blast, Not Abrupt Changes in Pressure Alone, Generate Calcium Activity in Human Brain Cells

    Blast-Induced Traumatic Brain Injury (bTBI) describes a spectrum of injuries caused by an explosive force that results in changes in brain function. The mechanism responsible for primary bTBI following a blast shockwave remains unknown. We have developed a pneumatic device that delivers shockwaves, similar to those known to induce bTBI, within a chamber optimal for fluorescence microscopy. Abrupt changes in pressure can be created with and without the presence of shear forces at the surface of cells. In primary cultures of human central nervous system cells, the cellular calcium response to shockwaves alone was negligible. Even when the applied pressure reached 15 atm, there was no damage or excitation unless concomitant shear forces, peaking between 0.3 and 0.7 Pa, were present at the cell surface. The probability of cellular injury in response to a shockwave was low, and cell survival was unaffected 20 hours after shockwave exposure.

    Multi-view representation learning via GCCA for multimodal analysis of Parkinson's disease

    Information from different bio-signals such as speech, handwriting, and gait has been used to monitor the state of Parkinson's disease (PD) patients. However, not all of the multimodal bio-signals may be available at all times. We propose a method based on multi-view representation learning via generalized canonical correlation analysis (GCCA) for learning a representation of features extracted from handwriting and gait that can be used as a complement to speech-based features. Three different problems are addressed: classification of PD patients vs. healthy controls, prediction of the neurological state of PD patients according to the UPDRS score, and prediction of a modified version of the Frenchay dysarthria assessment (m-FDA). According to the results, the proposed approach is suitable for improving the results in the addressed problems, especially in the prediction of the UPDRS and m-FDA scores.