
    Notes on the use of variational autoencoders for speech and audio spectrogram modeling

    Variational autoencoders (VAEs) are powerful (deep) generative artificial neural networks. They have recently been used in several papers for speech and audio processing, in particular for the modeling of speech/audio spectrograms. In these papers, very little theoretical support is given to justify the chosen data representation and decoder likelihood function, or the corresponding cost function used for training the VAE. Yet a well-developed statistical framework exists and has been extensively presented and discussed in papers dealing with nonnegative matrix factorization (NMF) of audio spectrograms and its application to audio source separation. In the present paper, we show how this statistical framework applies to VAE-based speech/audio spectrogram modeling. This provides the latter with insights into the choice and interpretability of data representation and model parameterization.
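    As a worked illustration of the framework the abstract refers to (the specific parameterization here is an assumption, not quoted from the paper): in the NMF literature, modeling each complex STFT coefficient as zero-mean circularly symmetric Gaussian with model-given variance makes the negative log-likelihood an Itakura-Saito (IS) divergence on the power spectrogram, and the same reasoning applies when a VAE decoder outputs that variance:

        % LaTeX sketch: complex Gaussian decoder likelihood <-> IS divergence.
        % Assumed notation: x_{fn} = STFT coefficient at frequency f, frame n;
        % z_n = latent vector; \sigma^2_{fn}(z_n) = decoder output variance.
        x_{fn} \mid z_n \sim \mathcal{N}_c\big(0, \sigma^2_{fn}(z_n)\big)
        \;\Longrightarrow\;
        -\log p(x_n \mid z_n)
          = \sum_f d_{\mathrm{IS}}\big(|x_{fn}|^2,\, \sigma^2_{fn}(z_n)\big) + \mathrm{const},
        \qquad d_{\mathrm{IS}}(a,b) = \tfrac{a}{b} - \log\tfrac{a}{b} - 1.

    Under this reading, training a VAE with an IS reconstruction cost on power spectrograms is maximum-likelihood estimation under the complex Gaussian model, which is the kind of interpretability argument the paper develops.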

    LTBP2 null mutations in an autosomal recessive ocular syndrome with megalocornea, spherophakia, and secondary glaucoma

    The latent TGFÎČ-binding proteins (LTBPs) and fibrillins are a superfamily of large, multidomain proteins with structural and TGFÎČ-signalling roles in the extracellular matrix. Their importance is underscored by fibrillin-1 mutations responsible for Marfan syndrome, but their respective roles are still incompletely understood. We report here on two families in which children of healthy, consanguineous parents presented with megalocornea and impaired vision associated with small, round, dislocated lenses (microspherophakia and ectopia lentis) and myopia, as well as a high-arched palate and, in older children, tall stature with an abnormally large arm span over body height ratio, that is, associated features of Marfan syndrome. Glaucoma was not present at birth but was diagnosed in older children. Whole-genome homozygosity mapping followed by candidate gene analysis identified homozygous truncating mutations of the LTBP2 gene in patients from both families. Fibroblast mRNA analysis was consistent with nonsense-mediated mRNA decay, with no evidence of mutated exon skipping. We conclude that biallelic null LTBP2 mutations cause the ocular phenotype in both families and could lead to Marfan-like features in older children. We suggest that intraocular pressure should be followed up in young children with an ocular phenotype consisting of megalocornea, spherophakia and/or lens dislocation, and recommend LTBP2 gene analysis in these patients.

    16p11.2 600 kb Duplications confer risk for typical and atypical Rolandic epilepsy

    Rolandic epilepsy (RE) is the most common idiopathic focal childhood epilepsy. Its molecular basis is largely unknown and a complex genetic etiology is assumed in the majority of affected individuals. The present study tested whether six large recurrent copy number variants at 1q21, 15q11.2, 15q13.3, 16p11.2, 16p13.11 and 22q11.2, previously associated with neurodevelopmental disorders, also increase risk of RE. Our association analyses revealed a significant excess of the 600 kb genomic duplication at the 16p11.2 locus (chr16: 29.5-30.1 Mb) in 393 unrelated patients with typical (n = 339) and atypical (ARE; n = 54) RE compared with the prevalence in 65 046 European population controls (5/393 cases versus 32/65 046 controls; Fisher's exact test P = 2.83 × 10⁻⁶, odds ratio = 26.2, 95% confidence interval: 7.9-68.2). In contrast, the 16p11.2 duplication was not detected in 1738 European epilepsy patients with either temporal lobe epilepsy (n = 330) or genetic generalized epilepsies (n = 1408), suggesting a selective enrichment of the 16p11.2 duplication in idiopathic focal childhood epilepsies (Fisher's exact test P = 2.1 × 10⁻⁎). In a subsequent screen among children carrying the 16p11.2 600 kb rearrangement, we identified three patients with RE-spectrum epilepsies among 117 duplication carriers (2.6%) but none among 202 carriers of the reciprocal deletion. Our results suggest that the 16p11.2 duplication represents a significant genetic risk factor for typical and atypical RE.
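    The headline association statistics can be checked directly from the counts reported above; a minimal sketch (not the authors' analysis pipeline) using SciPy:

        # 2x2 table from the abstract: duplication carriers vs. non-carriers
        # among 393 RE cases and 65 046 population controls.
        from scipy.stats import fisher_exact

        table = [[5, 393 - 5],         # cases: carriers, non-carriers
                 [32, 65046 - 32]]     # controls: carriers, non-carriers
        odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
        print(f"OR = {odds_ratio:.1f}, P = {p_value:.2e}")  # ~26.2, on the order of 10^-6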

    Cabbage and fermented vegetables: From death rate heterogeneity in countries to candidates for mitigation strategies of severe COVID-19

    Large differences in COVID-19 death rates exist between countries and between regions of the same country. Some very low death rate regions, such as Eastern Asia, Central Europe, and the Balkans, share the common feature of large consumption of fermented foods. Although ecological studies are prone to bias, fermented vegetables or cabbage have been associated with low death rates in European countries. SARS-CoV-2 binds to its receptor, the angiotensin-converting enzyme 2 (ACE2). As a result of SARS-CoV-2 binding, ACE2 downregulation enhances the angiotensin II receptor type 1 (AT(1)R) axis associated with oxidative stress. This leads to insulin resistance as well as lung and endothelial damage, two severe outcomes of COVID-19. The nuclear factor (erythroid-derived 2)-like 2 (Nrf2) is the most potent antioxidant pathway in humans and can in particular block the AT(1)R axis. Cabbage contains precursors of sulforaphane, the most active natural activator of Nrf2. Fermented vegetables contain many lactobacilli, which are also potent Nrf2 activators. Three supporting examples are discussed: kimchi in Korea, westernized foods, and the slum paradox. It is proposed that fermented cabbage is a proof-of-concept of dietary manipulations that may enhance Nrf2-associated antioxidant effects, helpful in mitigating COVID-19 severity.

    Nrf2-interacting nutrients and COVID-19 : time for research to develop adaptation strategies

    There are large between- and within-country variations in COVID-19 death rates. Some very low death rate settings, such as Eastern Asia, Central Europe, the Balkans and Africa, share the common feature of large consumption of fermented foods, whose intake is associated with the activation of the Nrf2 (nuclear factor (erythroid-derived 2)-like 2) antioxidant transcription factor. There are many Nrf2-interacting nutrients (berberine, curcumin, epigallocatechin gallate, genistein, quercetin, resveratrol, sulforaphane) that all act similarly to reduce insulin resistance, endothelial damage, lung injury and cytokine storm. They also act on the same mechanisms (mTOR: mammalian target of rapamycin; PPAR gamma: peroxisome proliferator-activated receptor gamma; NF kappa B: nuclear factor kappa B; ERK: extracellular signal-regulated kinases; eIF2 alpha: eukaryotic initiation factor 2 alpha). They may as a result be important in mitigating the severity of COVID-19, acting through endoplasmic reticulum stress or the ACE-angiotensin-II-AT(1)R axis (AT(1)R) pathway. Many Nrf2-interacting nutrients also interact with TRPA1 and/or TRPV1. Interestingly, geographical areas with very low COVID-19 mortality are those with the lowest prevalence of obesity (Sub-Saharan Africa and Asia). It is tempting to propose that Nrf2-interacting foods and nutrients can re-balance insulin resistance and have a significant effect on COVID-19 severity. It is therefore possible that the intake of these foods may restore an optimal natural balance for the Nrf2 pathway and may be of interest in the mitigation of COVID-19 severity.

    COVID-19 symptoms at hospital admission vary with age and sex: results from the ISARIC prospective multinational observational study

    Background: The ISARIC prospective multinational observational study is the largest cohort of hospitalized patients with COVID-19. We present relationships of age, sex, and nationality to presenting symptoms. Methods: International, prospective observational study of 60 109 hospitalized symptomatic patients with laboratory-confirmed COVID-19 recruited from 43 countries between 30 January and 3 August 2020. Logistic regression was performed to evaluate relationships of age and sex to published COVID-19 case definitions and the most commonly reported symptoms. Results: ‘Typical’ symptoms of fever (69%), cough (68%) and shortness of breath (66%) were the most commonly reported, and 92% of patients experienced at least one of these. Prevalence of typical symptoms was greatest in 30- to 60-year-olds (80%, 79% and 69% respectively; at least one: 95%). They were reported less frequently in children (≀ 18 years: 69%, 48% and 23%; at least one: 85%), older adults (≄ 70 years: 61%, 62% and 65%; at least one: 90%), and women (66%, 66% and 64%; at least one: 90%; vs. men: 71%, 70% and 67%; at least one: 93%; each P < 0.001). The most common atypical presentations under 60 years of age were nausea and vomiting and abdominal pain; over 60 years, it was confusion. Regression models showed significant differences in symptoms with sex, age and country. Interpretation: This international collaboration has allowed us to report reliable symptom data from the largest cohort of patients admitted to hospital with COVID-19. Adults over 60 and children admitted to hospital with COVID-19 are less likely to present with typical symptoms. Nausea and vomiting are common atypical presentations under 30 years. Confusion is a frequent atypical presentation of COVID-19 in adults over 60 years. Women are less likely to experience typical symptoms than men.
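    A minimal sketch of the kind of symptom-by-age/sex logistic regression described in the Methods (the file name and column names are hypothetical, not the study's actual data layout):

        # One row per patient; binary symptom indicator regressed on age group and sex.
        import pandas as pd
        import statsmodels.formula.api as smf

        df = pd.read_csv("isaric_admissions.csv")  # hypothetical per-patient extract
        model = smf.logit("fever ~ C(age_group) + C(sex)", data=df).fit()
        print(model.summary())  # coefficients exponentiate to odds ratios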

    Evaluation of Audio Feature Extraction Techniques to Classify Synthesizer Sounds

    After many years focused on speech signal processing, research in audio processing has turned to music processing. Music Information Retrieval is a relatively new field that has grown steadily in recent years, as music has become an ever larger part of daily life, notably through technologies such as mp3 players and smartphones. Moreover, with the development of electronic music and large gains in computational power, new instruments such as virtual instruments have appeared, bringing new needs concerning the availability of sounds. One key need arising from these technologies is a user-friendly system that gives users easy access to the full range of sounds a device can offer. The purpose of this thesis is to implement an automatic classification of synthesizer sounds based on audio descriptors, without any human influence. The study therefore first examines what constitutes a musical sound and which characteristics of synthesizer sounds should be captured through carefully chosen audio descriptors. Attention then turns to a classifier based on the Self-Organizing Map model, trained with unsupervised learning so as to avoid human bias and rely only on objective parameters for sound classification. Finally, the system is evaluated and shown to give good results in both accuracy and time efficiency.
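    A minimal sketch of this descriptor-extraction plus Self-Organizing Map pipeline (the file list, descriptor set, and map size are assumptions, not the thesis's actual configuration):

        import numpy as np
        import librosa                  # audio descriptor extraction
        from minisom import MiniSom     # third-party SOM: pip install minisom

        def descriptors(path):
            """Summarize one sound as a vector of standard spectral descriptors."""
            y, sr = librosa.load(path, sr=None)
            return np.array([
                librosa.feature.spectral_centroid(y=y, sr=sr).mean(),
                librosa.feature.spectral_bandwidth(y=y, sr=sr).mean(),
                librosa.feature.spectral_flatness(y=y).mean(),
                librosa.feature.zero_crossing_rate(y).mean(),
            ])

        files = ["pad_01.wav", "lead_01.wav", "bass_01.wav"]   # hypothetical dataset
        X = np.vstack([descriptors(f) for f in files])
        X = (X - X.mean(axis=0)) / X.std(axis=0)               # normalize descriptors

        som = MiniSom(8, 8, X.shape[1], sigma=1.0, learning_rate=0.5)
        som.train_random(X, 1000)                              # unsupervised training
        print([som.winner(x) for x in X])                      # map cell per sound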

    Music sound synthesis by machine learning: toward a perceptually relevant control space

    One of the main challenges of the synthesizer market and of research in sound synthesis today lies in proposing new forms of synthesis allowing the creation of brand new sonorities, while offering musicians more intuitive and perceptually meaningful controls to help them reach the perfect sound more easily. Indeed, today's synthesizers are very powerful tools that provide musicians with a considerable number of possibilities for creating sonic textures, but the control of parameters still lacks user-friendliness and may require expert knowledge about the underlying generative processes. In this thesis, we are interested in developing and evaluating new data-driven machine learning methods for music sound synthesis allowing the generation of brand new high-quality sounds while providing high-level, perceptually meaningful control parameters. The first challenge of this thesis was thus to characterize the musical synthetic timbre by evidencing a set of perceptual verbal descriptors that are both frequently and consensually used by musicians. Two perceptual studies were conducted: a free verbalization test enabling us to select eight terms commonly used for describing synthesizer sounds, and a semantic scale analysis enabling us to quantitatively evaluate the use of these terms to characterize a subset of synthetic sounds, as well as to analyze how consensual they were. In a second phase, we investigated the use of machine learning algorithms to extract, from a dataset of sounds, a high-level representation space with interesting interpolation and extrapolation properties, the goal being to relate this space to the perceptual dimensions evidenced earlier. Following previous studies interested in using deep learning for music sound synthesis, we focused on autoencoder models and carried out an extensive comparative study of several kinds of autoencoders on two different datasets. These experiments, together with a qualitative analysis made with a non-real-time prototype developed during the thesis, allowed us to validate the use of such models, and in particular the variational autoencoder (VAE), as relevant tools for extracting a high-level latent space in which we can navigate smoothly and create new sounds. However, no link between this latent space and the perceptual dimensions evidenced by the perceptual tests emerged naturally. As a final step, we thus tried to enforce perceptual supervision of the VAE by adding a regularization term during the training phase. Using the subset of synthetic sounds from the second perceptual test, and the corresponding perceptual grades along the eight perceptual dimensions provided by the semantic scale analysis, it was possible to constrain, to a certain extent, some dimensions of the VAE's high-level latent space so as to match these perceptual dimensions. A final comparative test was then conducted to evaluate the efficiency of this additional regularization in conditioning the model and (partially) enabling perceptual control of music sound synthesis.
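    A minimal sketch of the perceptually regularized VAE objective described above (layer sizes, the regularization weight, and the convention that the first eight latent dimensions carry the perceptual grades are illustrative assumptions, not the thesis's actual architecture):

        # VAE whose first 8 latent dimensions are encouraged to match the 8
        # perceptual grades; sizes and the weight `beta` are assumptions.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class VAE(nn.Module):
            def __init__(self, input_dim=513, latent_dim=16):
                super().__init__()
                self.enc = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU())
                self.mu = nn.Linear(128, latent_dim)
                self.logvar = nn.Linear(128, latent_dim)
                self.dec = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                         nn.Linear(128, input_dim))

            def forward(self, x):
                h = self.enc(x)
                mu, logvar = self.mu(h), self.logvar(h)
                z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
                return self.dec(z), mu, logvar

        def loss_fn(model, x, grades, beta=1.0):
            """ELBO terms plus a perceptual regularizer on the first 8 latent dims."""
            x_hat, mu, logvar = model(x)
            recon = F.mse_loss(x_hat, x, reduction="sum")
            kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
            perceptual = F.mse_loss(mu[:, :8], grades, reduction="sum")  # grades: (batch, 8)
            return recon + kl + beta * perceptual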
    • 

    corecore