293 research outputs found

    UWSpeech: Speech to Speech Translation for Unwritten Languages

    Existing speech to speech translation systems heavily rely on the text of target language: they usually translate source language either to target text and then synthesize target speech from text, or directly to target speech with target text for auxiliary training. However, those methods cannot be applied to unwritten target languages, which have no written text or phoneme available. In this paper, we develop a translation system for unwritten languages, named UWSpeech, which converts target unwritten speech into discrete tokens with a converter, and then translates source-language speech into target discrete tokens with a translator, and finally synthesizes target speech from target discrete tokens with an inverter. We propose a method called XL-VAE, which enhances vector quantized variational autoencoder (VQ-VAE) with cross-lingual (XL) speech recognition, to train the converter and inverter of UWSpeech jointly. Experiments on the Fisher Spanish-English conversation translation dataset show that UWSpeech outperforms direct translation and the VQ-VAE baseline by about 16 and 10 BLEU points respectively, which demonstrates the advantages and potential of UWSpeech.

    Multi-objective analysis of the co-mitigation of CO2 and PM2.5 pollution by China's iron and steel industry

    China has experienced serious fine particulate matter (PM2.5) pollution in recent years, and carbon dioxide (CO2) emissions must be controlled so that China can keep its pledge to reduce CO2 emissions by 2030. The iron and steel industry is energy intensive and contributes significantly to PM2.5 pollution in China. The simultaneous reduction of CO2 emissions and PM2.5 pollution while minimizing the total mitigation costs remains a crucial issue that must be resolved. Using a multi-objective analysis, we compared potential technology combinations based on various policy preferences and targets. Our results showed that policies designed to mitigate PM2.5 pollution have substantial co-benefits for CO2 emissions reductions. However, policies focused solely on reducing CO2 emissions fail to effectively reduce PM2.5. Furthermore, CO2 emissions reductions correspond to large financial costs, whereas PM2.5 pollution reductions are less expensive. Our results suggest that under limited budgets, decision makers should prioritize PM2.5 reductions because CO2 reductions may be simultaneously achieved. Achieving large decreases in CO2 emissions will require further technological innovations to reduce the cost threshold. Thus, China should focus on reducing PM pollution in the short term and prepare for the expected challenges associated with CO2 reductions in the future.

    Lipidomics analysis facilitate insight into the molecular mechanisms of urate nephropathy in a gout model induced by combination of MSU crystals injection and high-fat diet feeding

    Renal injury is one of the most common clinical manifestations of patients with hyperuricaemia/gout. The precise pathophysiological mechanism(s) for the renal injury is still unknown. Furthermore, it is also unclear whether the clinical therapies (e.g., colchicine and febuxostat) could prevent its progression or not. Lipids are involved in almost all important biological processes and play critical roles in maintaining the renal functions. Herein, shotgun lipidomics was performed for class-targeted lipid analysis of cellular lipidomes in renal tissue of a gouty model induced by combination of monosodium urate crystals injection and high-fat diet feeding with/without treatment with either colchicine or febuxostat. Serum uric acid (UA), proinflammatory cytokines (i.e., TNF-α and IL-6), xanthine oxidase activity, footpad swelling, and pain threshold were determined to evaluate the gouty severity. Renal histopathological changes, blood urea nitrogen, creatinine, and kidney index were used to reflect renal injury. Lipidomics analysis revealed that an altered triacylglycerol (TAG) profile, impaired mitochondrial function resulting from decreased tetra 18:2 cardiolipin, reduced 4-hydroxyalkenal (HNE) species, and elevated lysophospholipids were already present in the kidneys at the early stage of renal injury, probably contributing to its occurrence and development. In addition to significantly reducing the UA level and relieving the gouty severity, treatment with either colchicine or febuxostat could restore HNE bioavailability, thereby delaying the progression of renal injury. However, neither of them could recover the altered TAG profile and the impaired mitochondrial function, indicating that treatment with either of them could not completely prevent the development of renal injury in the gouty model.

    ReLyMe: Improving Lyric-to-Melody Generation by Incorporating Lyric-Melody Relationships

    Lyric-to-melody generation, which generates melody according to given lyrics, is one of the most important automatic music composition tasks. With the rapid development of deep learning, previous works address this task with end-to-end neural network models. However, deep learning models cannot well capture the strict but subtle relationships between lyrics and melodies, which compromises the harmony between lyrics and generated melodies. In this paper, we propose ReLyMe, a method that incorporates Relationships between Lyrics and Melodies from music theory to ensure the harmony between lyrics and melodies. Specifically, we first introduce several principles that lyrics and melodies should follow in terms of tone, rhythm, and structure relationships. These principles are then integrated into neural network lyric-to-melody models by adding corresponding constraints during the decoding process to improve the harmony between lyrics and melodies. We use a series of objective and subjective metrics to evaluate the generated melodies. Experiments on both English and Chinese song datasets show the effectiveness of ReLyMe, demonstrating the superiority of incorporating lyric-melody relationships from the music domain into neural lyric-to-melody generation. Comment: Accepted by ACMMM 2022, oral

    How Spin Relaxes and Dephases in Bulk Halide Perovskites

    Spintronics in halide perovskites has drawn significant attention in recent years, due to highly tunable spin-orbit fields and intriguing interplay with lattice symmetry. Spin lifetime, a key parameter that determines the applicability of materials for spintronics and spin-based quantum information applications, has been extensively measured in halide perovskites, but not yet assessed from first-principles calculations. Here, we leverage our recently developed ab initio density-matrix dynamics framework to compute the spin relaxation time (T₁) and ensemble spin dephasing time (T₂*) in a prototype halide perovskite, namely CsPbBr₃, with self-consistent spin-orbit coupling (SOC) and quantum descriptions of the electron scattering processes. We also implement the Landé g-factor for solids from first principles and take it into account in our dynamics, which is required to accurately capture spin dephasing at external magnetic fields. We thereby predict intrinsic spin lifetimes as an upper bound for experiments, identify the dominant spin relaxation pathways, and evaluate the dependence on temperature, external fields, carrier density, and impurities. Importantly, we find that the Fröhlich interaction that dominates carrier relaxation contributes negligibly to spin relaxation, consistent with the spin-conserving nature of this interaction. We investigated the effect of spin-orbit field with inversion asymmetry on spin lifetime, and our calculations demonstrate that a persistent spin helix can enhance spin lifetime when the spin splitting is large, but that it cannot be realized by Rashba SOC. Our theoretical approach may lead to new strategies to optimize spin and carrier transport properties in spintronics and quantum information applications. Comment: 10 pages, 6 figures

    MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation

    Pre-trained language models have achieved impressive results in various music understanding and generation tasks. However, existing pre-training methods for symbolic melody generation struggle to capture multi-scale, multi-dimensional structural information in note sequences, due to the domain knowledge discrepancy between text and music. Moreover, the lack of available large-scale symbolic melody datasets limits the pre-training improvement. In this paper, we propose MelodyGLM, a multi-task pre-training framework for generating melodies with long-term structure. We design the melodic n-gram and long span sampling strategies to create local and global blank infilling tasks for modeling the local and global structures in melodies. Specifically, we incorporate pitch n-grams, rhythm n-grams, and their combined n-grams into the melodic n-gram blank infilling tasks for modeling the multi-dimensional structures in melodies. To this end, we have constructed a large-scale symbolic melody dataset, MelodyNet, containing more than 0.4 million melody pieces. MelodyNet is utilized for large-scale pre-training and domain-specific n-gram lexicon construction. Both subjective and objective evaluations demonstrate that MelodyGLM surpasses the standard and previous pre-training methods. In particular, subjective evaluations show that, on the melody continuation task, MelodyGLM gains average improvements of 0.82, 0.87, 0.78, and 0.94 in consistency, rhythmicity, structure, and overall quality, respectively. Notably, MelodyGLM nearly matches the quality of human-composed melodies on the melody inpainting task.

    WuYun: Exploring hierarchical skeleton-guided melody generation using knowledge-enhanced deep learning

    Although deep learning has revolutionized music generation, existing methods for structured melody generation follow an end-to-end left-to-right note-by-note generative paradigm and treat each note equally. Here, we present WuYun, a knowledge-enhanced deep learning architecture for improving the structure of generated melodies, which first generates the most structurally important notes to construct a melodic skeleton and subsequently infills it with dynamically decorative notes into a full-fledged melody. Specifically, we use music domain knowledge to extract melodic skeletons and employ sequence learning to reconstruct them, which serve as additional knowledge to provide auxiliary guidance for the melody generation process. We demonstrate that WuYun can generate melodies with better long-term structure and musicality and outperforms other state-of-the-art methods by 0.51 on average on all subjective evaluation metrics. Our study provides a multidisciplinary lens to design melodic hierarchical structures and bridge the gap between data-driven and knowledge-based approaches for numerous music generation tasks.

    Total genetic contribution assessment across the human genome

    Quantifying the overall magnitude of every single locus' genetic effect on the widely measured human phenome is a great challenge. We introduce a unified modelling technique that can consistently provide a total genetic contribution assessment (TGCA) of a gene or genetic variant without thresholding genetic association signals. Genome-wide TGCA in five UK Biobank phenotype domains highlights loci such as the HLA locus for medical conditions, the bone mineral density locus WNT16 for physical measures, and the skin tanning locus MC1R and smoking behaviour locus CHRNA3 for lifestyle. Tissue-specificity investigation reveals several tissues associated with total genetic contributions, including the brain tissues for mental health. Such associations are driven by tissue-specific gene expressions, which share genetic basis with the total genetic contributions. TGCA can provide a genome-wide atlas for the overall genetic contributions in each particular domain of human complex traits. Quantifying the effects of individual loci on the human phenome is a challenging task. Here, the authors introduce a modelling technique, TGCA, that assesses total genetic contribution per locus and apply this to UK Biobank phenotype domains, revealing top loci and links to tissue-specific gene expression.