12 research outputs found

    On the Effectiveness of Speech Self-supervised Learning for Music

    Self-supervised learning (SSL) has shown promising results in various speech and natural language processing applications. However, its efficacy in music information retrieval (MIR) remains largely unexplored. While previous SSL models pre-trained on music recordings have mostly been closed-source, recent speech models such as wav2vec2.0 have shown promise in music modelling. Nevertheless, research exploring the effectiveness of applying speech SSL models to music recordings has been limited. We explore the music adaptation of SSL with two distinctive speech-related models, data2vec1.0 and HuBERT, and refer to them as music2vec and musicHuBERT, respectively. We train 12 SSL models with 95M parameters under various pre-training configurations and systematically evaluate their performance on 13 different MIR tasks. Our findings suggest that training with music data can generally improve performance on MIR tasks, even when models are trained using paradigms designed for speech. However, we identify the limitations of such existing speech-oriented designs, especially in modelling polyphonic information. Based on the experimental results, empirical suggestions are also given for designing future musical SSL strategies and paradigms.

    MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training

    Self-supervised learning (SSL) has recently emerged as a promising paradigm for training generalisable models on large-scale data in the fields of vision, text, and speech. Although SSL has been proven effective in speech and audio, its application to music audio has yet to be thoroughly explored. This is primarily due to the distinctive challenges associated with modelling musical knowledge, particularly its tonal and pitched characteristics. To address this research gap, we propose an acoustic Music undERstanding model with large-scale self-supervised Training (MERT), which incorporates teacher models to provide pseudo labels in the masked language modelling (MLM) style acoustic pre-training. In our exploration, we identified a superior combination of teacher models, which outperforms conventional speech and audio approaches. This combination includes an acoustic teacher based on Residual Vector Quantization - Variational AutoEncoder (RVQ-VAE) and a musical teacher based on the Constant-Q Transform (CQT). These teachers effectively guide our student model, a BERT-style transformer encoder, to better model music audio. In addition, we introduce an in-batch noise mixture augmentation to enhance the representation robustness. Furthermore, we explore a wide range of settings to overcome the instability in acoustic language model pre-training, which allows our designed paradigm to scale from 95M to 330M parameters. Experimental results indicate that our model can generalise and perform well on 14 music understanding tasks and attains state-of-the-art (SOTA) overall scores. The code and models are online: https://github.com/yizhilll/MERT

    Relationship between Novel Anthropometric Indices and the Prevalence of Abdominal Aortic Calcification: A Large Cross-Sectional Study

    Background: The relationship of novel anthropometric indices, specifically a body shape index (ABSI) and body roundness index (BRI), with abdominal aortic calcification (AAC) or severe AAC (SAAC) is unclear. The aim of our study was therefore to investigate possible relationships between novel anthropometric indices and prevalence of AAC and SAAC. Methods: We obtained U.S. general population data from the National Health and Nutrition Examination Survey between 2013 and 2014. The study used restricted cubic spline (RCS) analysis, multivariable logistic regression modeling, subgroup analysis, and receiver operating characteristic (ROC) curve assessment. We investigated relationships between ABSI or BRI and AAC and SAAC risk. Associations between ABSI or BRI and the degree of AAC were also evaluated using a generalized additive model. Results: The study cohort comprised 1062 individuals. The RCS plots revealed a U-shaped curve associating ABSI with AAC risk. A similar trend emerged for SAAC, where the risk initially increased before subsequently decreasing with rising ABSI levels. Additionally, BRI exhibited a positive correlation with both AAC and SAAC risk. As ABSI and BRI values increased, the degree of AAC also increased. In ROC analysis, ABSI displayed a significantly larger area under the curve compared to BRI. Conclusions: ABSI is associated with AAC prevalence following a U-shaped curve. Additionally, BRI is positively correlated with AAC risk. ABSI demonstrates a superior discriminative ability for AAC compared to BRI. Therefore, maintaining an appropriate ABSI and BRI may reduce the prevalence of AAC.
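
The abstract above does not state how ABSI and BRI are computed. As a minimal sketch, assuming the commonly cited definitions (ABSI = WC / (BMI^(2/3) * height^(1/2)) and BRI = 364.2 - 365.5 * sqrt(1 - (WC / 2π)² / (0.5 * height)²), with waist circumference WC and height in metres), the two indices could be calculated as follows. The function names and the sample measurements are hypothetical and are not taken from the study.

```python
import math


def absi(waist_m: float, height_m: float, weight_kg: float) -> float:
    """A body shape index, assuming the commonly cited definition.

    Waist circumference and height are in metres, weight in kilograms.
    """
    bmi = weight_kg / height_m ** 2  # body mass index in kg/m^2
    return waist_m / (bmi ** (2 / 3) * math.sqrt(height_m))


def bri(waist_m: float, height_m: float) -> float:
    """Body roundness index, assuming the commonly cited definition.

    Models the torso as an ellipse; requires waist_m < pi * height_m,
    otherwise the square root is undefined and math.sqrt raises ValueError.
    """
    ratio = (waist_m / (2 * math.pi)) ** 2 / (0.5 * height_m) ** 2
    return 364.2 - 365.5 * math.sqrt(1 - ratio)


if __name__ == "__main__":
    # Hypothetical measurements for illustration only.
    print(f"ABSI: {absi(0.90, 1.75, 75.0):.4f}")
    print(f"BRI:  {bri(0.90, 1.75):.2f}")
```

Both functions take raw anthropometric measurements rather than a precomputed BMI, so the units stay explicit; how the study itself derived the indices from the survey data is not described in the abstract.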

    Switchable Kirigami Structures as Window Envelopes for Energy-Efficient Buildings

    Efficient regulation of thermal radiation is an effective way to reduce the energy consumption of buildings. Because windows are the least energy-efficient part of buildings, regulating their thermal radiation is in high demand, especially in changing environments, but remains a challenge. Here, by employing a kirigami structure, we design a variable-angle thermal reflector as a transparent window envelope for thermal radiation modulation. The envelope can be easily switched between heating and cooling modes by loading different pre-stresses, which endows the envelope windows with the ability to regulate temperature: in the outdoor test, the interior temperature of a building model was reduced by ~3.3 °C in cooling mode and increased by ~3.9 °C in heating mode. The improved thermal management of windows by the adaptive envelope provides additional heating, ventilation, and air-conditioning energy savings of 13% to 29% per year for buildings located in different climate zones around the world, making kirigami envelope windows a promising approach to energy saving.