    MusCaps: generating captions for music audio

    Content-based music information retrieval has seen rapid progress with the adoption of deep learning. Current approaches to high-level music description typically make use of classification models, such as in auto tagging or genre and mood classification. In this work, we propose to address music description via audio captioning, defined as the task of generating a natural language description of music audio content in a human-like manner. To this end, we present the first music audio captioning model, MusCaps, consisting of an encoder-decoder with temporal attention. Our method combines convolutional and recurrent neural network architectures to jointly process audio-text inputs through a multimodal encoder and leverages pre-training on audio data to obtain representations that effectively capture and summarise musical features in the input. Evaluation of the generated captions through automatic metrics shows that our method outperforms a baseline designed for non-music audio captioning. Through an ablation study, we unveil that this performance boost can be mainly attributed to pre-training of the audio encoder, while other design choices (modality fusion, decoding strategy and the use of attention) contribute only marginally. Our model represents a shift away from classification-based music description and combines tasks requiring both auditory and linguistic understanding to bridge the semantic gap in music information retrieval

    Contrastive audio-language learning for music

    As one of the most intuitive interfaces known to humans, natural language has the potential to mediate many tasks that involve human-computer interaction, especially in application-focused fields like Music Information Retrieval. In this work, we explore cross-modal learning in an attempt to bridge audio and language in the music domain. To this end, we propose MusCALL, a framework for Music Contrastive Audio-Language Learning. Our approach consists of a dual-encoder architecture that learns the alignment between pairs of music audio and descriptive sentences, producing multimodal embeddings that can be used for text-to-audio and audio-to-text retrieval out-of-the-box. Thanks to this property, MusCALL can be transferred to virtually any task that can be cast as text-based retrieval. Our experiments show that our method performs significantly better than the baselines at retrieving audio that matches a textual description and, conversely, text that matches an audio query. We also demonstrate that the multimodal alignment capability of our model can be successfully extended to the zero-shot transfer scenario for genre classification and auto-tagging on two public datasets

    Learning music audio representations via weak language supervision

    Audio representations for music information retrieval are typically learned via supervised learning in a task-specific fashion. Although effective at producing state-of-the-art results, this scheme lacks flexibility with respect to the range of applications a model can have and requires extensively annotated datasets. In this work, we pose the question of whether it may be possible to exploit weakly aligned text as the only supervisory signal to learn general-purpose music audio representations. To address this question, we design a multimodal architecture for music and language pre-training (MuLaP) optimised via a set of proxy tasks. Weak supervision is provided in the form of noisy natural language descriptions conveying the overall musical content of the track. After pre-training, we transfer the audio backbone of the model to a set of music audio classification and regression tasks. We demonstrate the usefulness of our approach by comparing the performance of audio representations produced by the same audio backbone with different training strategies and show that our pre-training method consistently achieves comparable or higher scores on all tasks and datasets considered. Our experiments also confirm that MuLaP effectively leverages audio-caption pairs to learn representations that are competitive with audio-only and cross-modal self-supervised methods in the literature

    Stratigraphic and petrographic study of the limestones of the La Tomita sector, in the municipality of Manaure-Cesar, Colombia

    Introduction: In the La Tomita sector, municipality of Manaure (Cesar), a stratigraphic sequence of biomicrite, biopelmicrite and pelmicrite limestones of wackestone and packstone type, intercalated with shales, crops out; it corresponds to the Lagunita Formation of the Cogollo Group. Objective: To determine the stratigraphic aspects, mineralogical composition and paleoenvironmental conditions of the outcropping limestones. Methodology: A lithostratigraphic description was made in the massif and samples were taken in situ; twelve samples were extracted, of which seven were selected for petrographic analysis. Results: Wackestone facies with pelagic microfossils, bioclastic packstone, wackestone with worn mollusc bioclasts and wackestone with peloids were recognized. Petrographically, the limestones in this sector are made up of zircon, glauconite, sparite, micrite, pellets, planktonic foraminifera of the genus Heterohelix (species moremani) and foraminifera of the genus Hedbergella (species trocoidea). Bivalve fossils and some algae were also observed. Conclusions: These facies made it possible to establish that the limestones formed in a middle-platform environment with some outer-platform intervals, covering an open-sea facies zone

    The Song Describer dataset: a corpus of audio captions for music-and-language evaluation

    We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models. The dataset consists of 1.1k human-written natural language descriptions of 706 music recordings, all publicly accessible and released under Creative Commons licenses. To showcase the use of our dataset, we benchmark popular models on three key music-and-language tasks (music captioning, text-to-music generation and music-language retrieval). Our experiments highlight the importance of cross-dataset evaluation and offer insights into how researchers can use SDD to gain a broader understanding of model performance

    Flavouring Group Evaluation 76 Revision 2 (FGE.76Rev2): Consideration of sulfur-containing heterocyclic compounds, evaluated by JECFA, structurally related to thiazoles, thiophenes, thiazoline and thienyl derivatives from chemical group 29 and miscellaneous substances from chemical group 30 evaluated by EFSA in FGE.21Rev5

    The Panel on Food Additives and Flavourings (FAF) was requested to consider the JECFA evaluations of 28 flavouring substances in the Flavouring Group Evaluation 76 (FGE.76Rev2). Twenty-one of these substances have been considered in FGE.76Rev1. Seven substances could not be evaluated, because of concerns with respect to genotoxicity. New genotoxicity data have been provided for 4-methyl-5-vinylthiazole [FL-no: 15.018] and 4,5-dimethyl-2-isobutyl-3-thiazoline [FL-no: 15.032], which are representative substances of [FL-no: 15.005] and [FL-no: 15.029, 15.030, 15.130 and 15.131], respectively. The Panel concluded that the concern for genotoxicity is ruled out for [FL-no: 15.018 and 15.005]. The concerns for gene mutations and clastogenicity are ruled out for [FL-no: 15.032, 15.029, 15.030, 15.130 and 15.131]. In vitro, [FL-no: 15.032] induced micronuclei through an aneugenic mode of action. The available in vivo micronucleus study was not adequate to rule out the concern for potential aneugenicity in vivo. The Panel compared the lowest concentration resulting in aneugenicity in vitro with the use levels reported for [FL-no: 15.032]. Based on this comparison, the Panel concluded that the use of [FL-no: 15.032] at the maximum reported use levels does not raise a concern for aneugenicity. Based on structural similarity, for the remaining four substances [FL-no: 15.029, 15.030, 15.130 and 15.131], an aneugenic potential may also be anticipated. Individual genotoxicity data are needed to establish whether they have aneugenic potential. The Panel agrees with JECFA conclusions for 24 flavouring substances 'No safety concern at estimated levels of intake as flavouring substances' when based on the MSDI approach. For six substances, more reliable information on uses and use levels are needed to refine the mTAMDI estimates. For 15 substances, use levels are needed to calculate the mTAMDIs. For [FL-no: 15.109 and 15.113], information on the actual stereochemical composition is inadequate and the conclusion reached for the named substances cannot be applied to the materials of commerce

    The Borexino detector at the Laboratori Nazionali del Gran Sasso

    Borexino, a large volume detector for low energy neutrino spectroscopy, is currently running underground at the Laboratori Nazionali del Gran Sasso, Italy. The main goal of the experiment is the real-time measurement of sub-MeV solar neutrinos, and particularly of the monoenergetic (862 keV) Be7 electron capture neutrinos, via neutrino-electron scattering in an ultra-pure liquid scintillator. This paper is mostly devoted to the description of the detector structure, the photomultipliers, the electronics, and the trigger and calibration systems. The real performance of the detector, which always meets, and sometimes exceeds, design expectations, is also shown. Some important aspects of the Borexino project, i.e. the fluid handling plants, the purification techniques and the filling procedures, are not covered in this paper and are, or will be, published elsewhere (see Introduction and Bibliography). Comment: 37 pages, 43 figures, to be submitted to NI

    The pediatric NAFLD fibrosis index: a predictor of liver fibrosis in children with non-alcoholic fatty liver disease

    Background: Liver fibrosis is a stage of non-alcoholic fatty liver disease (NAFLD) which is responsible for liver-related morbidity and mortality in adults. Accordingly, the search for non-invasive markers of liver fibrosis has been the subject of intensive efforts in adults with NAFLD. Here, we developed a simple algorithm for the prediction of liver fibrosis in children with NAFLD followed at a tertiary care center. Methods: The study included 136 male and 67 female children with NAFLD aged 3.3 to 18.0 years; 141 (69%) of them had fibrosis at liver biopsy. On the basis of biological plausibility, ready availability and evidence from adult studies, we evaluated the following potential predictors of liver fibrosis at bootstrapped stepwise logistic regression: gender, age, body mass index, waist circumference, alanine transaminase, aspartate transaminase, gamma-glutamyl-transferase, albumin, prothrombin time, glucose, insulin, triglycerides and cholesterol. A final model was developed using bootstrapped logistic regression with bias-correction. We used this model to develop the 'pediatric NAFLD fibrosis index' (PNFI), which varies between 0 and 10. Results: The final model was based on age, waist circumference and triglycerides and had an area under the receiver operating characteristic curve of 0.85 (95% bootstrapped confidence interval (CI) with bias correction 0.80 to 0.90) for the prediction of liver fibrosis. A PNFI ≥ 9 (positive likelihood ratio = 28.6, 95% CI 4.0 to 201.0; positive predictive value = 98.5, 95% CI 91.8 to 100.0) could be used to rule in liver fibrosis without performing liver biopsy. Conclusion: PNFI may help clinicians to predict liver fibrosis in children with NAFLD, but external validation is needed before it can be employed for this purpose.

    Scientific Guidance on the data required for the risk assessment of flavourings to be used in or on foods

    Following a request from the European Commission, EFSA developed a new scientific guidance to assist applicants in the preparation of applications for the authorisation of flavourings to be used in or on foods. This guidance applies to applications for a new authorisation as well as for a modification of an existing authorisation of a food flavouring, submitted under Regulation (EC) No 1331/2008. It defines the scientific data required for the evaluation of those food flavourings for which an evaluation and approval is required according to Article 9 of Regulation (EC) No 1334/2008. This applies to flavouring substances, flavouring preparations, thermal process flavourings, flavour precursors, other flavourings and source materials, as defined in Article 3 of Regulation (EC) No 1334/2008. Information to be provided in all applications relates to: (a) the characterisation of the food flavouring, including the description of its identity, manufacturing process, chemical composition, specifications, stability and reaction and fate in foods; (b) the proposed uses and use levels and the assessment of the dietary exposure and (c) the safety data, including information on the genotoxic potential of the food flavouring, toxicological data other than genotoxicity and information on the safety for the environment. For the toxicological studies, a tiered approach is applied, for which the testing requirements, key issues and triggers are described. Applicants should generate the data requested in each section to support the safety assessment of the food flavouring. Based on the submitted data, EFSA will assess the safety of the food flavouring and conclude whether or not it presents risks to human health and to the environment, if applicable, under the proposed conditions of use