
    Multimodal music information processing and retrieval: survey and future challenges

    Full text link
    Towards improving the performance of various music information processing tasks, recent studies exploit different modalities that capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics, and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the applications addressed. Subsequently, we analyze existing information fusion approaches and conclude with the set of challenges that the Music Information Retrieval and Sound and Music Computing research communities should focus on in the coming years.
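
    The survey's fusion taxonomy is not reproduced here, but a minimal sketch of the two families commonly contrasted in this literature (early fusion by feature concatenation versus late fusion by combining per-modality classifier outputs) may help ground the term; the feature dimensions, task, and classifiers below are illustrative assumptions, not examples from the paper.

```python
# Illustrative sketch of early vs. late information fusion for two modalities
# (e.g., audio features and lyrics features). Shapes, labels, and classifiers
# are hypothetical stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
audio_feats = rng.normal(size=(n, 20))   # stand-in audio descriptors
lyrics_feats = rng.normal(size=(n, 50))  # stand-in lyrics descriptors
labels = rng.integers(0, 2, size=n)      # stand-in binary task (e.g., mood)

# Early fusion: concatenate modality features, train a single classifier.
early_X = np.hstack([audio_feats, lyrics_feats])
early_clf = LogisticRegression(max_iter=1000).fit(early_X, labels)

# Late fusion: train one classifier per modality, then combine their scores.
audio_clf = LogisticRegression(max_iter=1000).fit(audio_feats, labels)
lyrics_clf = LogisticRegression(max_iter=1000).fit(lyrics_feats, labels)
late_scores = 0.5 * (audio_clf.predict_proba(audio_feats)[:, 1]
                     + lyrics_clf.predict_proba(lyrics_feats)[:, 1])
late_pred = (late_scores >= 0.5).astype(int)
```

    Early fusion lets a single model learn cross-modal interactions, while late fusion keeps modalities decoupled and degrades more gracefully when one modality is missing.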

    Improving Structure Evaluation Through Automatic Hierarchy Expansion

    Get PDF
    Structural segmentation is the task of partitioning a recording into non-overlapping time intervals and labeling each segment with an identifying marker such as A, B, or verse. Hierarchical structure annotation expands this idea to allow an annotator to segment a song at multiple levels of granularity. While there has been recent progress in developing evaluation criteria for comparing two hierarchical annotations of the same recording, the existing methods have known deficiencies when dealing with inexact label matching and sequential label repetition. In this article, we investigate methods for automatically enhancing structural annotations by inferring (and expanding) hierarchical information from the segment labels. The proposed method complements existing techniques for comparing hierarchical structural annotations by coarsening or refining labels with variation markers, either collapsing similarly labeled segments together or separating identically labeled segments from each other. Using the multi-level structure annotations provided in the SALAMI dataset, we demonstrate that automatic hierarchy expansion allows structure comparison methods to more accurately assess similarity between annotations.
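
    The coarsening and refinement operations described above can be illustrated with a small sketch (not the authors' implementation); the prime-mark convention for variation labels is an assumption. Coarsening strips variation markers so that variants collapse into one label, while refinement appends an occurrence index so that repeated identical labels become distinguishable.

```python
# Illustrative sketch of the two label transformations described above;
# treating a trailing prime (') as the variation marker is an assumption.
def coarsen(labels):
    """Collapse variation markers, e.g. ["A", "A'", "B"] -> ["A", "A", "B"]."""
    return [label.rstrip("'") for label in labels]

def refine(labels):
    """Separate repeated labels, e.g. ["A", "B", "A"] -> ["A1", "B1", "A2"]."""
    counts = {}
    refined = []
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
        refined.append(f"{label}{counts[label]}")
    return refined

segment_labels = ["A", "A'", "B", "A", "B"]
print(coarsen(segment_labels))  # ['A', 'A', 'B', 'A', 'B']
print(refine(segment_labels))   # ['A1', "A'1", 'B1', 'A2', 'B2']
```

    Roughly speaking, stacking the coarsened, original, and refined label sequences yields the expanded multi-level hierarchy that hierarchical comparison measures can then consume.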

    The Skipping Behavior of Users of Music Streaming Services and its Relation to Musical Structure

    Full text link
    The behavior of users of music streaming services is investigated from the point of view of the temporal dimension of individual songs; specifically, the main object of the analysis is the point in time within a song at which users stop listening and start streaming another song ("skip"). The main contribution of this study is the ascertainment of a correlation between the distribution in time of skipping events and the musical structure of songs. It is also shown that this distribution is not only specific to individual songs, but also independent of the cohort of users and, under stationary conditions, of the date of observation. Finally, user behavioral data is used to train a predictor of the musical structure of a song solely from its acoustic content; it is shown that the use of such data, available in large quantities to music streaming services, yields significant improvements in accuracy over the customary approach of training this class of algorithms, in which only smaller amounts of hand-labeled data are available.
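
    To make the central quantity concrete, the sketch below (an assumed setup for illustration, not the paper's code) bins skip times normalized by track duration into an empirical distribution and checks how many skips fall near annotated section boundaries; all numbers are made-up example values.

```python
# Illustrative sketch: empirical distribution of within-song skip positions.
# Durations, skip times, and boundaries are invented example values.
import numpy as np

track_duration = 215.0                      # seconds
skip_times_sec = np.array([12.3, 14.1, 58.0, 60.2, 61.5, 140.7, 142.0])
boundaries_sec = np.array([15.0, 60.0, 105.0, 150.0])  # annotated section starts

# Normalize skip positions to [0, 1] and bin them into a histogram.
positions = skip_times_sec / track_duration
hist, bin_edges = np.histogram(positions, bins=20, range=(0.0, 1.0), density=True)

# Fraction of skips falling within +/- 2 s of a structural boundary.
near_boundary = np.abs(skip_times_sec[:, None] - boundaries_sec[None, :]) <= 2.0
frac_near = near_boundary.any(axis=1).mean()
print(f"{frac_near:.0%} of skips occur within 2 s of a section boundary")
```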

    Pitchclass2vec: Symbolic Music Structure Segmentation with Chord Embeddings

    Get PDF
    Structure perception is a fundamental aspect of music cognition in humans. Historically, the hierarchical organization of music into structures has served as a narrative device for conveying meaning, creating expectancy, and evoking emotions in the listener. Musical structures therefore play an essential role in music composition, as they shape the musical discourse through which the composer organizes their ideas. In this paper, we present a novel music segmentation method, pitchclass2vec, based on symbolic chord annotations, which are embedded into continuous vector representations using both natural language processing techniques and custom-made encodings. Our algorithm is based on a long short-term memory (LSTM) neural network and outperforms state-of-the-art techniques based on symbolic chord annotations.
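
    The general architecture family the abstract describes, chord symbols embedded into vectors and fed to an LSTM that predicts segment boundaries, might look roughly like the sketch below; the vocabulary size, dimensions, and per-chord boundary-tagging formulation are assumptions, not the published pitchclass2vec model.

```python
# Minimal sketch of chord-embedding + LSTM segmentation (assumed setup,
# not the published pitchclass2vec architecture).
import torch
import torch.nn as nn

class ChordSegmenter(nn.Module):
    def __init__(self, vocab_size=200, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # chord symbol -> vector
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_dim, 2)           # boundary / non-boundary

    def forward(self, chord_ids):                          # (batch, seq_len) int ids
        x = self.embed(chord_ids)
        out, _ = self.lstm(x)
        return self.head(out)                              # (batch, seq_len, 2) logits

model = ChordSegmenter()
dummy_chords = torch.randint(0, 200, (1, 32))              # one sequence of 32 chords
logits = model(dummy_chords)
boundary_probs = logits.softmax(dim=-1)[..., 1]            # per-chord boundary probability
```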

    Crowdsourcing Emotions in Music Domain

    Get PDF
    An important source of intelligence for music emotion recognition today is user-provided community tags about songs or artists. Recent crowdsourcing approaches, such as harvesting social tags, designing collaborative games and web services, or using Mechanical Turk, are becoming popular in the literature. They provide a cheap, quick, and efficient method, in contrast to professional labeling of songs, which is expensive and does not scale to large datasets. In this paper we discuss the viability of various crowdsourcing instruments, providing examples from published research. We also share our own experience, illustrating the steps we followed using tags collected from Last.fm to create two music mood datasets, which we have made public. While processing Last.fm affect tags, we observed that they tend to be biased towards positive emotions; the resulting datasets thus contain more positive songs than negative ones.
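
    As a small, hypothetical illustration of the kind of tag processing described above (the actual lexicon, weights, and thresholds behind the published datasets are not shown here), the sketch maps Last.fm-style affect tags onto positive and negative mood categories, the step at which the positive bias mentioned in the abstract becomes visible.

```python
# Hypothetical sketch of sorting collected affect tags into mood categories.
# The lexicon and example tags are illustrative, not the authors' actual data.
from collections import Counter

POSITIVE = {"happy", "upbeat", "joyful", "fun", "relaxing", "uplifting"}
NEGATIVE = {"sad", "melancholic", "depressing", "dark", "angry", "gloomy"}

def mood_counts(track_tags):
    """Count positive vs. negative affect tags for a list of (tag, weight) pairs."""
    counts = Counter()
    for tag, weight in track_tags:
        tag = tag.lower().strip()
        if tag in POSITIVE:
            counts["positive"] += weight
        elif tag in NEGATIVE:
            counts["negative"] += weight
    return counts

example = [("happy", 100), ("upbeat", 80), ("sad", 10), ("guitar", 60)]
print(mood_counts(example))  # Counter({'positive': 180, 'negative': 10})
```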

    Development of the Trubadur Platform and New Challenges in the Coming Years

    Get PDF
    Trubadur is an open-source platform for ear training with automated rhythmic and interval dictation exercises. We evaluated the platform with students of the Conservatory of Music and Ballet Ljubljana during the 2018/19–2020/21 school years. The evaluation results showed that using the platform can improve test performance and that it serves as a complement to distance learning.

    Feature selection for content-based, time-varying musical emotion regression

    Full text link

    Automatic Personalized Playlist Generation

    Get PDF
    This master's thesis presents a study of approaches to the problem of automatic personalized playlist generation. In addition to a brief overview of the theoretical background, we document our own approach: the experiments we carried out and their results. Our algorithm consists of two main parts: constructing a playlist evaluation function and selecting a playlist generation strategy. For the first task, we chose a Naive Bayes classifier together with a five-element vector of content-based audio attributes computed with the MIRtoolbox toolkit, which classifies a playlist as good or bad with 82% accuracy, well above a random classifier (50%). For the second problem, we tried three generation algorithms: Shuffle, Randomized Search, and a Genetic Algorithm. According to the experimental results, the Randomized Search algorithm performs best and fastest. All experiments were carried out on playlists of 5 and 10 elements. In summary, we have developed an automatic personalized playlist generation algorithm that, according to our evaluations, matches user expectations better than random shuffling. The algorithm can also be used to construct more complex playlists.
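
    The two-part design described above (a playlist scoring function plus a generation strategy) can be sketched roughly as below; the features, the Gaussian Naive Bayes variant, and the way track attributes are aggregated into a five-element playlist vector are assumptions for illustration, not the thesis code.

```python
# Rough sketch of a Naive Bayes playlist scorer plus randomized search over
# candidate playlists. Features, labels, and aggregation are illustrative.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(42)

# Pretend library: each track described by 5 content-based audio attributes.
library = rng.normal(size=(500, 5))

# Training data: 5-element feature vectors of past playlists labeled good/bad.
train_X = rng.normal(size=(200, 5))
train_y = rng.integers(0, 2, size=200)
scorer = GaussianNB().fit(train_X, train_y)

def playlist_features(track_indices):
    """Aggregate track attributes into one 5-element playlist vector (mean here)."""
    return library[track_indices].mean(axis=0)

def randomized_search(n_candidates=1000, length=10):
    """Sample random playlists and keep the one the scorer rates best."""
    best, best_score = None, -1.0
    for _ in range(n_candidates):
        candidate = rng.choice(len(library), size=length, replace=False)
        score = scorer.predict_proba([playlist_features(candidate)])[0, 1]
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

playlist, score = randomized_search()
```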
