9 research outputs found

    Improving Structure Evaluation Through Automatic Hierarchy Expansion

    Structural segmentation is the task of partitioning a recording into non-overlapping time intervals, and labeling each segment with an identifying marker such as A, B, or verse. Hierarchical structure annotation expands this idea to allow an annotator to segment a song with multiple levels of granularity. While there has been recent progress in developing evaluation criteria for comparing two hierarchical annotations of the same recording, the existing methods have known deficiencies when dealing with inexact label matchings and sequential label repetition. In this article, we investigate methods for automatically enhancing structural annotations by inferring (and expanding) hierarchical information from the segment labels. The proposed method complements existing techniques for comparing hierarchical structural annotations by coarsening or refining labels with variation markers to either collapse similarly labeled segments together, or separate identically labeled segments from each other. Using the multi-level structure annotations provided in the SALAMI dataset, we demonstrate that automatic hierarchy expansion allows structure comparison methods to more accurately assess similarity between annotations.

    Multimodal music information processing and retrieval: survey and future challenges

    To improve performance in various music information processing tasks, recent studies exploit different modalities that capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics, and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the application it addresses. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that the Music Information Retrieval and Sound and Music Computing research communities should focus on in the coming years.

    Multimodal Music Information Processing and Retrieval: Survey and Future Challenges

    To improve performance in various music information processing tasks, recent studies exploit different modalities that capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics, and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval, and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the application it addresses. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that the Music Information Retrieval and Sound and Music Computing research communities should focus on in the coming years.

    Convolutional Methods for Music Analysis


    Soundtrack recommendation for images

    The drastic increase in the production of multimedia content has emphasized the importance of research on its organization and retrieval. In this thesis, we address the problem of music retrieval when a set of images is given as the input query, i.e., the problem of soundtrack recommendation for images. The task at hand is to recommend appropriate music to be played during the presentation of a given set of query images. To tackle this problem, we formulate the hypothesis that the knowledge appropriate for the task is contained in publicly available contemporary movies. Our approach, Picasso, employs similarity search techniques inside the image and music domains, harvesting movies to form a link between the domains. To achieve a fair and unbiased comparison between different soundtrack recommendation approaches, we propose an evaluation benchmark. The evaluation results are reported for Picasso and the baseline approach, using the proposed benchmark. We further address two efficiency aspects that arise from the Picasso approach. First, we investigate the problem of processing top-K queries with set-defined selections and propose an index structure that aims at minimizing the query answering latency. Second, we address the problem of similarity search in high-dimensional spaces and propose two enhancements to the Locality Sensitive Hashing (LSH) scheme. We also investigate the prospects of a distributed similarity search algorithm based on LSH using the MapReduce framework. Finally, we give an overview of PicasSound, a smartphone application based on the Picasso approach.

    The drastic increase in available multimedia content has highlighted the importance of research on its organization and on search within the data. In this doctoral thesis, we consider the problem of finding suitable pieces of music as background music for slideshows. We formulate the hypothesis that the knowledge required for the problem is contained in publicly available contemporary movies. Our approach, Picasso, uses similarity search techniques within the image and music domains to learn, based on movie scenes, a connection between arbitrary images and pieces of music. To achieve a fair and unbiased comparison between different music recommendation approaches, we propose an evaluation benchmark. The evaluation results are presented, using the proposed benchmark, for Picasso and for a further, emotion-based approach. In addition, we address two efficiency aspects that arise from the Picasso approach. (i) We investigate the problem of executing top-K queries in which the result set is restricted ad hoc to a small subset of the entire index. (ii) We address the problem of similarity search in high-dimensional spaces and propose two extensions to the Locality Sensitive Hashing (LSH) scheme. We also investigate the prospects of a distributed similarity search algorithm based on LSH using the MapReduce framework. Besides the aforementioned scientific results, we describe the design and implementation of PicasSound, a smartphone application based on Picasso.

    Prácticas informacionales de los aficionados a la música: el caso del tango y el metal en la ciudad de Medellín

    Purpose: This study aims to understand the information practices of music fans, particularly fans of tango and metal music in the city of Medellín. Design/methodology/approach: Based on three conceptual starting points (social practice theory, the serious leisure perspective, and the information transfer model), a qualitative study with an exploratory and descriptive scope was designed. In-depth interviews and observations were carried out with 18 music fans, 9 of tango music and 9 of metal music. The collected data were analyzed deductively using categories drawn from the conceptual models and compared with the findings of a review of 50 previous studies from the literature. Findings: The studies found in the literature suggest that, to date, music information behaviour has not been studied in a comprehensive manner; rather, each activity has been viewed in isolation from the others. The statements gathered from fans allow us to assert that an information practice is a coherent whole: all its activities are connected to each other and integrated with each person's lifestyle. Personal factors such as general understandings, values and world view have a greater influence on the practice than sociodemographic factors. A model of information practice that describes all these elements is presented. Originality/value: This study proposes a new theoretical and practical approach to the subject. The findings are an original and novel description upon which further studies can be conducted.
