43 research outputs found

    Predicting Audio Advertisement Quality

    Full text link
    Online audio advertising is a particular form of advertising used abundantly in online music streaming services. These platforms tend to host tens of thousands of unique audio advertisements (ads), and providing high-quality ads ensures a better user experience and results in longer user engagement. Therefore, the automatic assessment of these ads is an important step toward audio ad ranking and better audio ad creation. In this paper we propose one way to measure the quality of audio ads using a proxy metric called Long Click Rate (LCR), defined as the amount of time a user engages with the follow-up display ad (shown while the audio ad is playing) divided by the number of impressions. We then focus on predicting audio ad quality using only acoustic features such as harmony, rhythm, and timbre, extracted from the raw waveform. We discuss how the characteristics of the sound can be connected to concepts such as the clarity of the audio ad's message, its trustworthiness, and so on. Finally, we propose a new deep learning model for audio ad quality prediction, which outperforms the other discussed models trained on hand-crafted features. To the best of our knowledge, this is the first large-scale audio ad quality prediction study. Comment: WSDM '18 Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 9 pages
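
    The abstract gives LCR only as a verbal definition. A minimal sketch of that metric as stated (time spent engaging with the follow-up display ad, divided by the number of impressions) might look like the following Python; the record layout and field names (AdLog, engagement_seconds, impressions) are hypothetical and not from the paper.

        # Hypothetical per-ad log record; the paper does not specify a schema.
        from dataclasses import dataclass

        @dataclass
        class AdLog:
            ad_id: str
            impressions: int            # times the audio ad was served
            engagement_seconds: float   # cumulative dwell time on the display ad

        def long_click_rate(log: AdLog) -> float:
            """Engagement time per impression; higher suggests a better ad."""
            if log.impressions == 0:
                return 0.0
            return log.engagement_seconds / log.impressions

        print(long_click_rate(AdLog("ad42", impressions=1000, engagement_seconds=5400.0)))
        # -> 5.4 seconds of display-ad engagement per impression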

    Sequential decision making in artificial musical intelligence

    Get PDF
    Over the past 60 years, artificial intelligence has grown from a largely academic field of research to a ubiquitous array of tools and approaches used in everyday technology. Despite its many recent successes and growing prevalence, certain meaningful facets of computational intelligence have not been as thoroughly explored. Such facets cover a wide array of complex mental tasks which humans carry out easily, yet which are difficult for computers to mimic. A prime example of a domain in which human intelligence thrives, but machine understanding is still fairly limited, is music. Over the last decade, many researchers have applied computational tools to carry out tasks such as genre identification, music summarization, music database querying, and melodic segmentation. While these are all useful algorithmic solutions, we are still a long way from constructing complete music agents able to mimic (at least partially) the complexity with which humans approach music. One key aspect which has not been sufficiently studied is that of sequential decision making in musical intelligence. This thesis strives to answer the following question: Can a sequential decision making perspective guide us in the creation of better music agents, and social agents in general? And if so, how? More specifically, this thesis focuses on two aspects of musical intelligence: music recommendation and human-agent (and more generally agent-agent) interaction in the context of music. The key contributions of this thesis are the design of better music playlist recommendation algorithms; the design of algorithms for tracking user preferences over time; new approaches for modeling people's behavior in situations that involve music; and the design of agents capable of meaningful interaction with humans and other agents in a setting where music plays a role (either directly or indirectly). Though motivated primarily by music-related tasks, and focusing largely on people's musical preferences, this thesis also establishes that insights from music-specific case studies can be applicable in other concrete social domains, such as different types of content recommendation. Showing the generality of insights from musical data in other contexts serves as evidence for the utility of music domains as testbeds for the development of general artificial intelligence techniques. Ultimately, this thesis demonstrates the overall usefulness of taking a sequential decision making approach in settings previously unexplored from this perspective.
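
    The abstract names no concrete algorithm, so the following is only an illustrative sketch of the framing the thesis advocates: playlist generation cast as a sequential decision problem, here as a toy epsilon-greedy bandit that updates a per-track preference estimate from listener feedback. All names and the reward signal are invented for illustration.

        import random

        class PlaylistAgent:
            """Toy sequential decision maker: pick a track, observe feedback."""

            def __init__(self, tracks, epsilon=0.1):
                self.epsilon = epsilon
                self.value = {t: 0.0 for t in tracks}  # estimated preference per track
                self.plays = {t: 0 for t in tracks}

            def pick_next(self):
                if random.random() < self.epsilon:          # explore
                    return random.choice(list(self.value))
                return max(self.value, key=self.value.get)  # exploit current estimate

            def observe(self, track, reward):
                # Incremental mean: nudges the estimate toward the observed feedback.
                self.plays[track] += 1
                self.value[track] += (reward - self.value[track]) / self.plays[track]

        agent = PlaylistAgent(["song_a", "song_b", "song_c"])
        for _ in range(100):
            track = agent.pick_next()
            agent.observe(track, reward=random.random())  # stand-in for real feedback

    Tracking preferences that drift over time, which the thesis also addresses, would swap the incremental mean for a constant step size so that older feedback is discounted.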

    Signal processing methods for beat tracking, music segmentation, and audio retrieval

    Get PDF
    The goal of music information retrieval (MIR) is to develop novel strategies and techniques for organizing, exploring, accessing, and understanding music data in an efficient manner. The conversion of waveform-based audio data into semantically meaningful feature representations by means of digital signal processing techniques is at the center of MIR and constitutes a difficult field of research because of the complexity and diversity of music signals. In this thesis, we introduce novel signal processing methods that allow for extracting musically meaningful information from audio signals. As our main strategy, we exploit musical knowledge about the signals' properties to derive feature representations that show a significant degree of robustness against musical variations but still exhibit a high musical expressiveness. We apply this general strategy to three different areas of MIR. Firstly, we introduce novel techniques for extracting tempo and beat information, where we particularly consider challenging music with changing tempo and soft note onsets. Secondly, we present novel algorithms for the automated segmentation and analysis of folk song field recordings, where one has to cope with significant fluctuations in intonation and tempo as well as recording artifacts. Thirdly, we explore a cross-version approach to content-based music retrieval based on the query-by-example paradigm. In all three areas, we focus on application scenarios where strong musical variations make the extraction of musically meaningful information a challenging task.
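
    The thesis develops its own beat-tracking methods; purely as an off-the-shelf illustration of the task itself (not of the techniques proposed here), librosa's onset-strength-based tracker estimates tempo and beat positions from a waveform. The file name is a placeholder, and librosa must be installed separately.

        import librosa

        y, sr = librosa.load("song.wav")                      # waveform and sample rate
        onset_env = librosa.onset.onset_strength(y=y, sr=sr)  # note-onset salience curve
        tempo, beats = librosa.beat.beat_track(onset_envelope=onset_env, sr=sr)
        beat_times = librosa.frames_to_time(beats, sr=sr)     # beat frames -> seconds
        print(f"estimated tempo: {float(tempo):.1f} BPM, first beats: {beat_times[:4]}")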

    An analysis of factors contributing to sixth-grade students' selective attention to music elements: melodic contour, timbre, rhythm, and tempo; and variables associated with demographics, self-perception, music background, music genre, and temporal difference.

    Get PDF
    Two research questions were formulated for the present study: (1) Are there significant differences (p < .05) among sixth-grade participants' selective attention to music elements as affected by variables associated with music genre and temporal difference? (2) To what extent do the following variables significantly predict (p < .05) sixth-grade participants' selective attention to melodic contour, timbre, rhythm, and tempo: demographics, self-perception, music background, music genre, and temporal difference? Subjects (N = 87), sixth-grade students from a suburban middle school in the Fulton County Public Schools of Atlanta, Georgia, completed the Music Background Questionnaire II (MBQII), the Self-Perception Profile for Adolescents (SPPA), and the Music Element Profile (MEP). The first research question was analyzed using a three-way repeated-measures analysis of variance. Regarding differences among selective attention to music elements, participants rated rhythm (M = 5.15) significantly higher (p < .01) than melodic contour (M = 4.74), timbre (M = 4.87), or tempo (M = 4.82). Regarding differences among music genres, participants rated rhythm and blues (M = 5.12) significantly higher than jazz (M = 4.83; p < .05) or classical (M = 4.66; p < .01), and rated rock (M = 4.98) significantly higher (p < .01) than classical (M = 4.66). Regarding differences between fast and slow tempi, participants did not rate fast tempi (M = 4.94) significantly differently from slow tempi (M = 4.86). Significant two-way interaction effects (p < .05) were found for music elements by genre (p = .006), for music elements by temporal difference (p = .002), and for music genre by temporal difference (p < .001). No significant three-way interaction effect was found among music elements, music genre, and temporal difference. Data from the MEP, MBQII, SPPA, and the demographic information were analyzed in four multiple regression procedures, each placing a different music element as the dependent variable. Classical and rock were found to be the best predictors (p < .001) of melodic contour. Fast tempi were found to be the best predictor (p < .001) of timbre. Classical, rock, rhythm and blues, jazz, and fast tempi were found to be the best predictors (p < .05) of rhythm. Jazz and fast tempi were found to be the best predictors (p < .05) of tempo. From the results of both research questions, conclusions were drawn to provide suggestions for future research.
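
    As a structural illustration of the first analysis only (not a reproduction of the study's data), a three-way repeated-measures ANOVA with this design can be run with statsmodels' AnovaRM on long-format data. The ratings below are synthetic and the column names are invented; only the N = 87 and the factor levels follow the abstract.

        import itertools
        import numpy as np
        import pandas as pd
        from statsmodels.stats.anova import AnovaRM

        rng = np.random.default_rng(0)
        elements = ["contour", "timbre", "rhythm", "tempo"]
        genres = ["classical", "jazz", "rnb", "rock"]
        tempi = ["fast", "slow"]

        # One synthetic rating per subject per cell (balanced, as AnovaRM requires).
        rows = [
            {"subject": s, "element": e, "genre": g, "tempo": t,
             "rating": rng.normal(5.0, 1.0)}
            for s in range(87)
            for e, g, t in itertools.product(elements, genres, tempi)
        ]
        df = pd.DataFrame(rows)

        res = AnovaRM(df, depvar="rating", subject="subject",
                      within=["element", "genre", "tempo"]).fit()
        print(res)  # F and p for each main effect and interaction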

    Proceedings of the 7th Sound and Music Computing Conference

    Get PDF
    Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21-24, 2010

    Methods for large-scale data analyses of regional language variation based on speech acoustics

    Get PDF

    An integrative computational modelling of music structure apprehension

    Get PDF