5,415 research outputs found

    Predicting Audio Advertisement Quality

    Full text link
    Online audio advertising is a particular form of advertising used abundantly in online music streaming services. In these platforms, which tend to host tens of thousands of unique audio advertisements (ads), providing high quality ads ensures a better user experience and results in longer user engagement. Therefore, the automatic assessment of these ads is an important step toward audio ads ranking and better audio ads creation. In this paper we propose one way to measure the quality of the audio ads using a proxy metric called Long Click Rate (LCR), which is defined by the amount of time a user engages with the follow-up display ad (that is shown while the audio ad is playing) divided by the impressions. We later focus on predicting the audio ad quality using only acoustic features such as harmony, rhythm, and timbre of the audio, extracted from the raw waveform. We discuss how the characteristics of the sound can be connected to concepts such as the clarity of the audio ad message, its trustworthiness, etc. Finally, we propose a new deep learning model for audio ad quality prediction, which outperforms the other discussed models trained on hand-crafted features. To the best of our knowledge, this is the first large-scale audio ad quality prediction study.Comment: WSDM '18 Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 9 page

    The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation

    Full text link
    With recent breakthroughs in artificial neural networks, deep generative models have become one of the leading techniques for computational creativity. Despite very promising progress on image and short sequence generation, symbolic music generation remains a challenging problem since the structure of compositions are usually complicated. In this study, we attempt to solve the melody generation problem constrained by the given chord progression. This music meta-creation problem can also be incorporated into a plan recognition system with user inputs and predictive structural outputs. In particular, we explore the effect of explicit architectural encoding of musical structure via comparing two sequential generative models: LSTM (a type of RNN) and WaveNet (dilated temporal-CNN). As far as we know, this is the first study of applying WaveNet to symbolic music generation, as well as the first systematic comparison between temporal-CNN and RNN for music generation. We conduct a survey for evaluation in our generations and implemented Variable Markov Oracle in music pattern discovery. Experimental results show that to encode structure more explicitly using a stack of dilated convolution layers improved the performance significantly, and a global encoding of underlying chord progression into the generation procedure gains even more.Comment: 8 pages, 13 figure

    Shape-based invariant features extraction for object recognition

    No full text
    International audienceThe emergence of new technologies enables generating large quantity of digital information including images; this leads to an increasing number of generated digital images. Therefore it appears a necessity for automatic systems for image retrieval. These systems consist of techniques used for query specification and re-trieval of images from an image collection. The most frequent and the most com-mon means for image retrieval is the indexing using textual keywords. But for some special application domains and face to the huge quantity of images, key-words are no more sufficient or unpractical. Moreover, images are rich in content; so in order to overcome these mentioned difficulties, some approaches are pro-posed based on visual features derived directly from the content of the image: these are the content-based image retrieval (CBIR) approaches. They allow users to search the desired image by specifying image queries: a query can be an exam-ple, a sketch or visual features (e.g., colour, texture and shape). Once the features have been defined and extracted, the retrieval becomes a task of measuring simi-larity between image features. An important property of these features is to be in-variant under various deformations that the observed image could undergo. In this chapter, we will present a number of existing methods for CBIR applica-tions. We will also describe some measures that are usually used for similarity measurement. At the end, and as an application example, we present a specific ap-proach, that we are developing, to illustrate the topic by providing experimental results

    Beat histogram features for rhythm-based musical genre classification using multiple novelty functions

    Get PDF
    In this paper we present beat histogram features for multiple level rhythm description and evaluate them in a musical genre classification task. Audio features pertaining to various musical content categories and their related novelty functions are extracted as a basis for the creation of beat histograms. The proposed features capture not only amplitude, but also tonal and general spectral changes in the signal, aiming to represent as much rhythmic information as possible. The most and least informative features are identified through feature selection methods and are then tested using Support Vector Machines on five genre datasets concerning classification accuracy against a baseline feature set. Results show that the presented features provide comparable classification accuracy with respect to other genre classification approaches using periodicity histograms and display a performance close to that of much more elaborate up-to-date approaches for rhythm description. The use of bar boundary annotations for the texture frames has provided an improvement for the dance-oriented Ballroom dataset. The comparably small number of descriptors and the possibility of evaluating the influence of specific signal components to the general rhythmic content encourage the further use of the method in rhythm description tasks

    Coronal loop detection from solar images and extraction of salient contour groups from cluttered images.

    Get PDF
    This dissertation addresses two different problems: 1) coronal loop detection from solar images: and 2) salient contour group extraction from cluttered images. In the first part, we propose two different solutions to the coronal loop detection problem. The first solution is a block-based coronal loop mining method that detects coronal loops from solar images by dividing the solar image into fixed sized blocks, labeling the blocks as Loop or Non-Loop , extracting features from the labeled blocks, and finally training classifiers to generate learning models that can classify new image blocks. The block-based approach achieves 64% accuracy in IO-fold cross validation experiments. To improve the accuracy and scalability, we propose a contour-based coronal loop detection method that extracts contours from cluttered regions, then labels the contours as Loop and Non-Loop , and extracts geometric features from the labeled contours. The contour-based approach achieves 85% accuracy in IO-fold cross validation experiments, which is a 20% increase compared to the block-based approach. In the second part, we propose a method to extract semi-elliptical open curves from cluttered regions. Our method consists of the following steps: obtaining individual smooth contours along with their saliency measures; then starting from the most salient contour, searching for possible grouping options for each contour; and continuing the grouping until an optimum solution is reached. Our work involved the design and development of a complete system for coronal loop mining in solar images, which required the formulation of new Gestalt perceptual rules and a systematic methodology to select and combine them in a fully automated judicious manner using machine learning techniques that eliminate the need to manually set various weight and threshold values to define an effective cost function. After finding salient contour groups, we close the gaps within the contours in each group and perform B-spline fitting to obtain smooth curves. Our methods were successfully applied on cluttered solar images from TRACE and STEREO/SECCHI to discern coronal loops. Aerial road images were also used to demonstrate the applicability of our grouping techniques to other contour-types in other real applications
    • …
    corecore