
    Vision-based Detection of Acoustic Timed Events: a Case Study on Clarinet Note Onsets

    Acoustic events often have a visual counterpart. Knowledge of visual information can aid the understanding of complex auditory scenes, even when only a stereo mixdown is available in the audio domain, e.g., identifying which musicians are playing in large musical ensembles. In this paper, we consider a vision-based approach to note onset detection. As a case study we focus on challenging, real-world clarinetist videos and carry out preliminary experiments on a 3D convolutional neural network based on multiple streams and purposely avoiding temporal pooling. We release an audiovisual dataset with 4.5 hours of clarinetist videos together with cleaned annotations which include about 36,000 onsets and the coordinates for a number of salient points and regions of interest. By performing several training trials on our dataset, we learned that the problem is challenging. We found that the CNN model is highly sensitive to the optimization algorithm and hyper-parameters, and that treating the problem as binary classification may prevent the joint optimization of precision and recall. To encourage further research, we publicly share our dataset, annotations and all models and detail which issues we came across during our preliminary experiments. Comment: Proceedings of the First International Conference on Deep Learning and Music, Anchorage, US, May, 2017 (arXiv:1706.08675v1 [cs.NE])

    Joint Multi-Pitch Detection Using Harmonic Envelope Estimation for Polyphonic Music Transcription

    In this paper, a method for automatic transcription of music signals based on joint multiple-F0 estimation is proposed. As a time-frequency representation, the constant-Q resonator time-frequency image is employed, while a novel noise suppression technique based on a pink noise assumption is applied in a preprocessing step. In the multiple-F0 estimation stage, the optimal tuning and inharmonicity parameters are computed and a salience function is proposed in order to select pitch candidates. For each pitch candidate combination, an overlapping partial treatment procedure is used, which is based on a novel spectral envelope estimation procedure for the log-frequency domain, in order to compute the harmonic envelope of candidate pitches. In order to select the optimal pitch combination for each time frame, a score function is proposed which combines spectral and temporal characteristics of the candidate pitches and also aims to suppress harmonic errors. For postprocessing, hidden Markov models (HMMs) and conditional random fields (CRFs) trained on MIDI data are employed, in order to boost transcription accuracy. The system was trained on isolated piano sounds from the MAPS database and was tested on classic and jazz recordings from the RWC database, as well as on recordings from a Disklavier piano. A comparison with several state-of-the-art systems is provided using a variety of error metrics, where encouraging results are indicated.

    Algorithmic Clustering of Music

    We present a fully automatic method for music classification, based only on compression of strings that represent the music pieces. The method uses no background knowledge about music whatsoever: it is completely general and can, without change, be used in different areas like linguistic classification and genomics. It is based on an ideal theory of the information content in individual objects (Kolmogorov complexity), information distance, and a universal similarity metric. Experiments show that the method distinguishes reasonably well between various musical genres and can even cluster pieces by composer. Comment: 17 pages, 11 figures
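
Kolmogorov complexity is not computable, so a practical version of a compression-based similarity has to substitute the output length of a real compressor. The following is a minimal sketch of that idea, assuming the normalized compression distance formulation and using `zlib` as a stand-in compressor; the abstract names neither, and the function names here are hypothetical, chosen only for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (a rough proxy for complexity)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Assumed formula: (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed size. Values near 0 suggest similar
    inputs, values near 1 suggest unrelated inputs.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(pieces: dict[str, bytes]) -> dict[tuple[str, str], float]:
    """Pairwise distances between named pieces, keyed by (name_a, name_b)."""
    names = sorted(pieces)
    return {(a, b): ncd(pieces[a], pieces[b]) for a in names for b in names}
```

The resulting pairwise distances could then be passed to any standard clustering routine. How the pieces are encoded as strings and which compressor is used both affect the outcome, and the choices above are assumptions rather than the authors' setup.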

    Concept Blending and Dissimilarity: Factors for Creative Design Process: A Comparison between the Linguistic Interpretation Process and Design Process

    This study investigated the design process in order to clarify the characteristics of the essence of the creative design process vis-à-vis the interpretation process, by carrying out design experiments. The authors analyzed the characteristics of the creative design process by comparing it with the linguistic interpretation process, from the viewpoints of thought types (analogy, blending, and thematic relation) and recognition types (commonalities and alignable and nonalignable differences). A new concept can be created by using the noun-noun phrase as the process of synthesizing two concepts, the simplest and most essential process in formulating a new concept from existing ones. Furthermore, the noun-noun phrase can be interpreted in a natural way. In our experiment, the subjects were required to interpret a novel noun-noun phrase, create a design concept from the same noun-noun phrase, and list the similarities and dissimilarities between the two nouns. The authors compare the results of the thought types and recognition types, focusing on the perspective of the manner in which things were viewed, i.e., in terms of similarities and dissimilarities. A comparison of the results reveals that blending and nonalignable differences characterize the creative design process. The findings of this research will contribute a framework of design practice, to enhance both students’ and designers’ creativity for concept formation in design, which relates to the development of innovative design. Keywords: Noun-Noun phrase; Design; Creativity; Blending; Nonalignable difference