2,227 research outputs found
Frequency shifting approach towards textual transcription of heartbeat sounds
Auscultation is an approach for diagnosing many cardiovascular problems. Automatic analysis of heartbeat sounds and extraction of its audio features can assist physicians towards diagnosing diseases. Textual transcription allows recording a continuous heart sound stream using a text format which can be stored in very small memory in comparison with other audio formats. In addition, a text-based data allows applying indexing and searching techniques to access to the critical events. Hence, the transcribed heartbeat sounds provides useful information to monitor the behavior of a patient for the long duration of time. This paper proposes a frequency shifting method in order to improve the performance of the transcription. The main objective of this study is to transfer the heartbeat sounds to the music domain. The proposed technique is tested with 100 samples which were recorded from different heart diseases categories. The observed results show that, the proposed shifting method significantly improves the performance of the transcription
A Feature Learning Siamese Model for Intelligent Control of the Dynamic Range Compressor
In this paper, a siamese DNN model is proposed to learn the characteristics
of the audio dynamic range compressor (DRC). This facilitates an intelligent
control system that uses audio examples to configure the DRC, a widely used
non-linear audio signal conditioning technique in the areas of music
production, speech communication and broadcasting. Several alternative siamese
DNN architectures are proposed to learn feature embeddings that can
characterise subtle effects due to dynamic range compression. These models are
compared with each other as well as handcrafted features proposed in previous
work. The evaluation of the relations between the hyperparameters of DNN and
DRC parameters are also provided. The best model is able to produce a universal
feature embedding that is capable of predicting multiple DRC parameters
simultaneously, which is a significant improvement from our previous research.
The feature embedding shows better performance than handcrafted audio features
when predicting DRC parameters for both mono-instrument audio loops and
polyphonic music pieces.Comment: 8 pages, accepted in IJCNN 201
Automatic comparison of global children’s and adult songs supports a sensorimotor hypothesis for the origin of musical scales
Music throughout the world varies greatly, yet some musical features like scale structure display striking crosscultural similarities. Are there musical laws or biological constraints that underlie this diversity? The “vocal mistuning” hypothesis proposes that cross-cultural regularities in musical scales arise from imprecision in vocal tuning, while the integer-ratio hypothesis proposes that they arise from perceptual principles based on psychoacoustic consonance. In order to test these hypotheses, we conducted automatic comparative analysis of 100 children’s and adult songs from throughout the world. We found that children’s songs tend to have narrower melodic range, fewer scale degrees, and less precise intonation than adult songs, consistent with motor limitations due to their earlier developmental stage. On the other hand, adult and children’s songs share some common tuning intervals at small-integer ratios, particularly the perfect 5th (~3:2 ratio). These results suggest that some widespread aspects of musical scales may be caused by motor constraints, but also suggest that perceptual preferences for simple integer ratios might contribute to cross-cultural regularities in scale structure. We propose a “sensorimotor hypothesis” to unify these competing theories
- …