65 research outputs found

    Advanced Music Audio Feature Learning with Deep Networks

    Get PDF
    Music is a means of reflecting and expressing emotion. Personal preferences in music vary between individuals, influenced by situational and environmental factors. Inspired by attempts to develop alternative feature extraction methods for audio signals, this research analyzes the use of deep network structures for extracting features from musical audio data represented in the frequency domain. Image-based network models are designed to be robust and accurate learners of image features. As such, this research develops image-based ImageNet deep network models to learn feature data from music audio spectrograms. This research also explores the use of an audio source separation tool for preprocessing the musical audio before training the network models. The use of source separation allows the network model to learn features that highlight individual contributions to the audio track, and use those features to improve classification results. The features extracted from the data are used to highlight characteristics of the audio tracks, which are then used to train classifiers that categorize the musical data for genre and auto-tag classifications. The results obtained from each model are contrasted with state-of-the-art methods of classification and tag prediction for musical tracks. Deeper networks with input source separation are shown to yield the best results

    Intelligent Sensors for Human Motion Analysis

    Get PDF
    The book, "Intelligent Sensors for Human Motion Analysis," contains 17 articles published in the Special Issue of the Sensors journal. These articles deal with many aspects related to the analysis of human movement. New techniques and methods for pose estimation, gait recognition, and fall detection have been proposed and verified. Some of them will trigger further research, and some may become the backbone of commercial systems

    Speech Recognition

    Get PDF
    Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes
    • …
    corecore