36 research outputs found

    A Comprehensive Survey on Heart Sound Analysis in the Deep Learning Era

    Full text link
    Heart sound auscultation has been demonstrated to be beneficial in clinical usage for early screening of cardiovascular diseases. Due to the high requirement of well-trained professionals for auscultation, automatic auscultation benefiting from signal processing and machine learning can help auxiliary diagnosis and reduce the burdens of training professional clinicians. Nevertheless, classic machine learning is limited to performance improvement in the era of big data. Deep learning has achieved better performance than classic machine learning in many research fields, as it employs more complex model architectures with stronger capability of extracting effective representations. Deep learning has been successfully applied to heart sound analysis in the past years. As most review works about heart sound analysis were given before 2017, the present survey is the first to work on a comprehensive overview to summarise papers on heart sound analysis with deep learning in the past six years 2017--2022. We introduce both classic machine learning and deep learning for comparison, and further offer insights about the advances and future research directions in deep learning for heart sound analysis

    An investigation of cross-cultural semi-supervised learning for continuous affect recognition

    Get PDF
    One of the keys for supervised learning techniques to succeed resides in the access to vast amounts of labelled training data. The process of data collection, however, is expensive, time- consuming, and application dependent. In the current digital era, data can be collected continuously. This continuity renders data annotation into an endless task, which potentially, in problems such as emotion recognition, requires annotators with different cultural backgrounds. Herein, we study the impact of utilising data from different cultures in a semi-supervised learning ap- proach to label training material for the automatic recognition of arousal and valence. Specifically, we compare the performance of culture-specific affect recognition models trained with man- ual or cross-cultural automatic annotations. The experiments performed in this work use the dataset released for the Cross- cultural Emotion Sub-challenge of the Audio/Visual Emotion Challenge (AVEC) 2019. The results obtained convey that the cultures used for training impact on the system performance. Furthermore, in most of the scenarios assessed, affect recogni- tion models trained with hybrid solutions, combining manual and automatic annotations, surpass the baseline model, which was exclusively trained with manual annotations

    A Method for Detecting Murmurous Heart Sounds based on Self-similar Properties

    Full text link
    A heart murmur is an atypical sound produced by the flow of blood through the heart. It can be a sign of a serious heart condition, so detecting heart murmurs is critical for identifying and managing cardiovascular diseases. However, current methods for identifying murmurous heart sounds do not fully utilize the valuable insights that can be gained by exploring intrinsic properties of heart sound signals. To address this issue, this study proposes a new discriminatory set of multiscale features based on the self-similarity and complexity properties of heart sounds, as derived in the wavelet domain. Self-similarity is characterized by assessing fractal behaviors, while complexity is explored by calculating wavelet entropy. We evaluated the diagnostic performance of these proposed features for detecting murmurs using a set of standard classifiers. When applied to a publicly available heart sound dataset, our proposed wavelet-based multiscale features achieved comparable performance to existing methods with fewer features. This suggests that self-similarity and complexity properties in heart sounds could be potential biomarkers for improving the accuracy of murmur detection

    Classification of Broadcast News Audio Data Employing Binary Decision Architecture

    Get PDF
    A novel binary decision architecture (BDA) for broadcast news audio classification task is presented in this paper. The idea of developing such architecture came from the fact that the appropriate combination of multiple binary classifiers for two-class discrimination problem can reduce a miss-classification error without rapid increase in computational complexity. The core element of classification architecture is represented by a binary decision (BD) algorithm that performs discrimination between each pair of acoustic classes, utilizing two types of decision functions. The first one is represented by a simple rule-based approach in which the final decision is made according to the value of selected discrimination parameter. The main advantage of this solution is relatively low processing time needed for classification of all acoustic classes. The cost for that is low classification accuracy. The second one employs support vector machine (SVM) classifier. In this case, the overall classification accuracy is conditioned by finding the optimal parameters for decision function resulting in higher computational complexity and better classification performance. The final form of proposed BDA is created by combining four BD discriminators supplemented by decision table. The effectiveness of proposed BDA, utilizing rule-based approach and the SVM classifier, is compared with two most popular strategies for multiclass classification, namely the binary decision trees (BDT) and the One-Against-One SVM (OAOSVM). Experimental results show that the proposed classification architecture can decrease the overall classification error in comparison with the BDT architecture. On the contrary, an optimization technique for selecting the optimal set of training data is needed in order to overcome the OAOSVM

    Multiple Instance Learning: A Survey of Problem Characteristics and Applications

    Full text link
    Multiple instance learning (MIL) is a form of weakly supervised learning where training instances are arranged in sets, called bags, and a label is provided for the entire bag. This formulation is gaining interest because it naturally fits various problems and allows to leverage weakly labeled data. Consequently, it has been used in diverse application fields such as computer vision and document classification. However, learning from bags raises important challenges that are unique to MIL. This paper provides a comprehensive survey of the characteristics which define and differentiate the types of MIL problems. Until now, these problem characteristics have not been formally identified and described. As a result, the variations in performance of MIL algorithms from one data set to another are difficult to explain. In this paper, MIL problem characteristics are grouped into four broad categories: the composition of the bags, the types of data distribution, the ambiguity of instance labels, and the task to be performed. Methods specialized to address each category are reviewed. Then, the extent to which these characteristics manifest themselves in key MIL application areas are described. Finally, experiments are conducted to compare the performance of 16 state-of-the-art MIL methods on selected problem characteristics. This paper provides insight on how the problem characteristics affect MIL algorithms, recommendations for future benchmarking and promising avenues for research

    Machine Learning for Auditory Hierarchy

    Get PDF
    Coleman, W. (2021). Machine Learning for Auditory Hierarchy. This dissertation is submitted for the degree of Doctor of Philosophy, Technological University Dublin. Audio content is predominantly delivered in a stereo audio file of a static, pre-formed mix. The content creator makes volume, position and effects decisions, generally for presentation in stereo speakers, but has no control ultimately over how the content will be consumed. This leads to poor listener experience when, for example, a feature film is mixed such that the dialogue is at a low level relative to the sound effects. Consumers can complain that they must turn the volume up to hear the words, but back down again because the effects levels are too loud. Addressing this problem requires a television mix optimised for the stereo speakers used in the vast majority of homes, which is not always available

    Sound design: an artificial intelligence approach

    Get PDF

    Predicting the emotions expressed in music

    Get PDF

    State-of-the-Art Sensors Technology in Spain 2015: Volume 1

    Get PDF
    This book provides a comprehensive overview of state-of-the-art sensors technology in specific leading areas. Industrial researchers, engineers and professionals can find information on the most advanced technologies and developments, together with data processing. Further research covers specific devices and technologies that capture and distribute data to be processed by applying dedicated techniques or procedures, which is where sensors play the most important role. The book provides insights and solutions for different problems covering a broad spectrum of possibilities, thanks to a set of applications and solutions based on sensory technologies. Topics include: • Signal analysis for spectral power • 3D precise measurements • Electromagnetic propagation • Drugs detection • e-health environments based on social sensor networks • Robots in wireless environments, navigation, teleoperation, object grasping, demining • Wireless sensor networks • Industrial IoT • Insights in smart cities • Voice recognition • FPGA interfaces • Flight mill device for measurements on insects • Optical systems: UV, LEDs, lasers, fiber optics • Machine vision • Power dissipation • Liquid level in fuel tanks • Parabolic solar tracker • Force sensors • Control for a twin roto
    corecore