A Comprehensive Survey on Heart Sound Analysis in the Deep Learning Era
Heart sound auscultation has been demonstrated to be beneficial in clinical
use for the early screening of cardiovascular diseases. Because auscultation
requires well-trained professionals, automatic auscultation built on signal
processing and machine learning can support diagnosis and reduce the burden
of training professional clinicians. Nevertheless, classic machine learning
is limited in how far its performance can improve in the era of big data.
Deep learning has achieved better performance than classic machine learning
in many research fields, as it employs more complex model architectures with
a stronger capability to extract effective representations. Deep learning
has been successfully applied to heart sound analysis in recent years. As
most review works on heart sound analysis appeared before 2017, the present
survey is the first comprehensive overview of papers on heart sound analysis
with deep learning across the six years 2017--2022. We introduce both
classic machine learning and deep learning for comparison, and further offer
insights into the advances and future research directions of deep learning
for heart sound analysis.
An investigation of cross-cultural semi-supervised learning for continuous affect recognition
One of the keys to the success of supervised learning techniques is access to vast amounts of labelled training data. The process of data collection, however, is expensive, time-consuming, and application dependent. In the current digital era, data can be collected continuously. This continuity renders data annotation an endless task, which potentially, in problems such as emotion recognition, requires annotators with different cultural backgrounds. Herein, we study the impact of utilising data from different cultures in a semi-supervised learning approach to label training material for the automatic recognition of arousal and valence. Specifically, we compare the performance of culture-specific affect recognition models trained with manual or cross-cultural automatic annotations. The experiments performed in this work use the dataset released for the Cross-cultural Emotion Sub-challenge of the Audio/Visual Emotion Challenge (AVEC) 2019. The results obtained convey that the cultures used for training impact the system performance. Furthermore, in most of the scenarios assessed, affect recognition models trained with hybrid solutions, combining manual and automatic annotations, surpass the baseline model, which was exclusively trained with manual annotations.
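The hybrid manual-plus-automatic annotation strategy described in this abstract can be sketched as a simple pseudo-labelling loop. This is not the AVEC 2019 pipeline; it is a minimal numpy illustration in which closed-form ridge regression stands in for the affect recognition model, `X_manual`/`y_manual` for one culture's manually annotated data, and `X_other` for the other culture's unlabelled data (all names and the synthetic targets are illustrative):

```python
import numpy as np

def ridge_fit(X, y, lam=1e-2):
    # Closed-form ridge regression: w = (X^T X + lam*I)^-1 X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def hybrid_train(X_manual, y_manual, X_other):
    """Train on manually annotated data from one culture, pseudo-label the
    other culture's data automatically, then retrain on the union."""
    w0 = ridge_fit(X_manual, y_manual)       # culture-specific model
    y_pseudo = X_other @ w0                  # automatic cross-cultural labels
    X_all = np.vstack([X_manual, X_other])
    y_all = np.concatenate([y_manual, y_pseudo])
    return ridge_fit(X_all, y_all)           # hybrid-annotation model

# Toy demo with a synthetic continuous "arousal" target.
rng = np.random.default_rng(0)
X_a = rng.normal(size=(50, 4))               # culture A, manually labelled
w_true = np.array([0.5, -1.0, 0.3, 0.0])
y_a = X_a @ w_true + 0.05 * rng.normal(size=50)
X_b = rng.normal(size=(30, 4))               # culture B, unlabelled
w = hybrid_train(X_a, y_a, X_b)
```

In practice the pseudo-labelled set would come from a different culture's recordings, and the model would be far richer than a linear regressor; the sketch only shows the data flow of the hybrid annotation idea.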
A Method for Detecting Murmurous Heart Sounds based on Self-similar Properties
A heart murmur is an atypical sound produced by the flow of blood through the
heart. It can be a sign of a serious heart condition, so detecting heart
murmurs is critical for identifying and managing cardiovascular diseases.
However, current methods for identifying murmurous heart sounds do not fully
utilize the valuable insights that can be gained by exploring intrinsic
properties of heart sound signals. To address this issue, this study proposes a
new discriminatory set of multiscale features based on the self-similarity and
complexity properties of heart sounds, as derived in the wavelet domain.
Self-similarity is characterized by assessing fractal behaviors, while
complexity is explored by calculating wavelet entropy. We evaluated the
diagnostic performance of these proposed features for detecting murmurs using a
set of standard classifiers. When applied to a publicly available heart sound
dataset, our proposed wavelet-based multiscale features achieved comparable
performance to existing methods with fewer features. This suggests that
self-similarity and complexity properties in heart sounds could be potential
biomarkers for improving the accuracy of murmur detection.
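Wavelet entropy, one of the two feature families this abstract describes, can be illustrated with a hand-rolled Haar decomposition. The paper's actual wavelet choice and multiscale feature set are not specified here; this is a minimal sketch of the general quantity (Shannon entropy of the relative wavelet energy across scales):

```python
import numpy as np

def haar_dwt_details(x, levels=4):
    """Plain Haar wavelet decomposition: detail coefficients per level."""
    details = []
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        if len(a) % 2:                             # pad to even length
            a = np.append(a, a[-1])
        details.append((a[0::2] - a[1::2]) / np.sqrt(2.0))  # details
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)              # approximation
    return details

def wavelet_entropy(x, levels=4):
    """Shannon entropy of the relative wavelet energy across scales."""
    energies = np.array([np.sum(d ** 2) for d in haar_dwt_details(x, levels)])
    p = energies / energies.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# A pattern whose energy lives at a single scale vs. broadband noise.
single_scale = np.tile([1.0, 1.0, -1.0, -1.0], 64)
noise = np.random.default_rng(1).normal(size=1024)
```

A signal concentrated at one scale has zero wavelet entropy, while broadband noise spreads its energy across scales and scores higher, which is the kind of contrast such features exploit between regular heart sounds and murmurs.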
Classification of Broadcast News Audio Data Employing Binary Decision Architecture
A novel binary decision architecture (BDA) for the broadcast news audio classification task is presented in this paper. The idea of developing such an architecture came from the fact that an appropriate combination of multiple binary classifiers for the two-class discrimination problem can reduce the misclassification error without a rapid increase in computational complexity. The core element of the classification architecture is a binary decision (BD) algorithm that discriminates between each pair of acoustic classes, utilizing two types of decision functions. The first is a simple rule-based approach in which the final decision is made according to the value of a selected discrimination parameter. The main advantage of this solution is the relatively low processing time needed to classify all acoustic classes; the cost is lower classification accuracy. The second employs a support vector machine (SVM) classifier. In this case, the overall classification accuracy depends on finding the optimal parameters for the decision function, resulting in higher computational complexity and better classification performance. The final form of the proposed BDA is created by combining four BD discriminators supplemented by a decision table. The effectiveness of the proposed BDA, utilizing the rule-based approach and the SVM classifier, is compared with the two most popular strategies for multiclass classification, namely binary decision trees (BDT) and One-Against-One SVM (OAOSVM). Experimental results show that the proposed classification architecture can decrease the overall classification error in comparison with the BDT architecture. In contrast, an optimization technique for selecting the optimal set of training data is needed in order to outperform the OAOSVM.
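The pairwise combination idea behind the BDA can be sketched with rule-based binary decisions tallied through a simple decision table. The acoustic classes, the single discrimination parameter, and the thresholds below are illustrative stand-ins, not the paper's actual configuration:

```python
# Hypothetical acoustic classes; in the paper each pair gets its own
# binary decision (BD) discriminator.
CLASSES = ["speech", "music", "noise"]

def bd_rule(x, pair, threshold):
    """Rule-based binary decision: choose pair[0] if the discrimination
    parameter falls below the threshold, else pair[1]."""
    return pair[0] if x < threshold else pair[1]

def bda_classify(x, rules):
    """Combine pairwise binary decisions via a majority-vote decision table."""
    votes = {c: 0 for c in CLASSES}
    for pair, threshold in rules:
        votes[bd_rule(x, pair, threshold)] += 1
    return max(votes, key=votes.get)

# Illustrative thresholds on a single made-up discrimination parameter.
rules = [(("speech", "music"), 0.3),
         (("speech", "noise"), 0.5),
         (("music", "noise"), 0.7)]

label = bda_classify(0.2, rules)
```

The rule-based path is cheap, as the abstract notes; swapping `bd_rule` for a trained SVM per pair gives the higher-accuracy, higher-cost variant.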
Multiple Instance Learning: A Survey of Problem Characteristics and Applications
Multiple instance learning (MIL) is a form of weakly supervised learning
where training instances are arranged in sets, called bags, and a label is
provided for the entire bag. This formulation is gaining interest because it
naturally fits various problems and makes it possible to leverage weakly
labeled data.
Consequently, it has been used in diverse application fields such as computer
vision and document classification. However, learning from bags raises
important challenges that are unique to MIL. This paper provides a
comprehensive survey of the characteristics which define and differentiate the
types of MIL problems. Until now, these problem characteristics have not been
formally identified and described. As a result, the variations in performance
of MIL algorithms from one data set to another are difficult to explain. In
this paper, MIL problem characteristics are grouped into four broad categories:
the composition of the bags, the types of data distribution, the ambiguity of
instance labels, and the task to be performed. Methods specialized to address
each category are reviewed. Then, the extent to which these characteristics
manifest themselves in key MIL application areas is described. Finally,
experiments are conducted to compare the performance of 16 state-of-the-art MIL
methods on selected problem characteristics. This paper provides insight into
how the problem characteristics affect MIL algorithms, along with
recommendations for future benchmarking and promising avenues for research.
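The bag formulation described in this survey is easy to make concrete. Under the standard MIL assumption, a bag is positive if and only if it contains at least one positive instance, which corresponds to max-pooling over instance scores; the linear instance scorer and the data below are illustrative stand-ins:

```python
import numpy as np

def instance_scores(bag, w):
    """Score each instance in a bag with a linear model (a stand-in for
    any instance-level scorer)."""
    return bag @ w

def bag_label(bag, w, threshold=0.0):
    """Standard MIL assumption: a bag is positive iff its best-scoring
    instance exceeds the threshold (max-pooling over instances)."""
    return int(instance_scores(bag, w).max() > threshold)

# Bags are sets of instances; only the bag-level label is observed.
w = np.array([1.0, -1.0])
pos_bag = np.array([[0.1, 0.2],    # negative instance
                    [2.0, 0.1]])   # one positive instance -> bag positive
neg_bag = np.array([[0.1, 0.2],
                    [0.0, 0.5]])   # all instances negative -> bag negative
```

Many of the survey's problem characteristics (bag composition, instance-label ambiguity, the task performed) amount to relaxing or replacing this max-pooling assumption.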
Machine Learning for Auditory Hierarchy
Coleman, W. (2021). Machine Learning for Auditory Hierarchy. This dissertation is submitted for the degree of Doctor of Philosophy, Technological University Dublin. Audio content is predominantly delivered in a stereo audio file of a static, pre-formed mix. The content creator makes volume, position and effects decisions, generally for presentation on stereo speakers, but ultimately has no control over how the content will be consumed. This leads to a poor listener experience when, for example, a feature film is mixed such that the dialogue is at a low level relative to the sound effects. Consumers can complain that they must turn the volume up to hear the words, but back down again because the effects levels are too loud. Addressing this problem requires a television mix optimised for the stereo speakers used in the vast majority of homes, which is not always available.
State-of-the-Art Sensors Technology in Spain 2015: Volume 1
This book provides a comprehensive overview of state-of-the-art sensors technology in specific leading areas. Industrial researchers, engineers and professionals can find information on the most advanced technologies and developments, together with data processing. Further research covers specific devices and technologies that capture and distribute data to be processed by applying dedicated techniques or procedures, which is where sensors play the most important role. The book provides insights and solutions for different problems covering a broad spectrum of possibilities, thanks to a set of applications and solutions based on sensory technologies. Topics include:
• Signal analysis for spectral power
• 3D precise measurements
• Electromagnetic propagation
• Drugs detection
• e-health environments based on social sensor networks
• Robots in wireless environments, navigation, teleoperation, object grasping, demining
• Wireless sensor networks
• Industrial IoT
• Insights in smart cities
• Voice recognition
• FPGA interfaces
• Flight mill device for measurements on insects
• Optical systems: UV, LEDs, lasers, fiber optics
• Machine vision
• Power dissipation
• Liquid level in fuel tanks
• Parabolic solar tracker
• Force sensors
• Control for a twin rotor
Natural-language video description with deep recurrent neural networks
For most people, watching a brief video and describing what happened (in words) is an easy task. For machines, extracting meaning from video pixels and generating a sentence description is a very complex problem. The goal of this thesis is to develop models that can automatically generate natural language descriptions for events in videos. It presents several approaches to automatic video description by building on recent advances in “deep” machine learning. The techniques presented in this thesis view the task of video description as akin to machine translation, treating the video domain as a source “language” and using deep neural net architectures to “translate” videos into text.
Specifically, I develop video captioning techniques using a unified deep neural network with both convolutional and recurrent structure, modeling the temporal elements in videos and language with deep recurrent neural networks. In my initial approach, I adapt a model that can learn from paired images and captions to transfer knowledge from this auxiliary task to generate descriptions for short video clips. Next, I present an end-to-end deep network that can jointly model a sequence of video frames and a sequence of words. To further improve grammaticality and descriptive quality, I also propose methods to integrate linguistic knowledge from plain text corpora. Additionally, I show that such linguistic knowledge can help describe novel objects unseen in paired image/video-caption data. Finally, moving beyond short video clips, I present methods to process longer multi-activity videos, specifically to jointly segment and describe coherent event sequences in movies.
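The “translation” view of captioning can be caricatured in a few lines: encode the video by pooling per-frame features, then decode words one at a time, conditioning on the video code and the previous word. This toy numpy sketch uses random, untrained weights and a made-up five-word vocabulary purely to show the data flow, not the thesis's actual recurrent models:

```python
import numpy as np

VOCAB = ["<bos>", "<eos>", "a", "dog", "runs"]

def describe(frames, W_dec, embed, max_len=5):
    """Toy video-to-text decoder: mean-pool per-frame (CNN-style) features
    as the video code, then greedily pick the next word from a linear
    scorer over [video code; previous word embedding]."""
    video = frames.mean(axis=0)                  # encoder: pool frame features
    word, out = 0, []                            # start from <bos>
    for _ in range(max_len):
        h = np.concatenate([video, embed[word]])  # context + last word
        word = int(np.argmax(W_dec @ h))          # greedy next-word choice
        if VOCAB[word] == "<eos>":
            break
        out.append(VOCAB[word])
    return " ".join(out)

rng = np.random.default_rng(0)
frames = rng.normal(size=(16, 8))                # 16 frames, 8-dim features
embed = rng.normal(size=(len(VOCAB), 4))         # word embeddings
W_dec = rng.normal(size=(len(VOCAB), 12))        # untrained decoder weights
caption = describe(frames, W_dec, embed)
```

Replacing the mean-pool with a recurrent encoder over frames and the linear scorer with a trained LSTM yields the end-to-end sequence-to-sequence structure the thesis describes.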