8 research outputs found

    Voluntary cough detection by internal sound analysis

    Full text link

    Towards using Cough for Respiratory Disease Diagnosis by leveraging Artificial Intelligence: A Survey

    Full text link
    Cough acoustics contain multitudes of vital information about pathomorphological alterations in the respiratory system. Reliable and accurate detection of cough events by investigating the underlying cough latent features and disease diagnosis can play an indispensable role in revitalizing the healthcare practices. The recent application of Artificial Intelligence (AI) and advances of ubiquitous computing for respiratory disease prediction has created an auspicious trend and myriad of future possibilities in the medical domain. In particular, there is an expeditiously emerging trend of Machine learning (ML) and Deep Learning (DL)-based diagnostic algorithms exploiting cough signatures. The enormous body of literature on cough-based AI algorithms demonstrate that these models can play a significant role for detecting the onset of a specific respiratory disease. However, it is pertinent to collect the information from all relevant studies in an exhaustive manner for the medical experts and AI scientists to analyze the decisive role of AI/ML. This survey offers a comprehensive overview of the cough data-driven ML/DL detection and preliminary diagnosis frameworks, along with a detailed list of significant features. We investigate the mechanism that causes cough and the latent cough features of the respiratory modalities. We also analyze the customized cough monitoring application, and their AI-powered recognition algorithms. Challenges and prospective future research directions to develop practical, robust, and ubiquitous solutions are also discussed in detail.Comment: 30 pages, 12 figures, 9 table

    Cough Monitoring Through Audio Analysis

    Get PDF
    The detection of cough events in audio recordings requires the analysis of a significant amount of data as cough is typically monitored continuously over several hours to capture naturally occurring cough events. The recorded data is mostly composed of undesired sound events such as silence, background noise, and speech. To reduce computational costs and to address the ethical concerns raised from the collection of audio data in public environments, the data requires pre-processing prior to any further analysis. Current cough detection algorithms typically use pre-processing methods to remove undesired audio segments from the collected data but do not preserve the privacy of individuals being recorded while monitoring respiratory events. This study reveals the need for an automatic pre-processing method that removes sensitive data from the recording prior to any further analysis to ensure privacy preservation of individuals. Specific characteristics of cough sounds can be used to discard sensitive data from audio recordings at a pre-processing stage, improving privacy preservation, and decreasing ethical concerns when dealing with cough monitoring through audio analysis. We propose a pre-processing algorithm that increases privacy preservation and significantly decreases the amount of data to be analysed, by separating cough segments from other non-cough segments, including speech, in audio recordings. Our method verifies the presence of signal energy in both lower and higher frequency regions and discards segments whose energy concentrates only on one of them. The method is iteratively applied on the same data to increase the percentage of data reduction and privacy preservation. We evaluated the performance of our algorithm using several hours of audio recordings with manually pre-annotated cough and speech events. Our results showed that 5 iterations of the proposed method can discard up to 88.94% of the speech content present in the recordings, allowing for a strong privacy preservation while considerably reducing the amount of data to be further analysed by 91.79%. The data reduction and privacy preservation achievements of the proposed pre-processing algorithm offers the possibility to use larger datasets captured in public environments and would beneficiate all cough detection algorithms by preserving the privacy of subjects and by-stander conversations recorded during cough monitoring

    FSD50K: an Open Dataset of Human-Labeled Sound Events

    Full text link
    Most existing datasets for sound event recognition (SER) are relatively small and/or domain-specific, with the exception of AudioSet, based on a massive amount of audio tracks from YouTube videos and encompassing over 500 classes of everyday sounds. However, AudioSet is not an open dataset---its release consists of pre-computed audio features (instead of waveforms), which limits the adoption of some SER methods. Downloading the original audio tracks is also problematic due to constituent YouTube videos gradually disappearing and usage rights issues, which casts doubts over the suitability of this resource for systems' benchmarking. To provide an alternative benchmark dataset and thus foster SER research, we introduce FSD50K, an open dataset containing over 51k audio clips totalling over 100h of audio manually labeled using 200 classes drawn from the AudioSet Ontology. The audio clips are licensed under Creative Commons licenses, making the dataset freely distributable (including waveforms). We provide a detailed description of the FSD50K creation process, tailored to the particularities of Freesound data, including challenges encountered and solutions adopted. We include a comprehensive dataset characterization along with discussion of limitations and key factors to allow its audio-informed usage. Finally, we conduct sound event classification experiments to provide baseline systems as well as insight on the main factors to consider when splitting Freesound audio data for SER. Our goal is to develop a dataset to be widely adopted by the community as a new open benchmark for SER research

    日常生活音からのリアルタイムADL 認識方法の研究

    Get PDF
    人間の行動や心情などを基にして,状況に応じて最適な制御ができるサービスが注目されているが,そのサービスを有用なものにするには,高次情報を得るためにセンサデータから得る低次情報が重要になる.そこで本研究では,ADL(日常生活行動)や心情などの把握を目的として,生活音や非言語音を話声や雑音と識別しながらリアルタイム認識ができるシステムの開発を行った.多種類の非言語音および生活音を対象としてリアルタイム認識を行った先行研究において,使われた認識手法によって,本研究で認識したい音声に対しても使えるかどうかについて検証した.その結果,「話声と非言語音が共存していないこと」や「雑音入力による誤検出対策が行われていない」という課題があり,さらにその手法が話声や非言語音の認識に向いていないという仮説を得た.そこで,「話声や雑音と識別するための手法」や「非言語音認識に適した状態定義手法」に関する既存研究について調査し,リアルタイム認識時の要件についても考慮した上で認識手法を提案した.さらに,提案手法に合った音声認識エンジンを用いてリアルタイム認識の実装を行うことにした.提案手法の認識精度を検証することを目的に,様々な話者や環境下での音声を使って3 種類の評価を行った.その結果,疑似音素列定義による非言語音同士での分類はそれなりの結果となったものの,連続音声からのリアルタイム認識を想定した処理を含めた場合,「非言語音の検出率」や「雑音入力による非言語音の誤検出」に関して課題が残った.その一方で話声による生活音および非言語音の誤検出は抑えることができたことに加えて,生活音については1 種類を除いて比較的正確な認識ができていた.また発話中に笑った場合でも,リアルタイム認識時と同様の設定で約65%の割合で笑いを検出することができたため,連続音声からのリアルタイム笑い声検出には本手法が有効になると考えた.電気通信大学201

    Audio and contact microphones for cough detection

    No full text
    In the framework of assessing the pathology severity in chronic cough diseases, medical literature underlines the lack of tools for allowing the automatic, objective and reliable detection of cough events. This paper describes a system based on two microphones which we developed for this purpose. The proposed approach relies on a large variety of audio descriptors, an efficient algorithm of feature selection based on their mutual information and the use of artificial neural networks. First, the possible use of a contact microphone (placed on the patient's thorax or trachea) in complement to the audio signal is investigated. This study underlines that this contact microphone suffers from reliability issues, and conveys little new relevant information compared to the audio modality. Secondly, the proposed audioonly approach is compared to a commercially available system using four sensors on a database with different sound categories often misdetected as coughs, and produced in various conditions. With average sensitivity and specificity of 94.7% and 95% respectively, the proposed method achieves better cough detection performance than the commercial system
    corecore