277 research outputs found

    BigEAR: Inferring the Ambient and Emotional Correlates from Smartphone-based Acoustic Big Data

    This paper presents a novel BigEAR big data framework that employs a psychological audio processing chain (PAPC) to process smartphone-based acoustic big data collected while the user holds social conversations in naturalistic scenarios. The overarching goal of BigEAR is to identify the moods of the wearer from various activities such as laughing, singing, crying, arguing, and sighing. These annotations are based on ground truth relevant to psychologists who intend to monitor or infer the social context of individuals coping with breast cancer. We pursued a case study on couples coping with breast cancer to understand how their conversations affect emotional and social well-being. In state-of-the-art methods, psychologists and their teams have to listen to the audio recordings and make these inferences by subjective evaluation, which is not only time-consuming and costly but also demands manual coding of thousands of audio files. The BigEAR framework automates the audio analysis. We computed the accuracy of BigEAR with respect to ground truth obtained from a human rater. Our approach yielded an overall average accuracy of 88.76% on real-world data from couples coping with breast cancer.
Comment: 6 pages, 10 equations, 1 table, 5 figures; IEEE International Workshop on Big Data Analytics for Smart and Connected Health 2016, June 27, 2016, Washington DC, US

    Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications

    In an era when the market segment of the Internet of Things (IoT) tops the charts in various business reports, it is clearly envisioned that the field of medicine stands to gain a large benefit from the explosion of wearables and internet-connected sensors that surround us, acquiring and communicating unprecedented data on symptoms, medication, food intake, and daily-life activities impacting one's health and wellness. However, IoT-driven healthcare has to overcome many barriers, such as: 1) there is an increasing demand for data storage on cloud servers, where the analysis of medical big data becomes increasingly complex; 2) the data, when communicated, are vulnerable to security and privacy issues; 3) the communication of the continuously collected data is not only costly but also energy-hungry; 4) operating and maintaining the sensors directly from the cloud servers are non-trivial tasks. This book chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog Computing is a service-oriented intermediate layer in IoT, providing the interfaces between the sensors and cloud servers to facilitate connectivity, data transfer, and a queryable local database. The centerpiece of Fog Computing is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors and offers an efficient means to serve telehealth interventions. We implemented and tested a fog computing system using the Intel Edison and Raspberry Pi that allows acquisition, computing, storage, and communication of various medical data, such as pathological speech data of individuals with speech disorders, phonocardiogram (PCG) signals for heart rate estimation, and electrocardiogram (ECG)-based Q, R, S detection.
Comment: 29 pages, 30 figures, 5 tables.
Keywords: Big Data, Body Area Network, Body Sensor Network, Edge Computing, Fog Computing, Medical Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment, Wearable Devices. Chapter in Handbook of Large-Scale Distributed Computing in Smart Healthcare (2017), Springer
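As a concrete illustration of the kind of edge analytics such a fog node performs, the sketch below runs a naive R-peak detector on a synthetic ECG-like signal and reduces it to a heart-rate summary, so only that summary would need to leave the node. This is a minimal NumPy sketch; the threshold-based detector and the synthetic Gaussian "R waves" are illustrative assumptions, not the chapter's implementation.

```python
import numpy as np

def detect_r_peaks(ecg, fs, threshold_ratio=0.6, refractory_s=0.25):
    """Naive R-peak detector: threshold the signal, keep local maxima
    separated by a refractory period."""
    threshold = threshold_ratio * np.max(ecg)
    refractory = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(ecg) - 1):
        if ecg[i] >= threshold and ecg[i] >= ecg[i - 1] and ecg[i] > ecg[i + 1]:
            if not peaks or i - peaks[-1] > refractory:
                peaks.append(i)
    return peaks

def heart_rate_bpm(peaks, fs):
    """Mean heart rate from consecutive R-R intervals."""
    rr = np.diff(peaks) / fs            # R-R intervals in seconds
    return 60.0 / np.mean(rr)

# Synthetic ECG-like signal: a 1 Hz train of sharp peaks (60 bpm) at fs = 250 Hz.
fs = 250
t = np.arange(0, 10, 1 / fs)
ecg = np.exp(-((t % 1.0 - 0.5) ** 2) / (2 * 0.005 ** 2))  # Gaussian "R waves"

peaks = detect_r_peaks(ecg, fs)
print(round(heart_rate_bpm(peaks, fs)))  # prints 60
```

In a fog deployment the raw 250 Hz waveform stays on the node; only the derived heart-rate value (and perhaps the R-peak timestamps) would be forwarded to the cloud, which is what makes the communication both cheaper and less energy-hungry.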

    Combining degradations: The effect of background noise on intelligibility of disordered speech

    The effect of background noise on the intelligibility of disordered speech was assessed. Speech-shaped noise was mixed with neurologically healthy (control) and disordered (dysarthric) speech at a series of signal-to-noise ratios. In addition, bandpass-filtered control and dysarthric speech conditions were assessed to determine the effect of noise on both naturally and artificially degraded speech. While significant effects of both the amount of noise and the type of speech were revealed, no interaction between the two factors was observed in either the broadband or filtered testing conditions. Thus, it appears that there is no multiplicative effect of background noise on the intelligibility of disordered speech relative to control speech. That is, the decrease in intelligibility due to increasing levels of noise is similar for both types of speech and both types of testing conditions, and the function for dysarthric speech is simply shifted downward due to the inherent source degradations of the speech itself. Lastly, large-scale online crowdsourcing via Amazon Mechanical Turk was used to collect data for the current study. Findings and implications for these data and this data collection approach are discussed.
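Mixing noise with speech at a prescribed signal-to-noise ratio, as in the study above, can be sketched as follows. This is a minimal NumPy sketch using the usual power-ratio definition of SNR; the sinusoidal "speech" stand-in and the 16 kHz sampling rate are illustrative assumptions.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio of the mixture
    equals `snr_db`, then add it to the speech."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Required noise power for the target SNR: P_s / P_n' = 10^(SNR/10)
    target_p_noise = p_speech / (10 ** (snr_db / 10))
    noise_scaled = noise * np.sqrt(target_p_noise / p_noise)
    return speech + noise_scaled

rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)  # stand-in "speech"
noise = rng.standard_normal(16000)                           # stand-in noise

mixed = mix_at_snr(speech, noise, snr_db=0)
# At 0 dB SNR, the scaled noise power equals the speech power.
```

Repeating this over a grid of `snr_db` values yields the series of listening conditions the study describes; real speech-shaped noise would be obtained by filtering white noise to match the long-term average speech spectrum.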

    The Consequences of Oromandibular Dystonia on Communicative Participation: A Qualitative Study of the Insider's Experiences

    The purpose of this study was to explore the consequences of oromandibular dystonia (OMD) on communicative participation from the insider's perspective. Qualitative research methods were used to obtain a self-reported account of the experience of living with OMD. Eight individuals with OMD and dysarthria participated in face-to-face phenomenological interviews. Interviews were transcribed from audio recordings and coded using coding software. The codes were then grouped into larger thematic categories based on salience. Results showed that communicative participation is affected by multiple physical, social, and emotional factors caused by OMD. Furthermore, OMD can have significant effects on an individual's job, family, and social life. Lastly, the strategies and coping mechanisms used by participants were explored. This study adds to the very sparse literature on OMD and helps reveal the complexity of living with this disorder.

    The role of the physical environment in conversations between people who are communication vulnerable and health-care professionals: a scoping review.

    PURPOSE: The role of the physical environment in communication between health-care professionals and persons with communication problems is a neglected area. This study provides an overview of factors in the physical environment that play a role in communication during conversations between people who are communication vulnerable and health-care professionals. METHOD: A scoping review was conducted using the methodological framework of Arksey and O'Malley. The PubMed, PsycINFO, CINAHL and Cochrane Library databases were screened, and a descriptive and thematic analysis was completed. RESULTS: Sixteen publications were included. Six factors in the physical environment play a role in conversations between people who are communication vulnerable and health-care professionals: (1) lighting, (2) acoustic environment, (3) humidity and temperature, (4) setting and furniture placement, (5) written information, and (6) availability of augmentative and alternative communication (AAC) tools. These factors indicated barriers and strategies related to the quality of these conversations. CONCLUSIONS: Relatively small and simple strategies to adjust the physical environment (such as adequate lighting, a quiet environment, or providing pen and paper) can support people who are communication vulnerable in becoming more involved in conversations. It is recommended that health-care professionals have an overall awareness of the potential influence of environmental elements on conversations.
    IMPLICATIONS FOR REHABILITATION: The physical environment is an important feature in the success or disturbance of communication. Small adjustments to the physical environment in rehabilitation can contribute to a communication-friendly environment for conversations with people who are communication vulnerable. Professionals should consider adjustments with regard to the following factors in the physical environment during such conversations: lighting, acoustic environment, humidity and temperature, setting and furniture placement, written information, and availability of augmentative and alternative communication (AAC) tools.

    Modeling Sub-Band Information Through Discrete Wavelet Transform to Improve Intelligibility Assessment of Dysarthric Speech

    The speech signal within a sub-band varies at a fine level depending on the type and level of dysarthria. The Mel-frequency filterbank used in the computation of cepstral coefficients smooths out this fine-level information in the higher frequency regions due to the larger bandwidth of its filters. To capture the sub-band information, in this paper a four-level discrete wavelet transform (DWT) decomposition is first performed to decompose the input speech signal into approximation and detail coefficients at each level. For a given input speech signal, five speech signals representing different sub-bands are then reconstructed using the inverse DWT (IDWT). The log filterbank energies are computed by analyzing the short-term discrete Fourier transform magnitude spectra of each reconstructed signal using a 30-channel Mel-filterbank. For each analysis frame, the log filterbank energies obtained across all reconstructed speech signals are pooled together, and a discrete cosine transform is performed to produce the cepstral feature, here termed the discrete wavelet transform reconstructed (DWTR) Mel-frequency cepstral coefficient (MFCC). The i-vector based dysarthric level assessment system developed on the Universal Access speech corpus shows that the proposed DWTR-MFCC feature outperforms the conventional MFCC and several other cepstral features reported for a similar task. The use of DWTR-MFCC improves the detection accuracy rate (DAR) of the dysarthric level assessment system in the text- and speaker-independent test case to 60.094% from the 56.646% MFCC baseline. Further analysis of the confusion matrices shows that the confusion among different dysarthric classes is quite different for the MFCC and DWTR-MFCC features. Motivated by this observation, a two-stage classification approach employing the discriminating power of both kinds of features is proposed to improve the overall performance of the developed dysarthric level assessment system. The two-stage classification scheme further improves the DAR to 65.813% in the text- and speaker-independent test case.
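The sub-band reconstruction step at the heart of the DWTR-MFCC feature can be sketched at reduced scale. The sketch below uses a Haar wavelet (standing in for whichever mother wavelet the authors chose) to perform a four-level decomposition and then reconstructs one signal per sub-band by zeroing every other coefficient set before inverting; in the full pipeline, 30-channel Mel log filterbank energies of each reconstructed signal would then be pooled frame by frame and a DCT applied.

```python
import numpy as np

SQ2 = np.sqrt(2.0)

def haar_dwt(x):
    """One level of the Haar DWT: approximation and detail coefficients."""
    return (x[0::2] + x[1::2]) / SQ2, (x[0::2] - x[1::2]) / SQ2

def haar_idwt(a, d):
    """Inverse of one Haar DWT level."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / SQ2
    x[1::2] = (a - d) / SQ2
    return x

def subband_signals(x, levels=4):
    """Reconstruct one signal per sub-band: keep a single coefficient set,
    zero the rest, and invert the full decomposition."""
    coeffs, a = [], x
    for _ in range(levels):
        a, d = haar_dwt(a)
        coeffs.append(d)
    coeffs.append(a)                       # [d1, d2, d3, d4, a4]
    signals = []
    for keep in range(len(coeffs)):
        kept = [c if i == keep else np.zeros_like(c)
                for i, c in enumerate(coeffs)]
        a_rec = kept[-1]
        for lvl in range(levels - 1, -1, -1):
            a_rec = haar_idwt(a_rec, kept[lvl])
        signals.append(a_rec)
    return signals

x = np.random.default_rng(1).standard_normal(1024)
bands = subband_signals(x)
# The five sub-band signals sum back to the original (the DWT is orthogonal).
print(np.allclose(sum(bands), x))  # prints True
```

Because the five reconstructions partition the wavelet coefficients, they sum exactly to the input, which is what lets the pooled per-band filterbank energies expose fine sub-band detail that a single broadband Mel analysis smooths away.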

    Maintenance of speech in Parkinson’s disease: The impact of group therapy


    A Comparison of Speech Amplification Devices for Individuals with Parkinson's Disease and Hypophonia

    One of the most prevalent speech impairments in idiopathic Parkinson's disease (PD) is hypophonia, a reduction in intensity that typically decreases intelligibility. Speech amplification devices are a potential solution; however, despite the availability of a broad range of devices, no previous study has systematically compared their efficacy in PD. This study examined the effects of speech task (Sentence Intelligibility Test versus conversation), background noise (no noise versus 65 dB SPL multi-talker noise), and selected devices (ADDvox, BoomVox, ChatterVox, Oticon, SoniVox, Spokeman, and Voicette) for 11 PD and 10 control participants, using outcome measures of speech intensity, speech-to-noise ratio, intelligibility, sound quality, and speakers' experience. There were significant differences between the outcome measures for different device types, but experience scores did not always predict effectiveness according to the device hierarchy for the outcome measures. Future research is needed to determine performance and preference measures that will predict long-term device acceptance in PD.

    Multimodal Data Fusion of Electromyography and Acoustic Signals for Thai Syllable Recognition

    Speech disorders such as dysarthria are common and frequent after a stroke. Speech rehabilitation performed by a speech-language pathologist is needed for improvement and recovery. However, in Thailand, there is a shortage of speech-language pathologists. In this paper, we present a syllable recognition system that can be deployed in a speech rehabilitation system to support the limited number of speech-language pathologists available. The proposed system is based on a multimodal fusion of the acoustic signal and surface electromyography (sEMG) collected from facial muscles. Multimodal data fusion is studied to improve signal collection under noisy conditions while reducing the number of electrodes needed. The signals are collected simultaneously while articulating 12 Thai syllables designed for rehabilitation exercises. Several features are extracted from the sEMG signals, and five channels are studied. The best combination of features and channels is chosen to be fused with the mel-frequency cepstral coefficients extracted from the acoustic signal. The feature vector from each signal source is projected by a spectral regression extreme learning machine and concatenated. Data from seven healthy subjects were collected for evaluation purposes. Results show that the multimodal fusion outperforms the use of a single signal source, achieving up to 98% accuracy. In other words, an accuracy improvement of up to 5% can be achieved when using the proposed multimodal fusion. Moreover, its low standard deviation in classification accuracy compared to those of the unimodal systems indicates improved robustness of the syllable recognition.
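The feature-level fusion described above (project each modality's features into a common space, then concatenate per frame) can be sketched as follows. This is a minimal NumPy sketch: a fixed random nonlinear projection stands in for the spectral regression extreme learning machine, and all dimensions (30 sEMG features, 13 MFCCs, a 16-dimensional embedding) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

def project(features, w):
    """Project a per-frame feature matrix into an embedding space.
    A fixed random matrix with a tanh nonlinearity stands in for the
    learned spectral-regression ELM projection used in the paper."""
    return np.tanh(features @ w)

# Hypothetical dimensions: e.g. 5 sEMG channels x 6 features, 13 MFCCs.
n_frames = 100
emg_feats = rng.standard_normal((n_frames, 30))
mfcc_feats = rng.standard_normal((n_frames, 13))

w_emg = rng.standard_normal((30, 16))
w_mfcc = rng.standard_normal((13, 16))

# Feature-level fusion: project each modality, then concatenate per frame.
fused = np.hstack([project(emg_feats, w_emg), project(mfcc_feats, w_mfcc)])
print(fused.shape)  # prints (100, 32)
```

The fused matrix would then feed a downstream syllable classifier; projecting each modality separately before concatenation lets the two sources contribute comparable, bounded feature ranges regardless of their raw scales.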