1,140 research outputs found

    Transportation mode recognition fusing wearable motion, sound and vision sensors

    We present the first work that investigates the potential of improving the performance of transportation mode recognition through fusing multimodal data from wearable sensors: motion, sound and vision. We first train three independent deep neural network (DNN) classifiers, which work with the three types of sensors, respectively. We then propose two schemes that fuse the classification results from the three mono-modal classifiers. The first scheme makes an ensemble decision with fixed rules including Sum, Product, Majority Voting, and Borda Count. The second scheme is an adaptive fuser built as another classifier (including Naive Bayes, Decision Tree, Random Forest and Neural Network) that learns enhanced predictions by combining the outputs from the three mono-modal classifiers. We verify the advantage of the proposed method with the state-of-the-art Sussex-Huawei Locomotion and Transportation (SHL) dataset, recognizing the eight transportation activities: Still, Walk, Run, Bike, Bus, Car, Train and Subway. We achieve F1 scores of 79.4%, 82.1% and 72.8% with the mono-modal motion, sound and vision classifiers, respectively. The F1 score is remarkably improved to 94.5% and 95.5% by the two data fusion schemes, respectively. The recognition performance can be further improved with a post-processing scheme that exploits the temporal continuity of transportation. When assessing generalization of the model to unseen data, we show that while performance is reduced, as expected, for each individual classifier, the benefits of fusion are retained with performance improved by 15 percentage points. Besides the actual performance increase, this work, most importantly, opens up the possibility of dynamically fusing modalities to achieve distinct power-performance trade-offs at run time.
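
    The four fixed fusion rules named in this abstract (Sum, Product, Majority Voting, Borda Count) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tie-breaking behaviour and the probability vectors in PROBS are assumptions made up for illustration, not values from the SHL dataset.

```python
from collections import Counter

# The eight classes listed in the abstract.
CLASSES = ["Still", "Walk", "Run", "Bike", "Bus", "Car", "Train", "Subway"]

# Hypothetical per-class probabilities from the motion, sound and vision
# classifiers for one time window (illustrative numbers only).
PROBS = [
    [0.05, 0.05, 0.05, 0.05, 0.40, 0.30, 0.05, 0.05],  # motion -> Bus
    [0.05, 0.05, 0.05, 0.05, 0.25, 0.45, 0.05, 0.05],  # sound  -> Car
    [0.05, 0.05, 0.05, 0.05, 0.20, 0.50, 0.05, 0.05],  # vision -> Car
]


def argmax(scores):
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def sum_rule(probs):
    """Add the class probabilities across classifiers, pick the largest."""
    return argmax([sum(col) for col in zip(*probs)])


def product_rule(probs):
    """Multiply the class probabilities across classifiers, pick the largest."""
    fused = []
    for col in zip(*probs):
        p = 1.0
        for value in col:
            p *= value
        fused.append(p)
    return argmax(fused)


def majority_vote(probs):
    """Each classifier votes for its top class; the most voted class wins."""
    votes = Counter(argmax(p) for p in probs)
    return votes.most_common(1)[0][0]


def borda_count(probs):
    """Each classifier ranks all classes; a class earns one point per class
    ranked below it. Equal probabilities are ordered by class index here,
    which is an arbitrary tie-breaking choice."""
    n = len(probs[0])
    points = [0] * n
    for p in probs:
        ranking = sorted(range(n), key=lambda i: p[i])  # worst class first
        for rank, cls in enumerate(ranking):
            points[cls] += rank
    return argmax(points)


if __name__ == "__main__":
    for rule in (sum_rule, product_rule, majority_vote, borda_count):
        print(rule.__name__, "->", CLASSES[rule(PROBS)])
```

    In this toy window the motion classifier alone would predict Bus, while every fusion rule follows the agreement between sound and vision. The adaptive scheme described in the abstract would instead train a second classifier on the concatenated outputs, which this sketch does not cover.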

    Magnetic and radar sensing for multimodal remote health monitoring

    With increased life expectancy and a rise in health conditions related to aging, there is a need for new technologies that can routinely monitor vulnerable people and identify their daily pattern of activities and any anomalies or critical events such as falls. This paper aims to evaluate magnetic and radar sensors as suitable technologies for remote health monitoring, both individually and when fusing their information. After running experiments and collecting data from 20 volunteers, numerical features were extracted in both the time and frequency domains. To analyse and validate the fusion method across different classifiers, a Support Vector Machine with a quadratic kernel and an Artificial Neural Network with one and multiple hidden layers have been implemented. Furthermore, for both classifiers, feature selection has been performed to obtain salient features. Using this technique along with fusion, both classifiers can detect 10 different activities with an accuracy rate of approximately 96%. In cases where the user is unknown to the classifier, an accuracy of approximately 92% is maintained.

    Integrating Neuromuscular and Touchscreen Input for Machine Control

    Current touchscreen interfaces are unable to distinguish between individual fingers or to determine poses associated with the user’s hand. This limits the use of touchscreens in recognizing user input. As discussed herein, a statistical model can be trained using training data that includes sensor readings known to be associated with various hand poses and gestures. The trained statistical model can be configured to determine arm, hand, and/or finger configurations and forces (e.g., handstates) based on sensor readings, e.g., obtained via a wearable device such as a wristband with wearable sensors. The statistical model can identify the input from the handstate detected by the wearable device. For example, the handstates can include identification of a portion of the hand that is interacting with the touchscreen, a user’s finger position relative to the touchscreen, an identification of which finger or fingers of the user’s hand are interacting with the touchscreen, etc. The handstates can be used to control any aspect(s) of the touchscreen or a connected device indirectly through the touchscreen.

    Radar and RGB-depth sensors for fall detection: a review

    This paper reviews recent works in the literature on the use of systems based on radar and RGB-Depth (RGB-D) sensors for fall detection, and discusses outstanding research challenges and trends related to this research field. Systems that reliably detect fall events and promptly alert carers and first responders have gained significant interest in the past few years in order to address the societal issue of an increasing number of elderly people living alone, with the associated risk of them falling and the consequences in terms of health treatments, reduced well-being, and costs. The interest in radar and RGB-D sensors is related to their capability to enable contactless and non-intrusive monitoring, which is an advantage for practical deployment and users’ acceptance and compliance, compared with other sensor technologies, such as video-cameras or wearables. Furthermore, the possibility of combining and fusing information from heterogeneous types of sensors is expected to improve the overall performance of practical fall detection systems. Researchers from different fields can benefit from multidisciplinary knowledge and awareness of the latest developments in radar and RGB-D sensors that this paper discusses.

    A Survey of Multimodal Information Fusion for Smart Healthcare: Mapping the Journey from Data to Wisdom

    Multimodal medical data fusion has emerged as a transformative approach in smart healthcare, enabling a comprehensive understanding of patient health and personalized treatment plans. In this paper, a journey from data to information to knowledge to wisdom (DIKW) is explored through multimodal fusion for smart healthcare. We present a comprehensive review of multimodal medical data fusion focused on the integration of various data modalities. The review explores different approaches such as feature selection, rule-based systems, machine learning, deep learning, and natural language processing, for fusing and analyzing multimodal data. This paper also highlights the challenges associated with multimodal fusion in healthcare. By synthesizing the reviewed frameworks and theories, it proposes a generic framework for multimodal medical data fusion that aligns with the DIKW model. Moreover, it discusses future directions related to the four pillars of healthcare: Predictive, Preventive, Personalized, and Participatory approaches. The components of the comprehensive survey presented in this paper form the foundation for more successful implementation of multimodal fusion in smart healthcare. Our findings can guide researchers and practitioners in leveraging the power of multimodal fusion with the state-of-the-art approaches to revolutionize healthcare and improve patient outcomes. Comment: This work has been submitted to the ELSEVIER for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

    SFusion: Self-attention based N-to-One Multimodal Fusion Block

    People perceive the world with different senses, such as sight, hearing, smell, and touch. Processing and fusing information from multiple modalities enables Artificial Intelligence to understand the world around us more easily. However, when some modalities are missing, the number of available modalities differs from one situation to another, which leads to an N-to-One fusion problem. To solve this problem, we propose a self-attention based fusion block called SFusion. Different from preset formulations or convolution based methods, the proposed block automatically learns to fuse available modalities without synthesizing or zero-padding missing ones. Specifically, the feature representations extracted from the upstream processing model are projected as tokens and fed into a self-attention module to generate latent multimodal correlations. Then, a modal attention mechanism is introduced to build a shared representation, which can be applied by the downstream decision model. The proposed SFusion can be easily integrated into existing multimodal analysis networks. In this work, we apply SFusion to different backbone networks for human activity recognition and brain tumor segmentation tasks. Extensive experimental results show that the SFusion block achieves better performance than the competing fusion strategies. Our code is available at https://github.com/scut-cszcl/SFusion. Comment: This paper has been accepted by MICCAI 202
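
    The N-to-One idea in this abstract, fusing however many modality tokens happen to be available into one fixed-size representation, can be illustrated with a toy sketch. It is not the SFusion code from the linked repository: learned query/key/value projections are omitted, the per-feature softmax used as "modal attention" is an assumption, and all names and numbers are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over the available tokens.
    Assumption: identity projections, so queries, keys and values are the
    tokens themselves (a trained block would learn these projections)."""
    d = len(tokens[0])
    attended = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)]
        )
    return attended


def fuse(tokens):
    """Fuse N >= 1 modality tokens of dimension d into one d-dim vector.
    Assumption: for each feature, a softmax across modalities weights the
    attended values. Missing modalities are simply left out of the list."""
    attended = self_attention(tokens)
    d = len(attended[0])
    fused = []
    for j in range(d):
        column = [t[j] for t in attended]
        weights = softmax(column)
        fused.append(sum(w * x for w, x in zip(weights, column)))
    return fused


if __name__ == "__main__":
    motion = [0.2, 1.0, -0.5, 0.3]   # hypothetical feature tokens
    sound = [0.9, -0.1, 0.4, 0.0]
    vision = [-0.3, 0.6, 0.8, 1.2]
    print(fuse([motion, sound, vision]))  # all three modalities present
    print(fuse([motion, vision]))         # sound missing, same output size
```

    Because both attention steps are defined over whatever list of tokens is passed in, the output size depends only on the token dimension, which is the property the abstract describes as fusing without synthesizing or zero-padding missing modalities.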

    Unstructured Handwashing Recognition using Smartwatch to Reduce Contact Transmission of Pathogens

    Current guidelines from the World Health Organization indicate that the SARS-CoV-2 coronavirus, which results in the novel coronavirus disease (COVID-19), is transmitted through respiratory droplets or by contact. Contact transmission occurs when contaminated hands touch the mucous membrane of the mouth, nose, or eyes, so hand hygiene is extremely important to prevent the spread of SARS-CoV-2 as well as of other pathogens. The vast proliferation of wearable devices, such as smartwatches, containing acceleration, rotation, and magnetic field sensors, together with modern artificial intelligence technologies such as machine learning and, more recently, deep learning, allows the development of accurate applications for recognition and classification of human activities such as walking, climbing stairs, running, clapping, sitting, and sleeping. In this work, we evaluate the feasibility of a machine learning based system which, starting from inertial signals collected from wearable devices such as current smartwatches, recognizes when a subject is washing or rubbing their hands. Preliminary results, obtained over two different datasets, show classification accuracies of about 95% and 94% for deep and standard learning techniques, respectively.