
    Towards a Practical Pedestrian Distraction Detection Framework using Wearables

    Pedestrian safety continues to be a significant concern in urban communities, and pedestrian distraction is emerging as one of the main causes of serious and fatal accidents involving pedestrians. The advent of sophisticated mobile and wearable devices, equipped with high-precision on-board sensors capable of measuring fine-grained user movements and context, provides a tremendous opportunity for designing effective pedestrian safety systems and applications. However, accurate and efficient recognition of pedestrian distractions in real time, given the memory, computation, and communication limitations of these devices, remains the key technical challenge in the design of such systems. Earlier research efforts in pedestrian distraction detection using data available from mobile and wearable devices have primarily focused on achieving high detection accuracy, resulting in designs that are either resource intensive and unsuitable for implementation on mainstream mobile devices, computationally slow and therefore unsuitable for real-time pedestrian safety applications, or dependent on specialized hardware and hence less likely to be adopted by most users. In the quest for a pedestrian safety system that achieves a favorable balance between computational efficiency, detection accuracy, and energy consumption, this paper makes the following main contributions: (i) the design of a novel complex activity recognition framework which employs motion data available from users' mobile and wearable devices and a lightweight frequency matching approach to accurately and efficiently recognize complex distraction-related activities, and (ii) a comprehensive comparative evaluation of the proposed framework against well-known complex activity recognition techniques from the literature, using data collected from human subject pedestrians and prototype implementations on commercially available mobile and wearable devices.
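
    The abstract does not give implementation details of the frequency matching step; the sketch below is only a minimal illustration of what a lightweight frequency-matching recognizer over accelerometer windows might look like. The window length, sampling rate, template values, and activity labels are all assumptions, not values from the paper.

        import numpy as np

        def dominant_frequencies(window, fs, k=3):
            """Return the k dominant frequency components (Hz) of a 1-D accelerometer window."""
            spectrum = np.abs(np.fft.rfft(window - window.mean()))
            freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
            top = np.argsort(spectrum)[-k:]          # indices of the k strongest components
            return np.sort(freqs[top])

        def match_activity(window, templates, fs):
            """Match a sensor window to the closest stored frequency template."""
            sig = dominant_frequencies(window, fs)
            return min(templates, key=lambda label: np.linalg.norm(sig - templates[label]))

        # Hypothetical templates: dominant frequencies (Hz) per distraction-related activity.
        templates = {
            "walking_texting": np.array([0.5, 1.8, 3.6]),
            "walking_calling": np.array([0.4, 1.9, 3.8]),
        }
        fs = 50                                       # assumed sampling rate (Hz)
        window = np.random.randn(fs * 2)              # 2-second window of one accelerometer axis
        print(match_activity(window, templates, fs))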

    Resource consumption analysis of online activity recognition on mobile phones and smartwatches

    Most studies on human activity recognition using smartphones and smartwatches are performed in an offline manner. In such studies, the collected data is analyzed in machine learning tools, with little focus on the resource consumption of these devices when running an activity recognition system. In this paper, we analyze the resource consumption of human activity recognition on both smartphones and smartwatches, considering six different classifiers, three different sensors, and different sampling rates and window sizes. We study CPU, memory, and battery usage under different parameters, where the smartphone is used to recognize seven physical activities and the smartwatch is used to recognize smoking activity. As a result of this analysis, we report that the classification function accounts for only a very small share of the app's total CPU time, while sensing and feature calculation consume most of it. When an additional sensor, such as a gyroscope, is used besides the accelerometer, CPU usage increases significantly. The analysis results also show that increasing the window size reduces resource consumption more than reducing the sampling rate does. As a final remark, we observe that a more complex model using only the accelerometer is a better option than a simpler model using both the accelerometer and the gyroscope when resource usage is to be reduced.
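
    As a rough illustration of the two parameters the study varies, the sketch below segments a sensor stream into overlapping windows and reports how the number of classifications and computed feature values per minute changes with sampling rate and window size. The feature set, overlap, and parameter values are assumptions for illustration, not settings from the paper.

        import numpy as np

        def sliding_windows(signal, sampling_hz, window_s, overlap=0.5):
            """Segment a 1-D sensor stream into overlapping fixed-length windows."""
            size = int(sampling_hz * window_s)
            step = max(1, int(size * (1.0 - overlap)))
            return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]

        def features(window):
            """Minimal time-domain feature set; real systems typically use a richer one."""
            return [window.mean(), window.std(), window.min(), window.max()]

        # Compare how sampling rate and window size change the per-minute workload.
        for hz, win_s in [(50, 2), (25, 2), (50, 4)]:
            stream = np.random.randn(hz * 60)          # one minute of accelerometer data
            wins = sliding_windows(stream, hz, win_s)
            n_feature_values = len(wins) * len(features(wins[0]))
            print(f"{hz} Hz, {win_s}s windows -> {len(wins)} classifications/min, "
                  f"{n_feature_values} feature values/min")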

    Sampling Strategies for Tackling Imbalanced Data in Human Activity Recognition

    Human activity recognition (HAR) using wearable sensors is a topic that is being actively researched in machine learning. Smart, sensor-embedded devices, such as smartphones, fitness trackers, or smartwatches that collect detailed data on movement, are now widely available. HAR may be applied in areas such as healthcare, physiotherapy, and fitness to assist users of these smart devices in their daily lives. However, one of the main challenges facing HAR, particularly when it is used in supervised learning, is how balanced data may be obtained for algorithm optimisation and testing. Because users engage in some activities more than others, e.g. walking more than running, HAR datasets are typically imbalanced. The lack of dataset representation of minority classes therefore hinders the ability of HAR classifiers to sufficiently capture new instances of those activities. Inspired by the concept of data fusion, this thesis introduces three new hybrid sampling methods. The diversity of the synthesised samples is enhanced by combining the output of separate sampling methods into three hybrid approaches. The advantage of the hybrid methods is that they provide diverse synthetic data, drawn from different sampling approaches, that can increase the size of the training data; this leads to improvements in the generalisation of a learned activity recognition model. The first strategy, the DBM, combines the synthetic minority oversampling technique (SMOTE) with Random_SMOTE, both of which are built around the k-nearest neighbours algorithm. The second, the noise detection-based method (NDBM), combines SMOTE with Tomek links (SMOTE_TomekLinks) and the modified synthetic minority oversampling technique (MSMOTE). The third, the cluster-based method (CBM), combines cluster-based synthetic oversampling (CBSO) and the proximity weighted synthetic oversampling technique (ProWSyn). The performance of the proposed hybrid methods is compared with existing methods using accelerometer data from three commonly used benchmark datasets. The results show that the DBM, NDBM, and CBM can significantly reduce the impact of class imbalance and enhance the F1 scores of a multilayer perceptron (MLP) by as much as 9% to 20% compared with their constituent sampling methods. The Friedman statistical significance test was also conducted to compare the effect of the different sampling methods; its results confirm that the CBM is more effective than the other sampling approaches. This thesis also introduces a method based on the Wasserstein generative adversarial network (WGAN) for generating different types of human activity data. The WGAN is more stable to train than a standard generative adversarial network (GAN) because it uses a stable metric, namely the Wasserstein distance, to compare the similarity between the real and generated data distributions. WGAN is a deep learning approach and, in contrast to the six existing sampling methods referred to previously, it can operate on raw sensor data, as convolutional and recurrent layers can act as feature extractors. WGAN is therefore used to generate raw sensor data, overcoming the limitation of the traditional machine learning-based sampling methods, which can only operate on extracted features. The synthetic data produced by the WGAN is then used to oversample the imbalanced training data.
    This thesis demonstrates that this approach significantly enhances the learning ability of a convolutional neural network (CNN), improving performance by as much as 5% to 6% on imbalanced human activity datasets. The thesis concludes that the proposed sampling methods based on traditional machine learning are efficient when the human activity training data is imbalanced and small: they are less complex to implement, require less human activity training data to produce synthetic data, and use fewer computational resources than the WGAN approach. The proposed WGAN method is effective at producing raw sensor data when a large quantity of human activity training data is available. However, optimising the hyperparameters of the WGAN architecture, which significantly affect the performance of the method, is time-consuming.
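
    The thesis's hybrid samplers (DBM, NDBM, CBM) combine specific SMOTE variants (Random_SMOTE, MSMOTE, CBSO, ProWSyn) that are not all available in mainstream libraries. The sketch below only illustrates the general hybrid idea of pooling the output of two separate samplers, here SMOTE and SMOTE+Tomek links from imbalanced-learn on a toy imbalanced dataset; it is not the thesis's method.

        import numpy as np
        from imblearn.over_sampling import SMOTE
        from imblearn.combine import SMOTETomek
        from sklearn.datasets import make_classification

        def hybrid_resample(X, y, samplers):
            """Pool the resampled datasets produced by several samplers into one training set."""
            parts = [sampler.fit_resample(X, y) for sampler in samplers]
            X_all = np.vstack([Xr for Xr, _ in parts])
            y_all = np.concatenate([yr for _, yr in parts])
            return X_all, y_all

        # Toy imbalanced dataset standing in for extracted accelerometer features.
        X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

        X_bal, y_bal = hybrid_resample(X, y, [SMOTE(random_state=0), SMOTETomek(random_state=0)])
        print(np.bincount(y), "->", np.bincount(y_bal))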

    Synthetic Sensor Data for Human Activity Recognition

    Human activity recognition (HAR) based on wearable sensors has emerged as an active topic of research in machine learning and human behavior analysis because of its applications in several fields, including health, security and surveillance, and remote monitoring. Machine learning algorithms are frequently applied in HAR systems to learn from labeled sensor data. The effectiveness of these algorithms generally relies on access to large amounts of accurately labeled training data, but labeled data for HAR is hard to come by and is often heavily imbalanced in favor of one or more dominant classes, which in turn leads to poor recognition performance. In this study we introduce a generative adversarial network (GAN)-based approach for HAR that automatically synthesizes balanced and realistic sensor data. GANs are robust generative networks, typically used to create synthetic images that cannot be distinguished from real images. Here we explore and construct a model for generating several types of human activity sensor data using a Wasserstein GAN (WGAN). We assess the synthetic data using two commonly used classifier models, a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. We evaluate the quality and diversity of the synthetic data by training on synthetic data and testing on real sensor data, and vice versa. We then use the synthetic sensor data to oversample the imbalanced training set. We demonstrate the efficacy of the proposed method on two publicly available human activity datasets, the Sussex-Huawei Locomotion (SHL) dataset and the Smoking Activity Dataset (SAD). With a CNN activity classifier, WGAN-augmented training data improves over the imbalanced case on both SHL (F1-score from 0.85 to 0.95) and SAD (F1-score from 0.70 to 0.77).
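
    The abstract does not describe the WGAN architecture used; the sketch below shows only the core Wasserstein training loop (here the original weight-clipping variant) that such an approach builds on. The layer sizes, window length, learning rates, and clipping range are assumptions, and random noise stands in for real accelerometer windows.

        import torch
        import torch.nn as nn

        WINDOW, LATENT = 128, 32

        gen = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(), nn.Linear(256, WINDOW))
        critic = nn.Sequential(nn.Linear(WINDOW, 256), nn.ReLU(), nn.Linear(256, 1))
        opt_g = torch.optim.RMSprop(gen.parameters(), lr=5e-5)
        opt_c = torch.optim.RMSprop(critic.parameters(), lr=5e-5)

        real_data = torch.randn(1000, WINDOW)   # placeholder for real sensor windows

        for step in range(200):
            # Critic update: maximize critic(real) - critic(fake), i.e. minimize the negative.
            real = real_data[torch.randint(0, len(real_data), (64,))]
            fake = gen(torch.randn(64, LATENT)).detach()
            loss_c = -(critic(real).mean() - critic(fake).mean())
            opt_c.zero_grad()
            loss_c.backward()
            opt_c.step()
            for p in critic.parameters():        # weight clipping keeps the critic roughly Lipschitz
                p.data.clamp_(-0.01, 0.01)
            # Generator update every few critic steps: maximize critic(fake).
            if step % 5 == 0:
                fake = gen(torch.randn(64, LATENT))
                loss_g = -critic(fake).mean()
                opt_g.zero_grad()
                loss_g.backward()
                opt_g.step()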

    From Cellular to Holistic: Development of Algorithms to Study Human Health and Diseases

    The development of theoretical computational methods and their application has become widespread in the world today. In this dissertation, I present my work on the creation of models to detect and describe complex biological and health-related problems. The first major part of my work centers around the creation and enhancement of methods to calculate protein structure and dynamics. To this end, substantial enhancements have been made to the software package REDCRAFT to better facilitate its use in protein structure calculation. These enhancements have led to an overall increase in its ability to characterize proteins under difficult conditions such as high noise and low data density. Secondly, a database that allows for easy and comprehensive mining of protein structures has been created and deployed. We show preliminary results for its application to protein structure calculation. This database, among other applications, can be used to create input sets for computational models for the prediction of protein structure. Lastly, I present my work on the creation of a theoretical model to describe discrete-state protein dynamics. The results of this work can be used to describe many real-world dynamic systems. The second major part of my work centers around the application of machine learning techniques to create a system for the automated detection of smoking using accelerometer data from smartwatches. The first aspect of this work is the binary detection of smoking puffs. This model was then expanded to perform full cigarette session detection. Next, the model was reformulated to quantify smoking behavior (such as puff duration and the time between puffs). Lastly, a rotation matrix was derived to resolve ambiguities arising from the position of the smartwatch on the wrist.
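
    The dissertation's rotation-matrix derivation is not given in the abstract; the sketch below only illustrates the general idea of rotating three-axis wrist accelerometer samples into a common orientation so that watch placement matters less. The chosen axes, angle estimate, and reference direction are assumptions for illustration.

        import numpy as np

        def roll_about_x(theta):
            """Rotation matrix about the x (along-wrist) axis by angle theta (radians)."""
            c, s = np.cos(theta), np.sin(theta)
            return np.array([[1, 0, 0],
                             [0, c, -s],
                             [0, s, c]])

        def normalize_orientation(samples):
            """Rotate samples so the mean gravity direction in the y-z plane aligns with -z."""
            g = samples.mean(axis=0)                  # static periods approximate gravity
            theta = np.arctan2(-g[1], -g[2])          # roll angle sending gravity's y-z part onto -z
            return samples @ roll_about_x(theta).T

        # Toy wrist data: small noise around a tilted gravity vector.
        samples = np.random.randn(500, 3) * 0.05 + np.array([0.0, 0.4, -0.9])
        print(normalize_orientation(samples).mean(axis=0))   # gravity now points roughly along -z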

    Subject-dependent and -independent human activity recognition with person-specific and -independent models

    The distinction between subject-dependent and subject-independent performance is ubiquitous in the Human Activity Recognition (HAR) literature. We test three hypotheses: that HAR models achieve better subject-dependent performance than subject-independent performance; that a model trained on many users achieves better subject-independent performance than one trained on a single user; and that a model trained on a single user performs better for that user than one trained on this and other users. To do so, we compare four algorithms' subject-dependent and -independent performance across eight data sets using three different approaches, which we term person-independent models (PIMs), person-specific models (PSMs), and ensembles of PSMs (EPSMs). Our analysis shows that PSMs outperform PIMs by 3.5% for known users, that PIMs outperform PSMs by 13.9% and ensembles of PSMs by a non-significant 2.1% for unknown users, and that performance for known users is 20.5% to 48% better than for unknown users.
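
    The known-user versus unknown-user distinction is typically realised through how the data is split. The minimal sketch below (random data; the shapes, subject count, and classifier are assumptions, not the paper's setup) contrasts leave-one-subject-out cross-validation with an ordinary shuffled split using scikit-learn.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import KFold, LeaveOneGroupOut, cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(800, 12))                 # feature windows
        y = rng.integers(0, 4, size=800)               # activity labels
        subjects = np.repeat(np.arange(8), 100)        # 8 subjects, 100 windows each

        clf = RandomForestClassifier(n_estimators=100, random_state=0)

        # Subject-independent (unknown users): each fold tests on one entirely held-out subject.
        indep = cross_val_score(clf, X, y, groups=subjects, cv=LeaveOneGroupOut())

        # Subject-dependent (known users): shuffled folds, so every subject appears in training.
        dep = cross_val_score(clf, X, y, cv=KFold(n_splits=5, shuffle=True, random_state=0))

        print(f"subject-independent: {indep.mean():.2f}, subject-dependent: {dep.mean():.2f}")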