    Benchmarking the SHL Recognition Challenge with classical and deep-learning pipelines

    In this paper we, as part of the Sussex-Huawei Locomotion-Transportation (SHL) Recognition Challenge organizing team, present reference recognition performance obtained by applying various classical and deep-learning classifiers to the testing dataset. We aim to recognize eight modes of transportation (Still, Walk, Run, Bike, Bus, Car, Train, Subway) from smartphone inertial sensors: accelerometer, gyroscope and magnetometer. The classical classifiers include naive Bayes, decision tree, random forest, k-nearest neighbour and support vector machine, while the deep-learning classifiers include fully-connected and convolutional deep neural networks. We feed different types of input to the classifiers, including hand-crafted features and raw sensor data in the time domain and in the frequency domain. We employ a post-processing scheme to improve the recognition performance. Results show that a convolutional neural network operating on frequency-domain raw data achieves the best performance among all the classifiers.
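As a concrete illustration of the frequency-domain raw input mentioned above, the sketch below maps a raw sensor window to its magnitude spectrum. The window length, sampling rate and function name are illustrative assumptions, not the paper's actual preprocessing:

```python
import numpy as np

def frequency_domain_input(window):
    """Map a raw 1-D sensor window to a frequency-domain input
    vector: the magnitude of its real-input FFT."""
    return np.abs(np.fft.rfft(window))

# Hypothetical 5-second accelerometer axis sampled at 100 Hz.
rng = np.random.default_rng(0)
window = rng.standard_normal(500)
x = frequency_domain_input(window)
print(x.shape)  # (251,)
```

Hand-crafted features or the raw time-domain window would simply replace this transform as the alternative classifier inputs compared in the paper.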

    Sound-based transportation mode recognition with smartphones

    Smartphone-based identification of the mode of transportation of the user is important for context-aware services. We investigate the feasibility of recognizing the eight most common modes of locomotion and transportation from the sound recorded by a smartphone carried by the user. We propose a convolutional neural network based recognition pipeline, which operates on the short-time Fourier transform (STFT) spectrogram of the sound in the log domain. Experiments with the Sussex-Huawei Locomotion-Transportation (SHL) dataset, comprising 366 hours of data, show promising results: the proposed pipeline can recognize the activities Still, Walk, Run, Bike, Car, Bus, Train and Subway with a global accuracy of 86.6%, which is 23% higher than that of classical machine learning pipelines. It is shown that sound is particularly useful for distinguishing between various vehicle activities (e.g. Car vs Bus, Train vs Subway). This discriminability is complementary to the widely used motion sensors, which are poor at distinguishing between rail and road transport.
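A minimal sketch of the log-domain STFT spectrogram such a pipeline might consume; the FFT size, hop length and sampling rate below are assumptions rather than the paper's actual settings:

```python
import numpy as np

def log_stft_spectrogram(audio, n_fft=512, hop=256, eps=1e-8):
    """Log-magnitude STFT spectrogram of a mono audio signal."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop
    frames = np.stack([audio[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1))  # (frames, n_fft//2 + 1)
    return np.log(spec + eps)                   # log domain

rng = np.random.default_rng(1)
audio = rng.standard_normal(16000)  # 1 s of audio at an assumed 16 kHz
S = log_stft_spectrogram(audio)
print(S.shape)  # (61, 257)
```

The resulting time-frequency image is what a convolutional network can then treat like any other 2-D input.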

    Summary of the Sussex-Huawei Locomotion-Transportation Recognition Challenge

    In this paper we summarize the contributions of participants to the Sussex-Huawei Transportation-Locomotion (SHL) Recognition Challenge organized at the HASCA Workshop of UbiComp 2018. The SHL challenge is a machine learning and data science competition which aims to recognize eight transportation activities (Still, Walk, Run, Bike, Bus, Car, Train, Subway) from the inertial and pressure sensor data of a smartphone. We introduce the dataset used in the challenge and the protocol for the competition. We present a meta-analysis of the contributions from 19 submissions: their approaches, the software tools used, the computational cost and the achieved results. Overall, two entries achieved F1 scores above 90%, eight scored between 80% and 90%, and nine between 50% and 80%.

    Summary of the Sussex-Huawei Locomotion-Transportation Recognition Challenge 2019

    In this paper we summarize the contributions of participants to the third Sussex-Huawei Locomotion-Transportation (SHL) Recognition Challenge organized at the HASCA Workshop of UbiComp/ISWC 2020. The goal of this machine learning/data science challenge is to recognize eight locomotion and transportation activities (Still, Walk, Run, Bike, Bus, Car, Train, Subway) from the inertial sensor data of a smartphone in a user-independent manner with an unknown target phone position. The training data of a “train” user is available from smartphones placed at four body positions (Hand, Torso, Bag and Hips). The testing data originates from “test” users with a smartphone placed at one, but unknown, body position. We introduce the dataset used in the challenge and the protocol of the competition. We present a meta-analysis of the contributions from 15 submissions: their approaches, the software tools used, the computational cost and the achieved results. Overall, one submission achieved an F1 score above 80%, three had F1 scores between 70% and 80%, seven between 50% and 70%, and four below 50%, with a maximum latency of 5 seconds.

    Transportation mode recognition fusing wearable motion, sound and vision sensors

    We present the first work that investigates the potential of improving the performance of transportation mode recognition through fusing multimodal data from wearable sensors: motion, sound and vision. We first train three independent deep neural network (DNN) classifiers, which work with the three types of sensors, respectively. We then propose two schemes that fuse the classification results from the three mono-modal classifiers. The first scheme makes an ensemble decision with fixed rules, including Sum, Product, Majority Voting and Borda Count. The second scheme is an adaptive fuser, built as another classifier (including Naive Bayes, Decision Tree, Random Forest and Neural Network), that learns enhanced predictions by combining the outputs from the three mono-modal classifiers. We verify the advantage of the proposed method with the state-of-the-art Sussex-Huawei Locomotion and Transportation (SHL) dataset, recognizing the eight transportation activities Still, Walk, Run, Bike, Bus, Car, Train and Subway. We achieve F1 scores of 79.4%, 82.1% and 72.8% with the mono-modal motion, sound and vision classifiers, respectively. The F1 score is remarkably improved to 94.5% and 95.5% by the two data fusion schemes, respectively. The recognition performance can be further improved with a post-processing scheme that exploits the temporal continuity of transportation. When assessing the generalization of the model to unseen data, we show that, while performance is reduced, as expected, for each individual classifier, the benefits of fusion are retained, with performance improved by 15 percentage points. Besides the actual performance increase, this work, most importantly, opens up the possibility of dynamically fusing modalities to achieve distinct power-performance trade-offs at run time.
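The fixed fusion rules named above (Sum, Product, Majority Voting, Borda Count) can be sketched on made-up classifier outputs; the probabilities below are invented for illustration and are not from the paper:

```python
import numpy as np

# Hypothetical softmax outputs of three mono-modal classifiers
# (motion, sound, vision) over 8 transportation classes.
probs = np.array([
    [0.10, 0.05, 0.05, 0.05, 0.30, 0.25, 0.10, 0.10],  # motion
    [0.05, 0.05, 0.05, 0.05, 0.15, 0.45, 0.10, 0.10],  # sound
    [0.05, 0.05, 0.05, 0.05, 0.40, 0.20, 0.10, 0.10],  # vision
])

sum_rule     = probs.sum(axis=0).argmax()   # add the class probabilities
product_rule = probs.prod(axis=0).argmax()  # multiply them

# Majority voting: each classifier casts one vote for its top class.
votes = probs.argmax(axis=1)
majority = np.bincount(votes, minlength=8).argmax()

# Borda count: each classifier ranks all classes; rank positions are summed.
borda = probs.argsort(axis=1).argsort(axis=1).sum(axis=0).argmax()

print(sum_rule, product_rule, majority, borda)  # 5 5 4 4
```

Note that the rules can disagree: here the probability-based rules pick one class while the rank-based rules pick another, which is why an adaptive fuser trained on the mono-modal outputs can outperform any single fixed rule.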

    Improving smartphone based transport mode recognition using generative adversarial networks

    Wearable devices such as smartphones and smartwatches are widely used and record a significant amount of data. Labelling this data for human activity recognition is a time-consuming task, so methods which reduce the amount of labelled data required to train accurate classifiers are important. Generative Adversarial Networks (GANs) can be used to model the implicit distribution of a dataset. Traditional GANs, which consist only of a generator and a discriminator, result in networks able to generate synthetic data and distinguish real from fake samples. This adversarial game can be extended to include a classifier, which allows the training of the classification network to be enhanced with synthetic and unlabelled data. The network architecture presented in this paper is inspired by SenseGAN [1], but instead of generating and classifying sensor-recorded time-series data, our approach operates on extracted features, which drastically reduces the amount of stored and processed data and enables deployment on less powerful and potentially wearable devices. We show that this technique can be used to improve the classification performance of a classifier trained to recognise locomotion modes based on recorded acceleration data, and that it reduces the amount of labelled training data necessary to achieve a performance similar to that of a baseline classifier. Specifically, our approach reached the same accuracy as the baseline classifier up to 50% faster and was able to achieve a 10% higher accuracy in the same number of epochs.
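The data reduction gained by operating on extracted features rather than raw time series can be illustrated with a toy sketch; the specific feature set below is an assumption for illustration, not the paper's:

```python
import numpy as np

def window_features(window):
    """Hand-crafted statistical features for one 3-axis acceleration
    window; this feature set is an illustrative assumption."""
    mag = np.linalg.norm(window, axis=1)  # per-sample magnitude
    return np.array([mag.mean(), mag.std(), mag.min(), mag.max(),
                     np.percentile(mag, 25), np.percentile(mag, 75)])

rng = np.random.default_rng(2)
raw = rng.standard_normal((500, 3))       # 5 s at an assumed 100 Hz
feats = window_features(raw)
print(raw.size, "raw values ->", feats.size, "features")
```

Generating and classifying in this low-dimensional feature space, instead of the raw window, is what shrinks the storage and compute footprint on wearable hardware.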

    Sampling Strategies for Tackling Imbalanced Data in Human Activity Recognition

    Human activity recognition (HAR) using wearable sensors is a topic that is being actively researched in machine learning. Smart, sensor-embedded devices that collect detailed data on movement, such as smartphones, fitness trackers, and smartwatches, are now widely available. HAR may be applied in areas such as healthcare, physiotherapy, and fitness to assist users of these smart devices in their daily lives. However, one of the main challenges facing HAR, particularly when it is used in supervised learning, is how balanced data may be obtained for algorithm optimisation and testing. Because users engage in some activities more than others, e.g. walking more than running, HAR datasets are typically imbalanced. The lack of representation of minority classes in a dataset therefore hinders the ability of HAR classifiers to sufficiently capture new instances of those activities. Inspired by the concept of data fusion, this thesis introduces three new hybrid sampling methods, which enhance the diversity of the synthesised samples by combining the output of separate sampling methods. The advantage of a hybrid method is that it provides diverse synthetic data drawn from different sampling approaches, which can increase the size of the training data and thereby improve the generalisation of the learned activity recognition model. The first strategy, the distance-based method (DBM), combines the synthetic minority oversampling technique (SMOTE) with Random_SMOTE, both of which are built around the k-nearest neighbours algorithm. The second, the noise detection-based method (NDBM), combines SMOTE with Tomek links (SMOTE_Tomeklinks) and the modified synthetic minority oversampling technique (MSMOTE). The third, the cluster-based method (CBM), combines cluster-based synthetic oversampling (CBSO) and the proximity weighted synthetic oversampling technique (ProWSyn).
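The SMOTE-style interpolation underlying these samplers can be sketched minimally; this is a generic illustration of the idea, not the thesis's implementation of any of the six methods:

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE-style oversampling: each synthetic sample is an
    interpolation between a random minority sample and one of its
    k nearest minority-class neighbours."""
    if rng is None:
        rng = np.random.default_rng(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        dist = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(dist)[1:k + 1]  # skip the sample itself
        j = rng.choice(neighbours)
        gap = rng.random()                      # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

rng = np.random.default_rng(3)
X_min = rng.standard_normal((20, 6))            # a small minority class
X_syn = smote_like_oversample(X_min, n_new=40)
print(X_syn.shape)  # (40, 6)
```

A hybrid method in the spirit of the thesis would pool the outputs of two such samplers with different neighbour-selection rules, increasing the diversity of the synthetic minority samples.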
The performance of the proposed hybrid methods is compared with that of existing methods using accelerometer data from three commonly used benchmark datasets. The results show that the DBM, NDBM and CBM can significantly reduce the impact of class imbalance, enhancing the F1 scores of a multilayer perceptron (MLP) by as much as 9% to 20% compared with their constituent sampling methods. The Friedman statistical significance test was also conducted to compare the effect of the different sampling methods, and its results confirm that the CBM is more effective than the other sampling approaches. This thesis also introduces a method based on the Wasserstein generative adversarial network (WGAN) for generating different types of human activity data. The WGAN is more stable to train than a standard generative adversarial network (GAN) because it uses a stable metric, the Wasserstein distance, to compare the similarity between the real and generated data distributions. The WGAN is a deep learning approach and, in contrast to the six existing sampling methods referred to previously, it can operate on raw sensor data, as convolutional and recurrent layers can act as feature extractors. The WGAN is therefore used to generate raw sensor data, overcoming the limitation of the traditional machine learning-based sampling methods, which can only operate on extracted features. The synthetic data produced by the WGAN is then used to oversample the imbalanced training data. This thesis demonstrates that this approach significantly enhances the learning ability of a convolutional neural network (CNN) on imbalanced human activity datasets, by as much as 5% to 6%. This thesis concludes that the proposed sampling methods based on traditional machine learning are efficient when human activity training data is imbalanced and small.
These methods are less complex to implement than the WGAN approach, and they require less human activity training data and fewer computational resources to produce synthetic data. The proposed WGAN method is effective at producing raw sensor data when a large quantity of human activity training data is available. However, optimising the hyperparameters of the WGAN architecture, which significantly impact the performance of the method, is time-consuming.
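The Wasserstein distance that stabilises WGAN training has a simple closed form in one dimension for equal-size empirical samples; a toy sketch with synthetic "real" and "generated" values (all numbers invented for illustration):

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical Wasserstein-1 distance between two equally sized 1-D
    sample sets: the mean absolute difference of the sorted samples."""
    return np.abs(np.sort(a) - np.sort(b)).mean()

rng = np.random.default_rng(4)
real  = rng.normal(0.0, 1.0, 10_000)  # stand-in for real sensor values
close = rng.normal(0.1, 1.0, 10_000)  # a generator that nearly matches
far   = rng.normal(2.0, 1.0, 10_000)  # a poor generator

# The metric shrinks smoothly as the generated distribution approaches
# the real one, giving the generator a useful gradient signal.
print(wasserstein_1d(real, close) < wasserstein_1d(real, far))  # True
```

In the actual WGAN this distance is not computed directly but approximated by a trained critic network; the sketch only illustrates why the metric behaves more smoothly than the Jensen-Shannon divergence of a standard GAN.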

    On Improving Generalization of CNN-Based Image Classification with Delineation Maps Using the CORF Push-Pull Inhibition Operator

    Deployed image classification pipelines typically depend on the images captured in real-world environments. This means that images might be affected by different sources of perturbation (e.g. sensor noise in low-light environments). The main challenge arises from the fact that image quality directly impacts the reliability and consistency of classification tasks, and this challenge has hence attracted wide interest within the computer vision communities. We propose a transformation step that attempts to enhance the generalization ability of CNN models in the presence of unseen noise in the test set. Concretely, the delineation maps of given images are determined using the CORF push-pull inhibition operator. Such an operation transforms an input image into a space that is more robust to noise before it is processed by a CNN. We evaluated our approach on the Fashion MNIST dataset with an AlexNet model. The proposed CORF-augmented pipeline achieved results on noise-free images comparable to those of a conventional AlexNet classification model without CORF delineation maps, but it consistently achieved significantly superior performance on test images perturbed with different levels of Gaussian and uniform noise.
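The robustness evaluation described above can be sketched as a perturbation step applied to test images; the noise levels, image shapes and function name are assumptions for illustration:

```python
import numpy as np

def perturb(images, kind="gaussian", level=0.1, rng=None):
    """Perturb test images in [0, 1] with Gaussian or uniform noise;
    the levels used here are assumptions, not the paper's settings."""
    if rng is None:
        rng = np.random.default_rng(0)
    if kind == "gaussian":
        noise = rng.normal(0.0, level, images.shape)
    else:
        noise = rng.uniform(-level, level, images.shape)
    return np.clip(images + noise, 0.0, 1.0)

rng = np.random.default_rng(5)
batch = rng.random((8, 28, 28))  # Fashion-MNIST-sized grayscale images
noisy = perturb(batch, kind="uniform", level=0.2, rng=rng)
print(noisy.shape)  # (8, 28, 28)
```

The paper's contribution is the CORF delineation transform applied before the CNN; the sketch only shows the test-time perturbation against which that transform is evaluated.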

    Advances in motion sensing on mobile devices

    Motion sensing is one of the most important sensing capabilities of mobile devices, enabling the monitoring of the physical movement of the device and the association of the observed motion with predefined activities and physical phenomena. The present thesis is divided into three parts covering different facets of motion sensing techniques. In the first part of this thesis, we present techniques to identify the gravity component within three-dimensional accelerometer measurements. Our technique is particularly effective in the presence of sustained linear acceleration events. Using the estimated gravity component, we also demonstrate how the sensor measurements can be transformed into descriptive motion representations able to convey information about sustained linear accelerations. To quantify sustained linear acceleration, we propose a set of novel peak features designed to characterize movement during mechanized transportation. Using the gravity estimation technique and peak features, we proceed to present an accelerometer-based transportation mode detection system able to distinguish between fine-grained automotive modalities. In the second part of the thesis, we present a novel sensor-assisted method, crowd replication, for quantifying the usage of a public space. As a key technical contribution within crowd replication, we describe the construction and use of pedestrian motion models to accurately track detailed motion information. Fusing the pedestrian models with a positioning system and annotations of visual observations, we generate enriched trajectories able to accurately quantify the usage of public spaces. Finally, in the third part of the thesis, we present two exemplary mobile applications leveraging motion information. The first is a persuasive mobile application that uses transportation mode detection to promote sustainable transportation habits.
The second application is a collaborative speech monitoring system, where motion information is used to monitor changes in the physical configuration of the participating devices.
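A common baseline for the gravity-separation problem discussed in the first part of the thesis is a first-order low-pass filter; the sketch below shows this baseline on synthetic data (the thesis proposes a more robust estimator, since this simple filter degrades under sustained linear acceleration):

```python
import numpy as np

def split_gravity(acc, alpha=0.98):
    """Separate gravity from linear acceleration with a first-order
    low-pass filter. The smoothing factor alpha is an assumed value."""
    gravity = np.empty_like(acc)
    gravity[0] = acc[0]
    for t in range(1, len(acc)):
        gravity[t] = alpha * gravity[t - 1] + (1 - alpha) * acc[t]
    return gravity, acc - gravity  # gravity and linear components

# Synthetic stationary phone: constant gravity plus sensor noise.
rng = np.random.default_rng(6)
acc = np.array([0.0, 0.0, 9.81]) + 0.05 * rng.standard_normal((1000, 3))
gravity, linear = split_gravity(acc)
```

With the gravity component isolated, the remaining linear acceleration can be analysed for the vehicle-specific acceleration peaks the thesis uses for transportation mode detection.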