69 research outputs found

    Impacts of Data Synthesis: A Metric for Quantifiable Data Standards and Performances

    Clinical data analysis could lead to breakthroughs. However, clinical data contain sensitive information about participants that could be used for unethical activities, such as blackmail, identity theft, mass surveillance, or social engineering. Data anonymization is a standard step during data collection, before sharing, to reduce the risk of disclosure. However, conventional data anonymization techniques are not foolproof and also hinder the opportunity for personalized evaluations. Much research has been done on synthetic data generation using generative adversarial networks and many other machine learning methods; however, these methods are either not free to use or are limited in capacity. This study evaluates the performance of an emerging tool named synthpop, an R package that produces synthetic data as an alternative approach to data anonymization. This paper establishes data standards derived from the original data set based on the utilities and quality of information, and measures variations in the synthetic data set to evaluate the performance of the data synthesis process. The methods to assess the utility of the synthetic data set can be broadly divided into two approaches: general utility and specific utility. General utility assesses whether the synthetic data are similar overall to the original data set in statistical properties and multivariate relationships. Specific utility, in turn, assesses how closely a fitted model's performance on the synthetic data matches its performance on the original data. The quality of information is assessed by comparing variations in entropy (in bits) and in mutual information with the response variables between the original and synthetic data sets. The study reveals that the synthetic data passed all utility tests with statistically non-significant differences and preserved not only the utilities but also the complexity of the original data set, according to the data standard established in this study. Therefore, synthpop fulfills all the requirements and opens up a wide range of opportunities for the research community, including easy data sharing and information protection.
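The entropy and mutual information comparison mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the study's exact procedure, so the function names and the toy columns below are assumptions for illustration only; the study itself works in R with synthpop.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information_bits(xs, ys):
    """Mutual information I(X; Y) in bits, computed as H(X) + H(Y) - H(X, Y)."""
    joint = list(zip(xs, ys))
    return entropy_bits(xs) + entropy_bits(ys) - entropy_bits(joint)


# Hypothetical toy columns (not data from the study): one feature and one
# response variable, for an original and a synthetic data set.
original_feature = ["a", "a", "b", "b"]
original_response = [0, 0, 1, 1]
synthetic_feature = ["a", "b", "a", "b"]
synthetic_response = [0, 0, 1, 1]

mi_original = mutual_information_bits(original_feature, original_response)
mi_synthetic = mutual_information_bits(synthetic_feature, synthetic_response)

# A large gap between the two values would suggest that the synthetic data
# lost the relationship between the feature and the response.
mi_gap = abs(mi_original - mi_synthetic)
```

In this toy case both feature columns have the same entropy, yet only the original feature carries information about the response, which is why comparing entropy alone would not be enough.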

    Incremental learning to personalize human activity recognition models: the importance of human AI collaboration

    This study presents incremental learning-based methods to personalize human activity recognition models. Initially, a user-independent model is used in the recognition process. When a new user starts to use the human activity recognition application, personal streaming data can be gathered. This data does not have labels, but there are three different ways to obtain them: non-supervised, semi-supervised, and supervised. The non-supervised approach relies purely on predicted labels, the supervised approach uses only human intelligence to label the data, and the proposed semi-supervised method combines the two: it uses artificial intelligence (AI) to label the data in most cases, but relies on human intelligence in uncertain cases. After labels are obtained, the personalization process continues by using the streaming data and these labels to update the incremental learning-based model, which in this case is Learn++. Learn++ is an ensemble method that can use any classifier as a base classifier, and this study compares three base classifiers: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and classification and regression tree (CART). Moreover, three datasets are used in the experiment to show how well the presented method generalizes across datasets. The results show that personalized models are much more accurate than user-independent models. On average, the recognition rates are 87.0% using the user-independent model, 89.1% using the non-supervised personalization approach, 94.0% using the semi-supervised personalization approach, and 96.5% using the supervised personalization approach. This means that by relying on predicted labels with high confidence, and asking the user to label only uncertain observations (6.6% of the observations when using LDA, 7.7% when using QDA, and 18.3% when using CART), error rates almost as low as those of the supervised approach, in which labeling is fully based on human intelligence, can be achieved.
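The semi-supervised labeling rule described in this abstract (trust confident predictions, ask the user otherwise) can be sketched as follows. The abstract does not state how uncertainty is measured, so the posterior-probability threshold of 0.9, the function name `choose_label`, and the `ask_user` callback are all assumptions made for illustration.

```python
def choose_label(posteriors, ask_user, threshold=0.9):
    """Pick a training label for one unlabeled observation.

    posteriors: dict mapping each class label to the model's estimated
        probability for that class (hypothetical input format).
    ask_user: zero-argument callable that returns a label from the user.
    threshold: assumed confidence cut-off; not a value from the study.

    Returns a (label, source) pair, where source is "model" or "user".
    """
    predicted = max(posteriors, key=posteriors.get)
    if posteriors[predicted] >= threshold:
        # Confident prediction: use it directly, as in the non-supervised case.
        return predicted, "model"
    # Uncertain prediction: fall back to human labeling.
    return ask_user(), "user"


# Usage with made-up probabilities; the lambda stands in for a real prompt.
confident = choose_label({"walk": 0.95, "run": 0.05}, ask_user=lambda: "run")
uncertain = choose_label({"walk": 0.55, "run": 0.45}, ask_user=lambda: "run")
```

Raising the threshold sends more observations to the user, moving the behavior toward the supervised approach; lowering it moves toward the non-supervised one.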

    Framework for Dependable and Pervasive eHealth Services

    Providing health care and well-being services at the end user's residence brings, together with its benefits, important concerns that must be addressed. This article discusses selected issues in supporting dependable pervasive eHealth services. Dependable services need to be implemented in a resource-efficient and safe way because of constrained and concurrent pre-existing conditions and the radio environment. Security is a must when dealing with personal information, and is even more critical when that information concerns health. Once these fundamental requirements are satisfied and services are designed effectively, social significance can be achieved in various scenarios. After discussing the above viewpoints, the article concludes with future directions in eHealth IoT, including scaling the system down to the nanoscale to interact more intimately with biological organisms.

    Robotic inspection of oil and gas plants by hybrid unmanned vehicle and mobile ground support platform

    The safety risks and high costs of human inspection of oil and gas plants are driving the adoption of robotic inspection. The challenging, cluttered inspection environment and the constraints dictated by legislation on potentially explosive atmospheres, which imply energy-efficient solutions, suggest the use of a hybrid rolling-flying unmanned vehicle equipped with an inspection tool, together with a mobile ground platform that supports the connected inspection robot. These two design choices and their development are described in this article.

    Context-aware incremental learning-based method for personalized human activity recognition

    This study introduces an ensemble-based personalized human activity recognition method that relies on incremental learning, a method for continuous learning that can not only learn from streaming data but also adapt to different contexts and to changes in context. This adaptation is based on a novel weighting approach that gives greater weight to those base models of the ensemble that are most suitable for the current context. In this article, contexts are different body positions for inertial sensors. The experiments are performed in two scenarios: (S1) adapting the model to a known context, and (S2) adapting the model to a previously unknown context. In both scenarios, the models also had to adapt to the data of a previously unknown person, as the initial user-independent dataset did not include any data from the studied user. In the experiments, the proposed ensemble-based approach is compared to a non-weighted personalization method relying on an ensemble-based classifier and to a static user-independent model. Both ensemble models are tested with three different base classifiers (linear discriminant analysis, quadratic discriminant analysis, and classification and regression tree). The results show that the proposed ensemble method performs much better than the non-weighted ensemble model for personalization in both scenarios, no matter which base classifier is used. Moreover, the proposed method outperforms user-independent models. In scenario 1, the error rate based on balanced accuracy was 13.3% using the user-independent model, 13.8% using the non-weighted personalization method, and 6.4% using the proposed method. The difference is even bigger in scenario 2, where the error rate is 36.6% using the user-independent model, 36.9% using the non-weighted personalization method, and 14.1% using the proposed method. In addition, F1 scores also show that the proposed method performs much better in both scenarios than the rival methods. Moreover, as a side result, it was noted that the presented method can also be used to recognize the body position of the sensor.
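The general idea of weighting base models by their suitability to the current context can be illustrated with a weighted vote. The abstract does not describe how the study computes its weights, so the sketch below simply takes them as given; the function name and the example weights are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine base-model predictions by summing per-model weights per label.

    predictions: list with one predicted label per base model.
    weights: list with one non-negative weight per base model; here they are
        assumed to reflect how well each model suits the current context.
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Three hypothetical base models; the first was trained in the same context
# (for example, the same sensor body position) as the incoming data.
base_predictions = ["sit", "walk", "walk"]
context_aware = weighted_vote(base_predictions, [0.7, 0.1, 0.1])
non_weighted = weighted_vote(base_predictions, [1.0, 1.0, 1.0])
```

With equal weights the two mismatched models outvote the matching one, whereas context-aware weights let the best-suited model dominate.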

    Comparison of regression and classification models for user-independent and personal stress detection

    In this article, regression and classification models are compared for stress detection. Both personal and user-independent models are evaluated. The article is based on a publicly open dataset called AffectiveROAD, which contains data gathered using the Empatica E4 sensor and, unlike most other stress detection datasets, contains continuous target variables. The classification model used is Random Forest, and the regression model is a bagged tree-based ensemble. In the experiments, regression models outperform classification models when classifying observations as stressed or not stressed. The best user-independent results are obtained using a combination of blood volume pulse and skin temperature features; with these, the average balanced accuracy was 74.1% with the classification model and 82.3% with the regression model. In addition, regression models can be used to estimate the level of stress. The results from models trained using personal data are not encouraging, showing that biosignals vary considerably not only between study subjects but also between sessions gathered from the same person. On the other hand, it is shown that subject-wise feature selection for a user-independent model can improve recognition more than using personal training data to build personal models. In fact, with subject-wise feature selection, the average detection rate can be improved by as much as 4 percentage points, and it is especially useful for reducing the variance in recognition rates between study subjects.
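Using a regression model for a stressed or not-stressed decision, and scoring it with balanced accuracy, can be sketched as below. The cut-off of 0.5 and the toy scores are assumptions for illustration; the abstract does not state how the continuous estimates were mapped to classes.

```python
def classify_from_regression(scores, threshold=0.5):
    """Turn continuous stress estimates into binary labels (True = stressed).

    The threshold is an assumed cut-off, not a value taken from the study.
    """
    return [score >= threshold for score in scores]


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


# Made-up regression outputs and ground truth for five observations.
predicted_scores = [0.1, 0.4, 0.6, 0.9, 0.3]
truth = [False, False, True, True, True]

predicted_labels = classify_from_regression(predicted_scores)
score = balanced_accuracy(truth, predicted_labels)
```

Balanced accuracy averages the recall of each class, which keeps a majority class from dominating the result when stressed and not-stressed observations are unevenly distributed.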

    Revisiting “Recognizing human activities user-independently on smartphones based on accelerometer data” – what has happened since 2012?

    Our article “Recognizing human activities user-independently on smartphones based on accelerometer data” was published in the International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI) in 2012. In 2018, it was selected as the most outstanding article published in the 10 years of IJIMAI's existence. To celebrate the 10th anniversary of IJIMAI, this article reviews what has happened in the field of human activity recognition and wearable sensor-based recognition since 2012, concentrating especially on our own work since then.

    Digital fabrication in promoting student engagement and motivation in university courses

    Project work represents a significant part of university studies, making it an important concern for teaching development. Many universities have used international engineering competitions, such as the Eurobot robotics competition, as a tool for engaging and motivating students. Based on a theory of why these competitions are successful, we propose how smaller-scale projects can use digital fabrication and joint projects to reach similar results. We present a case study of an ongoing robotics project course in the field of computer engineering, showing a practical implementation of these ideas and how the field-specific problem of software intercommunication and interoperability can be solved using the Robot Operating System (ROS) software framework. While the course is still in progress, initial observations indicate that it is going to be successful.

    Experiences with publicly open human activity data sets: studying the generalizability of the recognition models

    This article studies how well inertial sensor-based human activity recognition models work when the training and testing data sets are collected in different environments. The comparison uses publicly open human activity data sets. The article has four objectives. First, a survey of publicly available data sets is presented. Second, a previously unshared human activity data set used in our earlier work is opened for public use. Third, the generalizability of recognition models trained using publicly open data sets is tested with data from another publicly open data set, to learn how the models work in a different environment, with different study subjects and hardware. Finally, the challenges encountered when using publicly open data sets are discussed. The results show that the data gathering protocol can have a statistically significant effect on the recognition rates. In addition, it was noted that publicly open human activity data sets are often not as easy to apply as they should be.