132 research outputs found

    Classifying Imbalanced Multi-modal Sensor Data for Human Activity Recognition in a Smart Home using Deep Learning

    In smart homes, data generated by real-time sensors for human activity recognition is complex, noisy and imbalanced. It is a significant challenge to create machine learning models that can classify activities that occur less often than others. Models trained on imbalanced data are biased towards the more commonly occurring classes, because they learn best the classes that contain more records. This paper examines whether fusing real-world imbalanced multi-modal sensor data improves classification results compared with using unimodal data, and compares deep learning approaches to imbalanced multi-modal sensor data under various resampling methods and deep learning models. Experiments were carried out using a large multi-modal sensor dataset generated from the Sensor Platform for HEalthcare in a Residential Environment (SPHERE). The data comprises 16104 samples, where each sample comprises 5608 features and belongs to one of 20 activities (classes). Experimental results using SPHERE demonstrate the challenges of dealing with imbalanced multi-modal data and highlight the importance of having a suitable number of samples within each class for sufficiently training and testing deep learning models. Furthermore, the results revealed that when fusing the data and using the Synthetic Minority Oversampling Technique (SMOTE) to correct class imbalance, CNN-LSTM achieved the highest classification accuracy of 93.67%, followed by CNN with 93.55% and LSTM with 92.98%.
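
The SMOTE step mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `smote_oversample`, the toy two-feature samples and the activity labels are all hypothetical, and a real pipeline would more likely rely on an established library. The idea shown is only the core one, creating synthetic minority samples by interpolating between a minority sample and one of its nearest same-class neighbours.

```python
import random
from collections import Counter


def smote_oversample(samples, labels, k=3, seed=0):
    """Balance classes by interpolating between minority samples and their neighbours."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # every class is grown to the majority size
    new_samples, new_labels = list(samples), list(labels)

    for cls, count in counts.items():
        members = [s for s, l in zip(samples, labels) if l == cls]
        if len(members) < 2:
            continue  # a single sample has no neighbour to interpolate with
        for _ in range(target - count):
            base = rng.choice(members)
            # k nearest same-class neighbours by squared Euclidean distance
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            new_samples.append([a + gap * (b - a) for a, b in zip(base, other)])
            new_labels.append(cls)
    return new_samples, new_labels


# Hypothetical toy data: six "walk" samples and two "cook" samples.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.2],
     [5.0, 5.0], [5.2, 5.1]]
y = ["walk"] * 6 + ["cook"] * 2

X_bal, y_bal = smote_oversample(X, y)
print(Counter(y_bal))
```

After the call both classes hold six samples, and every synthetic "cook" point lies on the segment between the two original "cook" samples.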

    Multi-sensor fusion based on multiple classifier systems for human activity identification

    Multimodal sensors in healthcare applications have been increasingly researched because they facilitate automatic and comprehensive monitoring of human behaviors, high-intensity sports management, energy expenditure estimation, and postural detection. Recent studies have shown the importance of multi-sensor fusion to achieve robustness and high-performance generalization, provide diversity, and tackle challenging issues that may be difficult to address with single sensor values. The aim of this study is to propose an innovative multi-sensor fusion framework to improve human activity detection performance and reduce the misrecognition rate. The study proposes a multi-view ensemble algorithm to integrate the predicted values of different motion sensors. To this end, computationally efficient classification algorithms such as decision tree, logistic regression and k-Nearest Neighbors were used to implement diverse, flexible and dynamic human activity detection systems. To provide a compact feature vector representation, we studied a hybrid bio-inspired evolutionary search algorithm and a correlation-based feature selection method and evaluated their impact on the feature vectors extracted from each individual sensor modality. Furthermore, we used the Synthetic Minority Over-sampling Technique (SMOTE) algorithm to reduce the impact of class imbalance and improve performance results. With the above methods, this paper provides a unified framework to resolve major challenges in human activity identification. The performance results obtained using two publicly available datasets showed significant improvement over baseline methods in the detection of specific activity details and a reduced error rate. The performance results of our evaluation showed a 3% to 24% improvement in accuracy, recall, precision, F-measure and detection ability (AUC) compared to single sensors and feature-level fusion.
The benefit of the proposed multi-sensor fusion is the ability to utilize the distinct feature characteristics of individual sensors and multiple classifier systems to improve recognition accuracy. In addition, the study suggests that the hybrid feature selection approach and diversity-based multiple classifier systems have promising potential to improve mobile and wearable sensor-based human activity detection and health monitoring systems. - 2019, The Author(s). This research is supported by University of Malaya BKP Special Grant no vote BKS006-2018.
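
Integrating the predicted values of several sensors, as this abstract describes, can be sketched as a decision-level fusion step. The snippet below is an illustration under stated assumptions, not the paper's multi-view ensemble algorithm: it uses plain weighted soft voting, and the function name `fuse_predictions`, the sensor names and the probability values are hypothetical.

```python
def fuse_predictions(per_sensor_probs, weights=None):
    """Soft-vote over per-sensor class probabilities; return (label, fused scores)."""
    if not per_sensor_probs:
        raise ValueError("need at least one sensor prediction")
    if weights is None:
        weights = [1.0] * len(per_sensor_probs)  # equal trust in every sensor
    total = sum(weights)
    fused = {}
    for probs, w in zip(per_sensor_probs, weights):
        for label, p in probs.items():
            fused[label] = fused.get(label, 0.0) + w * p / total
    best = max(fused, key=fused.get)
    return best, fused


# Hypothetical outputs of three per-sensor classifiers for one time window.
accelerometer = {"walking": 0.6, "sitting": 0.3, "standing": 0.1}
gyroscope = {"walking": 0.3, "sitting": 0.5, "standing": 0.2}
magnetometer = {"walking": 0.5, "sitting": 0.2, "standing": 0.3}

label, scores = fuse_predictions([accelerometer, gyroscope, magnetometer])
print(label, scores)
```

In this toy case the gyroscope classifier alone would predict "sitting", while the fused scores favour "walking", which is the kind of single-sensor error that fusion is meant to smooth out.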

    Improving Network-Based Anomaly Detection in Smart Home Environment

    The Smart Home (SH) has become an appealing target of cyberattacks. Due to the limited hardware resources and the various operating systems (OS) of current SH devices, existing security features cannot protect such an environment. Generally, the traffic patterns of an SH IoT device under attack often change in the Home Area Network (HAN). Therefore, a Network-Based Intrusion Detection System (NIDS) logically becomes the forefront security solution for the SH. In this paper, we propose a novel method to help classification machine learning algorithms generate an anomaly-based NIDS detection model, and hence detect abnormal SH IoT device network behaviour. Three network-based attacks were used to evaluate our NIDS solution in a simulated SH test-bed environment. The detection models generated by traditional and ensemble classification Machine Learning (ML) methods show outstanding overall performance. The accuracy of all detection models is over 98.8%.

    A review on classification of imbalanced data for wireless sensor networks

    © The Author(s) 2020. Classification of imbalanced data has been widely explored over the last and present decades and remains important, because data are essential today and the problem becomes crucial when data are distributed across several classes. The term imbalance refers to an uneven distribution of data across classes, which severely affects the performance of traditional classifiers: classifiers become biased toward the class with the larger amount of data. The data generated by wireless sensor networks will have several imbalances. This review article analyses the imbalance issue for wireless sensor networks and other application domains, which will help the community to understand the WHAT, WHY, and WHEN of imbalance in data and its remedies.

    A systematic review of data quality issues in knowledge discovery tasks

    The volume of data is growing rapidly because organizations continuously capture data to support better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; nevertheless, much of the data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.

    Mitigating Anomalous Electricity Consumption in Smart Cities Using an AI-Based Stacked-Generalization Technique

    Energy management and efficient asset utilization play an important role in the economic development of a country. The electricity produced at the power station faces two types of losses from the generation point to the end user: technical losses (TL) and non-technical losses (NTL). TLs occur due to the use of inefficient equipment, while NTLs occur due to the anomalous consumption of electricity by customers, which happens in many ways, energy theft being one of them. Energy theft is mostly committed to cut down on electricity bills. These losses in the smart grid (SG) are the main issue in maintaining grid stability and cause revenue loss to the utility. The automatic metering infrastructure (AMI) system has reduced grid instability, but it has opened up new ways for NTLs in the form of different cyber-physical theft attacks (CPTA). Machine learning (ML) techniques can be used to detect and minimize CPTA. However, they have certain limitations and cannot capture the energy consumption patterns (ECPs) of all users, which decreases the performance of ML techniques in detecting malicious users. In this paper, we propose a novel ML-based stacked generalization method for the cyber-physical theft issue in the smart grid. The original data obtained from the grid is preprocessed to improve model training and processing. This includes NaN imputation, normalization, outlier capping, support vector machine-synthetic minority oversampling technique (SVM-SMOTE) balancing, and principal component analysis (PCA) based data reduction. The pre-processed dataset is provided to the ML models light gradient boosting (LGB), extra trees (ET), extreme gradient boosting (XGBoost), and random forest (RF) to accurately capture all consumers' overall ECP. The predictions from these base models are fed to a meta-classifier, a multi-layer perceptron (MLP).
The MLP combines the learning capability of all the base models and gives an improved final prediction. The proposed structure is implemented and verified on the publicly available real-time large dataset of the State Grid Corporation of China (SGCC). The proposed model outperformed the individual base classifiers and the existing research in terms of CPTA detection, with false positive rate (FPR), false negative rate (FNR), F1-score, and accuracy values of 0.72%, 2.05%, 97.6%, and 97.69%, respectively.
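
The four evaluation figures named in this abstract (FPR, FNR, F1-score and accuracy) all derive from the confusion-matrix counts of a binary detector. The sketch below shows the standard definitions; the function name `detection_metrics` and the toy labels are hypothetical, and the numbers it prints have no relation to the SGCC results reported above.

```python
def detection_metrics(y_true, y_pred, positive=1):
    """Return FPR, FNR, F1-score and accuracy for a binary detector."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # theft correctly flagged
        elif truth == positive:
            fn += 1  # theft missed
        elif pred == positive:
            fp += 1  # honest user wrongly flagged
        else:
            tn += 1  # honest user correctly passed

    def ratio(num, den):
        return num / den if den else 0.0  # avoid division by zero on empty groups

    return {
        "fpr": ratio(fp, fp + tn),
        "fnr": ratio(fn, fn + tp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


# Hypothetical labels: 1 = anomalous consumer, 0 = normal consumer.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

Reporting FPR and FNR alongside accuracy matters for theft detection because the positive class is usually rare, so accuracy alone can look high even when many thefts are missed.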

    Smart aging: utilisation of machine learning and the Internet of Things for independent living

    Smart aging utilises innovative approaches and technology to improve older adults’ quality of life, increasing their prospects of living independently. One of the major concerns for older adults living independently is a “serious fall”, as almost a third of people aged over 65 have a fall each year. Dementia, affecting nearly 9% of the same age group, poses another significant issue that needs to be identified as early as possible. Existing fall detection systems based on wearable sensors generate many false alarms; hence, a more accurate and secure system is necessary. Furthermore, there is a considerable gap in identifying the onset of cognitive impairment using remote monitoring for self-assisted seniors living in their residences. Applying biometric security improves older adults’ confidence in using IoT and makes it easier for them to benefit from smart aging. Several publicly available datasets are pre-processed to extract distinctive features in order to address fall detection shortcomings, identify the onset of dementia, and enable biometric security for wearable sensors. These key features are used with novel machine learning algorithms to train models for the fall detection system, the system for identifying the onset of dementia, and the biometric authentication system. Applying a quantitative approach, these models are tested and analysed on the test dataset. The fall detection approach proposed in this work, in multimodal mode, can achieve an accuracy of 99% in detecting a fall. Additionally, using 13 selected features, a system for detecting early signs of dementia is developed. This system has achieved an accuracy rate of 93% in identifying cognitive decline in older adults, using only some selected aspects of their daily activities.
Furthermore, the ML-based biometric authentication system uses physiological signals, such as ECG and Photoplethysmogram, in a fusion mode to identify and authenticate a person, enhancing their privacy and security in a smart aging environment. The benefits offered by the fall detection system, the early detection of signs of dementia, and the biometric authentication system can improve the quality of life for seniors who prefer to live independently or by themselves.

    Application of advanced machine learning techniques to early network traffic classification

    The fast-paced evolution of the Internet is creating a complex context that imposes demanding requirements to assure end-to-end Quality of Service. The development of advanced intelligent approaches in networking envisions features that include autonomous resource allocation, fast reaction to unexpected network events and so on. Internet Network Traffic Classification constitutes a crucial source of information for Network Management and is decisive in assisting the emerging network control paradigms. Monitoring the traffic flowing through network devices supports tasks such as network orchestration, traffic prioritization, network arbitration and cyberthreat detection, amongst others. The traditional traffic classifiers became obsolete owing to the rapid evolution of the Internet. Port-based classifiers suffer from significant accuracy losses due to port masking, while Deep Packet Inspection approaches have severe user-privacy limitations. The advent of Machine Learning has propelled the application of advanced algorithms in diverse research areas, and some learning approaches have proved to be an interesting alternative to the classic traffic classification approaches. Addressing Network Traffic Classification from a Machine Learning perspective implies numerous challenges that demand research efforts to achieve feasible classifiers. In this dissertation, we endeavor to formulate and solve important research questions in Machine-Learning-based Network Traffic Classification. As a result of numerous experiments, the knowledge provided in this research constitutes an engaging case study in which network traffic data from two different environments are successfully collected, processed and modeled. Firstly, we approached the Feature Extraction and Selection processes, providing our own contributions.
A Feature Extractor was designed to create Machine-Learning-ready datasets from real traffic data, and a Feature Selection Filter based on fast correlation is proposed and tested on several classification datasets. Then, the original Network Traffic Classification datasets are reduced using our Selection Filter to provide efficient classification models. Many classification models based on CART Decision Trees were analyzed, exhibiting excellent outcomes in identifying various Internet applications. The experiments presented in this research comprise a comparison amongst ensemble learning schemes, an exploratory study on Class Imbalance and its solutions, and an analysis of IP-header predictors for early traffic classification. This thesis is presented in the form of a compendium of JCR-indexed scientific manuscripts and, furthermore, one conference paper is included. In the present work we study a wide range of learning approaches employing the most advanced methodology in Machine Learning. As a result, we identify the strengths and weaknesses of these algorithms, providing our own solutions to overcome the observed limitations. In short, this thesis proves that Machine Learning offers interesting advanced techniques that open prominent prospects in Internet Network Traffic Classification. Departamento de Teoría de la Señal y Comunicaciones e Ingeniería Telemática. Doctorado en Tecnologías de la Información y las Telecomunicacione
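
A correlation-based feature selection filter of the general kind this abstract mentions can be sketched as follows. This is a simplified stand-in, not the thesis's Selection Filter: it uses plain Pearson correlation, and the function names, the two thresholds and the flow feature names are all hypothetical. A feature is kept when it correlates with the class label and is not a near-duplicate of a feature already kept.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_filter(columns, target, relevance=0.3, redundancy=0.9):
    """Keep features correlated with the target, dropping near-duplicates of better ones."""
    ranked = sorted(
        ((abs(pearson(col, target)), name) for name, col in columns.items()),
        reverse=True,
    )
    selected = []
    for score, name in ranked:
        if score < relevance:
            break  # remaining features are even less relevant
        if all(abs(pearson(columns[name], columns[kept])) < redundancy
               for kept in selected):
            selected.append(name)
    return selected


# Hypothetical per-flow features; target 1 marks one traffic class, 0 another.
target = [0, 0, 0, 1, 1, 1]
columns = {
    "mean_pkt_size": [60, 64, 70, 1400, 1500, 1450],
    "total_bytes": [120, 130, 150, 2700, 3100, 2800],  # nearly redundant with mean_pkt_size
    "dst_port_high": [0, 1, 0, 1, 1, 0],                # weakly relevant, not redundant
    "ttl": [64, 128, 64, 64, 128, 64],                  # uncorrelated with the target
}

selected = correlation_filter(columns, target)
print(selected)
```

On this toy data the filter keeps only one of the two near-duplicate size features, keeps `dst_port_high`, and discards `ttl`, which reduces the dataset before any classifier is trained.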