835,387 research outputs found

    Towards Clustering of Mobile and Smartwatch Accelerometer Data for Physical Activity Recognition

    Get PDF
    Mobile and wearable devices now have a greater capability of sensing human activity ubiquitously and unobtrusively through advancements in miniaturization and sensing abilities. However, outstanding issues remain around the energy restrictions of these devices when processing large sets of data. This paper presents our approach that uses feature selection to refine the clustering of accelerometer data to detect physical activity. This also has a positive effect on the computational burden that is associated with processing large sets of data, as energy efficiency and resource use is decreased because less data is processed by the clustering algorithms. Raw accelerometer data, obtained from smartphones and smartwatches, have been preprocessed to extract both time and frequency domain features. Principle component analysis feature selection (PCAFS) and correlation feature selection (CFS) have been used to remove redundant features. The reduced feature sets have then been evaluated against three widely used clustering algorithms, including hierarchical clustering analysis (HCA), k-means, and density-based spatial clustering of applications with noise (DBSCAN). Using the reduced feature sets resulted in improved separability, reduced uncertainty, and improved efficiency compared with the baseline, which utilized all features. Overall, the CFS approach in conjunction with HCA produced higher Dunn Index results of 9.7001 for the phone and 5.1438 for the watch features, which is an improvement over the baseline. The results of this comparative study of feature selection and clustering, with the specific algorithms used, has not been performed previously and provides an optimistic and usable approach to recognize activities using either a smartphone or smartwatch

    Chronic pain detection from resting-state raw EEG signals using improved feature selection

    Full text link
    We present an automatic approach that works on resting-state raw EEG data for chronic pain detection. A new feature selection algorithm - modified Sequential Floating Forward Selection (mSFFS) - is proposed. The improved feature selection scheme is rather compact but displays better class separability as indicated by the Bhattacharyya distance measures and better visualization results. It also outperforms selections generated by other benchmark methods, boosting the test accuracy to 97.5% and yielding a test accuracy of 81.4% on an external dataset that contains different types of chronic painComment: 9 pages, 4 figures, journal submissio

    A Classification Model for Sensing Human Trust in Machines Using EEG and GSR

    Full text link
    Today, intelligent machines \emph{interact and collaborate} with humans in a way that demands a greater level of trust between human and machine. A first step towards building intelligent machines that are capable of building and maintaining trust with humans is the design of a sensor that will enable machines to estimate human trust level in real-time. In this paper, two approaches for developing classifier-based empirical trust sensor models are presented that specifically use electroencephalography (EEG) and galvanic skin response (GSR) measurements. Human subject data collected from 45 participants is used for feature extraction, feature selection, classifier training, and model validation. The first approach considers a general set of psychophysiological features across all participants as the input variables and trains a classifier-based model for each participant, resulting in a trust sensor model based on the general feature set (i.e., a "general trust sensor model"). The second approach considers a customized feature set for each individual and trains a classifier-based model using that feature set, resulting in improved mean accuracy but at the expense of an increase in training time. This work represents the first use of real-time psychophysiological measurements for the development of a human trust sensor. Implications of the work, in the context of trust management algorithm design for intelligent machines, are also discussed.Comment: 20 page

    CREDIT SCORING USING LOGISTIC REGRESSION

    Get PDF
    This report presents an approach to predict the credit scores of customers using the Logistic Regression machine learning algorithm. The research objective of this project is to perform a comparative study between feature selection and feature extraction, against the same dataset using the Logistic Regression machine learning algorithm. For feature selection, we have used Stepwise Logistic Regression. For feature extraction, we have used Singular Value Decomposition (SVD) and Weighted Singular Value Decomposition (SVD). In order to test the accuracy obtained using feature selection and feature extraction, we used a public credit dataset having 11 features and 150,000 records. After performing feature reduction, Logistic Regression algorithm was used for classification. In our results, we observed that Stepwise Logistic Regression gave a 14% increase in accuracy as compared to Singular Value Decomposition (SVD) and a 10% increase in accuracy as compared to Weighted Singular Value Decomposition (SVD). Thus, we can conclude that Stepwise Logistic Regression performed significantly better than both Singular Value Decomposition (SVD) and Weighted Singular Value Decomposition (SVD). The benefit of using feature selection was that it helped us in identifying important features, which improved the prediction accuracy of the classifier

    Multi-Criterion Mammographic Risk Analysis Supported with Multi-Label Fuzzy-Rough Feature Selection

    Get PDF
    Context and background Breast cancer is one of the most common diseases threatening the human lives globally, requiring effective and early risk analysis for which learning classifiers supported with automated feature selection offer a potential robust solution. Motivation Computer aided risk analysis of breast cancer typically works with a set of extracted mammographic features which may contain significant redundancy and noise, thereby requiring technical developments to improve runtime performance in both computational efficiency and classification accuracy. Hypothesis Use of advanced feature selection methods based on multiple diagnosis criteria may lead to improved results for mammographic risk analysis. Methods An approach for multi-criterion based mammographic risk analysis is proposed, by adapting the recently developed multi-label fuzzy-rough feature selection mechanism. Results A system for multi-criterion mammographic risk analysis is implemented with the aid of multi-label fuzzy-rough feature selection and its performance is positively verified experimentally, in comparison with representative popular mechanisms. Conclusions The novel approach for mammographic risk analysis based on multiple criteria helps improve classification accuracy using selected informative features, without suffering from the redundancy caused by such complex criteria, with the implemented system demonstrating practical efficacy

    Application of feature selection for predicting leaf chlorophyll content in oats (Avena sativa L.) from hyperspectral imagery

    Get PDF
    Feature selection can improve predictions generated by partial least squares models. In the context of hyperspectral imaging, it can also enable the development of affordable devices with specialized applications. The feasibility of feature selection for oat leaf chlorophyll estimation from hyperspectral imagery was assessed using a public domain dataset. A wrapper approach resulted in a simplistic model with poor predictive performance. The number of model inputs decreased from 94 to 3 bands when a filter approach based on the minimum redundancy, maximum relevance criterion was attempted. The filtering led to improved prediction quality, with the root mean square error decreasing from 0.17 to 0.16 g m-2 and R 2 increasing from 0.57 to 0.62. Accurate predictions were obtained especially for low chlorophyll levels. The obtained model estimated leaf chlorophyll concentration from near infra-red reflectance, canopy darkness, and its blueness. The prediction robustness needs to be investigated, which can be done by employing an ensemble methodology and testing the model on a new dataset with improved ground-truth measurements and additional crop species

    Finding Fuzzy-rough Reducts with Fuzzy Entropy

    Get PDF
    Abstract—Dataset dimensionality is undoubtedly the single most significant obstacle which exasperates any attempt to apply effective computational intelligence techniques to problem domains. In order to address this problem a technique which re-duces dimensionality is employed prior to the application of any classification learning. Such feature selection (FS) techniques attempt to select a subset of the original features of a dataset which are rich in the most useful information. The benefits can include improved data visualisation and transparency, a reduction in training and utilisation times and potentially, im-proved prediction performance. Methods based on fuzzy-rough set theory have demonstrated this with much success. Such methods have employed the dependency function which is based on the information contained in the lower approximation as an evaluation step in the FS process. This paper presents three novel feature selection techniques employing fuzzy entropy to locate fuzzy-rough reducts. This approach is compared with two other fuzzy-rough feature selection approaches which utilise other measures for the selection of subsets. I
    • …
    corecore