51 research outputs found

    Machine Learning

    Machine learning can be defined in various ways, all relating to a scientific domain concerned with the design and development of theoretical and implementation tools for building systems with some human-like intelligent behavior. More specifically, machine learning addresses the ability to improve automatically through experience.

    Comparison of feature engineering methods and classifiers for recognizing physical activity types in older adults using real-life IMU and GPS data

    Physical activities (PA) are crucial for people to stay healthy, both physically and mentally. The physical activities of older adults show different characteristics from those of other age groups, such as lighter intensities and lower speeds. The MOASIS data is large-scale real-life mobility data collected from older adults in Switzerland. In this paper, the IMU and GPS dimensions of the MOASIS data are used to study physical activity classification for the older population in real-life conditions. The paper focuses on feature engineering for machine learning methods, including feature calculation, feature extraction, and feature selection. It first reviews the literature on this theme and summarizes the research gaps, which include: the application and comparison of dimension reduction and machine learning methods on a real-life dataset focused on this specific age group; the use of GPS data for feature calculation in PA recognition; the extraction of distinctive features for the PA types of older adults; and the influence of validation methods on the results of machine learning methods. Targeting these gaps, the paper puts forward three research questions: the comparison of different machine learning and dimension reduction methods, the comparison of the results of their application on this dataset, and the impact of different dimensions of sensor data on the classification results. The results show, first, that the most commonly used feature extraction method, PCA, can indeed substantially improve the results of the KNN classifier on this data, but it does not improve the results of the unsupervised K-means classifier, which generally performs poorly in PA recognition. Second, Extra-tree performs best among the classifiers compared when considering the balance between time and accuracy. The Recursive Feature Elimination method (RFECV) has the highest accuracy among the filter, wrapper, and embedded feature selection methods based on the Extra-tree classifier, although the differences in accuracy among the three methods are tiny. In addition, the paper concludes that the two validation methods compared (stratified k-fold validation and holdout validation) may affect the selection of hyper-parameters in model training. Finally, the feature importance rankings produced by the different feature selection methods and the distinctive features for different PA types based on this dataset are also presented. For future studies, the paper suggests that more attention should be paid to the application of different sensor dimensions in PA recognition, and that finer hyper-parameter tuning of the different models should be investigated.
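    The difference between the two validation schemes named in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: the activity labels and their proportions are hypothetical (not taken from the MOASIS data), and the helper names `stratified_folds` and `holdout_split` are invented here. It shows that stratified k-fold keeps the class mix identical in every fold, whereas a plain holdout split draws a single random test set whose class mix is left to chance.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds so each fold keeps the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the samples of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def holdout_split(n_samples, test_fraction, seed=0):
    """Return (train, test) index lists from one random, non-stratified split."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


# Hypothetical, imbalanced activity labels (assumption for illustration only).
labels = ["walk"] * 80 + ["cycle"] * 15 + ["run"] * 5

folds = stratified_folds(labels, k=5)
for number, fold in enumerate(folds):
    print("fold", number, dict(Counter(labels[i] for i in fold)))

train, test = holdout_split(len(labels), test_fraction=0.2)
print("holdout test set", dict(Counter(labels[i] for i in test)))
```

    With rare classes, the holdout test set may contain very few or no samples of a class, which is one plausible reason the two schemes could favor different hyper-parameters.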

    Pattern Recognition

    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications, and others. The signals processed are commonly one-, two-, or three-dimensional; the processing is done in real time or takes hours and days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms, and comprehends several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. The authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.

    Intelligent Data Mining using Kernel Functions and Information Criteria

    Radial Basis Function (RBF) Neural Networks and Support Vector Machines (SVM) are two powerful kernel-related intelligent data mining techniques. The current major problems with these methods are over-fitting and the existence of too many free parameters. The way the parameters are selected can directly affect the generalization performance (test error) of these models. Current practice in choosing the model parameters is an art rather than a science in this research area. Often, some parameters are predetermined or randomly chosen. Other parameters are selected through repeated experiments that are time-consuming, costly, and computationally very intensive. In this dissertation, we provide a two-stage analytical hybrid-training algorithm by building a bridge among the regression tree, the EM algorithm, and Radial Basis Function Neural Networks. The Information Complexity (ICOMP) criterion of Bozdogan, along with other information-based criteria, is introduced and applied to control the model complexity and to decide the optimal number of kernel functions. In the first stage of the hybrid, the regression tree and the EM algorithm are used to determine the kernel function parameters. In the second stage of the hybrid, the weights (coefficients) are calculated and information criteria are scored. Kernel Principal Component Analysis (KPCA) using the EM algorithm for feature selection and data preprocessing is also introduced and studied. Adaptive Support Vector Machines (ASVM) and some efficient algorithms are given to deal with massive data sets in support vector classification. The versatility and efficiency of the newly proposed approaches are studied on real data sets and via Monte Carlo simulation experiments.

    BagStack Classification for Data Imbalance Problems with Application to Defect Detection and Labeling in Semiconductor Units

    Despite the fact that machine learning supports the development of computer vision applications by shortening the development cycle, finding a general learning algorithm that solves a wide range of applications is still bounded by the "no free lunch" theorem. The search for the right algorithm to solve a specific problem is driven by the problem itself, the data availability, and many other requirements. Automated visual inspection (AVI) systems represent a major part of these challenging computer vision applications. They are gaining growing interest in the manufacturing industry to detect defective products and keep these from reaching customers. The process of defect detection and classification in semiconductor units is challenging due to the different acceptable variations that the manufacturing process introduces. Other variations are also typically introduced when using optical inspection systems due to changes in lighting conditions and misalignment of the imaged units, which makes the defect detection process more challenging. In this thesis, a BagStack classification framework is proposed, which makes use of stacking and bagging concepts to handle both variance and bias errors. The classifier is designed to handle the data imbalance and overfitting problems by adaptively transforming the multi-class classification problem into multiple binary classification problems, applying a bagging approach to train a set of base learners for each specific problem, adaptively specifying the number of base learners assigned to each problem, adaptively specifying the number of samples to use from each class, applying a novel data-imbalance-aware cross-validation technique to generate the meta-data while taking into account the data imbalance problem at the meta-data level and, finally, using a multi-response random forest regression classifier as a meta-classifier. The BagStack classifier makes use of multiple features to solve the defect classification problem. In order to detect defects, locally adaptive statistical background modeling is proposed. The proposed BagStack classifier outperforms state-of-the-art image classification techniques on our dataset in terms of overall classification accuracy and average per-class classification accuracy. The proposed detection method achieves high performance on the considered dataset in terms of recall and precision. Dissertation/Thesis; Doctoral Dissertation; Computer Engineering 201

    Expanding the theoretical framework of reservoir computing

    A Multi-Tier Distributed fog-based Architecture for Early Prediction of Epileptic Seizures

    Epilepsy is the fourth most common neurological problem. With 50 million people living with epilepsy worldwide, about one in 26 people will continue experiencing recurring seizures during their lifetime. Epileptic seizures are characterized by uncontrollable movements and can cause loss of awareness. Despite the optimal use of antiepileptic medications, seizures are still difficult to control due to their sudden and unpredictable nature. Such seizures can put the lives of patients and others at risk. For example, seizures that occur while patients are driving could affect their ability to control a vehicle and could result in injuries to the patients as well as others. Notifying patients before the onset of seizures can enable them to avoid risks and minimize accidents, and thus save lives. Early and accurate prediction of seizures can play a significant role in improving patients' quality of life and in helping doctors administer medications by providing a historical overview of a patient's condition over time. The individual variability and the dynamic disparity in differentiating between the pre-ictal phase (a period before the onset of the seizure) and other seizure phases make the early prediction of seizures a challenging task. Although several research projects have focused on developing a reliable seizure prediction model, numerous challenges still exist and need to be addressed. Most of the existing approaches are not suitable for real-time settings, which require bio-signal collection and analysis in real time. Various methods were developed based on the analysis of EEG signals without considering the notification latency and the computational cost of monitoring multiple patients. Few approaches were designed based on the analysis of ECG signals, although ECG signals can be collected using consumer wearable devices and are suitable for lightweight real-time analysis. Moreover, existing prediction methods were developed based on the analysis of the seizure state and ignored the pre-ictal state, whose analysis is essential for predicting seizures at an early stage. Therefore, there is a crucial need to design a novel computing model for early prediction of epileptic seizures. This model would greatly assist in improving patients' quality of life. This work proposes a multi-tier architecture for early prediction of seizures based on the analysis of two vital signs, namely, Electrocardiography (ECG) and Electroencephalogram (EEG) signals. The proposed architecture comprises three tiers: (1) sensing at the first tier, (2) lightweight analysis based on ECG signals at the second tier, and (3) deep analysis based on EEG signals at the third tier. The proposed architecture is developed to leverage the potential of fog computing technology at the second tier for real-time signal analytics and ubiquitous response. It can enable the early prediction of epileptic seizures, reduce the notification latency, and minimize the energy consumption of real-time data transmissions. Moreover, it is designed to allow for both lightweight and extensive analytics, and thus to make accurate and reliable decisions. The proposed lightweight model is formulated using the analysis of ECG signals to detect the pre-ictal state. It utilizes the Least Squares Support Vector Machines (LS-SVM) classifier, while the proposed extensive analytics model analyzes EEG signals and utilizes a Deep Belief Network (DBN) to provide an accurate classification of the patient's state. The performance of the proposed architecture is evaluated in terms of latency minimization and energy consumption in comparison with the cloud. Moreover, the performance of the proposed prediction models is evaluated using three datasets. Various performance metrics were used to investigate the prediction model performance, including accuracy, sensitivity, specificity, and F1-measure. The results illustrate the merits of the proposed architecture and show significant improvement in the early prediction of seizures in terms of accuracy, sensitivity, and specificity.
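    The four metrics listed in this abstract have standard definitions in terms of binary confusion-matrix counts. The sketch below computes them, assuming the pre-ictal state is treated as the positive class; the function name `binary_metrics` and the example counts are hypothetical and do not come from the work described.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity and F1 from confusion counts.

    tp/fn: positive (e.g. pre-ictal) windows that were detected / missed.
    tn/fp: negative windows that were correctly rejected / falsely flagged.
    """
    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a class is absent.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Reporting sensitivity and specificity alongside accuracy matters when one state is much rarer than the other, since accuracy alone can look high for a model that rarely flags the rare state.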