3 research outputs found

    Integration of feature subset selection methods for sentiment analysis

    Finding an optimal feature subset in a real-world domain is one of the main challenges in sentiment analysis. The complexity of optimal feature subset selection grows exponentially with the number of features, which makes analysing and organizing data in high-dimensional spaces difficult. To overcome this problem, this study attempted to enhance feature subset selection in high-dimensional data by removing irrelevant and redundant features using filter and wrapper approaches. First, a filter method based on the dispersion of samples in feature space, known as the mutual standard deviation method, was developed to minimize intra-class and maximize inter-class distances. Filter-based methods scale easily to high-dimensional datasets and are computationally simple and fast. They also depend only on the feature selection space and ignore the hypothesis model space. Hence, the next step of this study developed a new feature ranking approach by integrating various filter methods. Ordinal-based and frequency-based integrations of different filter methods were developed. Finally, a search strategy based on hybrid harmony search was developed and used to enhance feature subset selection and to overcome the problem of ignoring the dependency of feature selection on the classifier. A search strategy on feature space that integrates filter and wrapper approaches was therefore introduced to find a semantic relationship among the model selections and subsets of the search features. Comparative experiments were performed on five sentiment datasets, namely the movie, music, book, electronics, and kitchen review datasets. A sizeable performance improvement was noted: the proposed integration-based feature subset selection method yielded 98.32% accuracy in sentiment classification using POS-based features on movie reviews. Finally, a statistical test based on accuracy showed significant differences between the proposed methods and the baseline methods in almost all comparisons under k-fold cross-validation. The findings show that the mutual standard deviation and integration-based feature subset selection methods outperformed the baseline methods in terms of accuracy.
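The abstract does not define the ordinal-based and frequency-based integrations, so the following is only a minimal sketch of one plausible reading: ordinal integration orders features by their mean rank position across filters, and frequency integration counts how often a feature appears in the top k of each filter. The function names, the three filter rankings, and the feature words are all hypothetical and not taken from the study.

```python
from collections import Counter


def ordinal_integration(rankings):
    """Order features by mean position across rankings (lower is better).

    Assumes every ranking contains the same set of features.
    """
    features = rankings[0]
    mean_pos = {
        f: sum(r.index(f) for r in rankings) / len(rankings) for f in features
    }
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(features, key=lambda f: (mean_pos[f], f))


def frequency_integration(rankings, top_k):
    """Order features by how many rankings place them in the top_k."""
    votes = Counter(f for r in rankings for f in r[:top_k])
    return sorted(votes, key=lambda f: (-votes[f], f))


# Hypothetical output of three filter methods, best feature first.
filter_a = ["good", "bad", "plot", "actor", "the"]
filter_b = ["bad", "good", "actor", "plot", "the"]
filter_c = ["good", "plot", "bad", "the", "actor"]
rankings = [filter_a, filter_b, filter_c]

ordinal = ordinal_integration(rankings)
frequent = frequency_integration(rankings, top_k=2)
print(ordinal)
print(frequent)
```

In this reading, the ordinal variant keeps every feature and only reorders them, while the frequency variant also discards any feature that never reaches a top-k list.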

    Application of variational mode decomposition in vibration analysis of machine components

    Monitoring and diagnosis of machinery in maintenance are often undertaken using vibration analysis. The machine vibration signal is invariably complex and diverse, and thus useful information and features are difficult to extract. Variational mode decomposition (VMD) is a recent signal processing method that is able to extract some important features from machine vibration signals. The performance of the VMD method depends on the selection of its input parameters, especially the mode number and the balancing parameter (also known as the quadratic penalty term). However, the current VMD method still relies on manual effort to select the input parameters, which is subject to the interpretation of experienced experts. Hence, machine diagnosis becomes time consuming and prone to error. The aim of this research was to propose an automated method for selecting the VMD input parameters. The proposed method consisted of two selection stages: the first stage selected the initial mode number, and the second stage selected the optimized mode number and balancing parameter. A new machine diagnosis approach was developed, named VMD Differential Evolution Algorithm (VMDEA)-Extreme Learning Machine (ELM). Vibration signal datasets were then reconstructed using VMDEA, and multi-domain features consisting of time-domain, frequency-domain and multi-scale fuzzy entropy features were extracted. The VMDEA method reduced computational time by about 14% to 53% compared with the VMD-Genetic Algorithm (GA), VMD-Particle Swarm Optimization (PSO) and VMD-Differential Evolution (DE) approaches for bearing, shaft and gear. It also exhibited better convergence, with about two to nine fewer iterations than VMD-GA, VMD-PSO and VMD-DE for bearing, shaft and gear. The VMDEA-ELM achieved classification accuracy about 11% to 20% higher than Empirical Mode Decomposition (EMD)-ELM, Ensemble EMD (EEMD)-ELM and Complementary EEMD (CEEMD)-ELM for bearing, shaft and gear. The bearing datasets from Case Western Reserve University were tested with the VMDEA-ELM model and compared with the Support Vector Machine (SVM)-Dempster-Shafer (DS), EEMD Optimal Mode Multi-scale Fuzzy Entropy Fault Diagnosis (EOMSMFD), Wavelet Packet Transform (WPT)-Local Characteristic-scale Decomposition (LCD)-ELM, and Arctangent S-shaped PSO least square support vector machine (ATSWPLM) models in terms of classification accuracy. The VMDEA-ELM model demonstrated better diagnosis accuracy, by a small margin of 2% to 4%, than EOMSMFD and WPT-LCD-ELM, but lower diagnosis accuracy, by 4% to 5%, than SVM-DS and ATSWPLM. The VMDEA-ELM approach also provided faster classification, about 6 to 40 times faster than Back Propagation Neural Network (BPNN) and Support Vector Machine (SVM). This study provides an improved solution for determining optimized VMD parameters using VMDEA. It also demonstrates a more accurate and effective diagnostic approach for machine maintenance using VMDEA-ELM.
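The abstract gives no implementation details for VMDEA, so the sketch below is not the authors' method. It is a plain differential evolution loop (the common DE/rand/1/bin scheme) searching over two parameters that stand in for the VMD mode number and balancing parameter. The fitness function, its optimum, the bounds, and all names are assumptions made for illustration; a real fitness would score the quality of an actual VMD decomposition, and the mode number would need rounding to an integer.

```python
import random


def differential_evolution(fitness, bounds, pop_size=20, generations=50,
                           f_scale=0.5, cr=0.9, seed=0):
    """Minimize `fitness` inside box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, none of them the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    v = pop[a][d] + f_scale * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)  # clip back into the search box
                else:
                    v = pop[i][d]
                trial.append(v)
            s = fitness(trial)
            if s <= scores[i]:  # greedy selection: keep the better vector
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_fitness(x):
    """Hypothetical stand-in for a VMD quality measure (arbitrary optimum)."""
    mode_number, balancing = x
    return (mode_number - 5.0) ** 2 + ((balancing - 2000.0) / 1000.0) ** 2


# Assumed search ranges for the mode number and the balancing parameter.
BOUNDS = [(2.0, 10.0), (100.0, 5000.0)]
best, score = differential_evolution(toy_fitness, BOUNDS)
print(best, score)
```

Because selection is greedy, the best score can never get worse from one generation to the next, which is why iteration counts are a natural way to compare convergence across optimizers.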