
    Effective Features and Machine Learning Methods for Document Classification

    Document classification is used in a variety of applications, such as phishing and fraud detection, news categorisation, and information retrieval. This thesis aims to provide novel solutions to several important problems in document classification. First, an improved Principal Components Analysis (PCA), based on similarity and correlation criteria instead of covariance, is proposed, which aims to capture a low-dimensional feature subset that improves performance in text classification. The experimental results demonstrate the advantages of the proposed method for text classification in high-dimensional feature space, in terms of the number of features required to achieve the best classification accuracy. Second, two hybrid feature-subset selection methods are proposed. Each combines (via either union or intersection) the results of filter approaches, supervised in one method and unsupervised in the other, before applying a wrapper. This leads to a low-dimensional feature subset that achieves both high classification accuracy and good interpretability, and requires less processing time than most current methods. The experimental results demonstrate the effectiveness of the proposed methods for feature-subset selection in high-dimensional feature space, in terms of the number of selected features and the processing time needed to achieve the best classification accuracy. Third, a class-specific (supervised) pre-training approach based on a sparse autoencoder is proposed for acquiring an interesting low-dimensional structure of relevant features, which can be used for high-performance document classification. The experimental results demonstrate the merit of this method for document classification in high-dimensional feature space, in terms of the limited number of features required to achieve good classification accuracy.
Finally, deep classifier structures associated with a stacked autoencoder (SAE) for higher-level feature extraction are investigated, aiming to overcome the difficulties experienced in training deep neural networks with limited training data in high-dimensional feature space, such as overfitting and vanishing/exploding gradients. This investigation has resulted in a three-stage learning algorithm for training deep neural networks. The experimental results show the advantages and effectiveness of the proposed three-stage learning algorithm in comparison with support vector machines (SVMs) combined with SAE and with a Deep Multilayer Perceptron (DMLP) with random weight initialisation.
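
The union/intersection combination of filter results mentioned in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names `top_k` and `combine_filters`, the two score tables, and the choice of `k` are all hypothetical, and the wrapper step that would follow is omitted.

```python
def top_k(scores, k):
    """Return the set of the k highest-scoring feature names."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def combine_filters(scores_a, scores_b, k, mode="union"):
    """Combine the top-k selections of two filters by union or intersection."""
    a, b = top_k(scores_a, k), top_k(scores_b, k)
    if mode == "union":
        return a | b
    if mode == "intersection":
        return a & b
    raise ValueError("mode must be 'union' or 'intersection'")


# Hypothetical filter scores for five terms (made-up numbers, illustration only).
filter_a = {"bank": 0.9, "login": 0.8, "the": 0.1, "verify": 0.7, "news": 0.2}
filter_b = {"bank": 0.6, "login": 0.3, "the": 0.05, "verify": 0.8, "news": 0.7}

# Candidate subsets that a wrapper (not shown) would then search over.
candidates_union = combine_filters(filter_a, filter_b, k=3, mode="union")
candidates_inter = combine_filters(filter_a, filter_b, k=3, mode="intersection")
```

In this sketch, union keeps any feature that either filter ranks highly (a larger candidate set), while intersection keeps only features that both filters agree on (a smaller one); either set would then be passed to a wrapper for the final selection.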

    Feature Papers of Drones - Volume II

    The present book is divided into two volumes (Volume I: articles 1–23, and Volume II: articles 24–54), which compile the articles and communications submitted to the Topical Collection "Feature Papers of Drones" during the years 2020 to 2022, describing novel, cutting-edge designs, developments, and/or applications of unmanned vehicles (drones). Articles 24–41 focus on drone applications, with emphasis on two types. First, articles 24–35 cover agriculture and forestry, where the number of drone applications exceeds that of any other field. These articles review the latest research and future directions for precision agriculture, vegetation monitoring, change monitoring, forestry management, and forest fires. Second, articles 36–41 address water and marine applications of drones for ecological and conservation purposes, with emphasis on the monitoring of water resources and habitats. Finally, articles 42–54 look at just a few of the huge variety of potential applications of civil drones from different points of view, including the following: the social acceptance of drone operations in urban areas and the factors that influence it; 3D reconstruction applications; sensor technologies that either improve the performance of existing applications or open up new working areas; and machine and deep learning development.