4 research outputs found

    Automatic Complaint Classification System Using Classifier Ensembles

    Sambat Online is an online complaint system run by the city government of Malang, Indonesia. Because most citizens do not know which work units (Satuan Kerja Pemerintah Daerah [SKPDs]) should receive their complaints, the system administrator must manually sort and classify every incoming complaint by the appropriate SKPD. This study empirically evaluated an automated system to replace the manual classification process. The experiments, which used Sambat Online data, involved five individual classification algorithms (Naïve Bayes, Maximum Entropy, K-Nearest Neighbors, Random Forest, and Support Vector Machines) and two ensemble strategies (hard voting and soft voting). Of the five individual classifiers, Multinomial Naïve Bayes performed best, with an accuracy of 80.7%. The results also indicate that the ensemble methods generally performed better than the individual classifiers; almost all of them reached the same accuracy of 81.2%. In addition, soft voting was slightly more accurate than hard voting when all five classifiers were used. However, when the three best classifier combinations were used, both strategies had the same accuracy.
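
    The difference between the two ensemble strategies named in this abstract can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the class labels ("transport", "health"), and the probability values are hypothetical, and soft voting is assumed here to be an unweighted average of per-class probabilities.

```python
from collections import Counter


def hard_vote(label_predictions):
    """Return the label predicted by the largest number of classifiers."""
    counts = Counter(label_predictions)
    # On a tie, most_common keeps first-seen order, so the earliest label wins.
    return counts.most_common(1)[0][0]


def soft_vote(probability_predictions):
    """Average per-class probabilities across classifiers; return the top class."""
    totals = {}
    for probs in probability_predictions:
        for label, p in probs.items():
            totals[label] = totals.get(label, 0.0) + p
    n = len(probability_predictions)
    averaged = {label: total / n for label, total in totals.items()}
    return max(averaged, key=averaged.get)


# Hypothetical class probabilities from three classifiers for one complaint.
probabilities = [
    {"transport": 0.90, "health": 0.10},
    {"transport": 0.40, "health": 0.60},
    {"transport": 0.45, "health": 0.55},
]

# Each classifier's hard prediction is its most probable class.
labels = [max(p, key=p.get) for p in probabilities]

hard_result = hard_vote(labels)          # two of three classifiers say "health"
soft_result = soft_vote(probabilities)   # averaged probability favours "transport"
```

    In this made-up case the two strategies disagree because one classifier is very confident while the other two are only marginally in favour of the other class, which is the kind of situation where soft and hard voting can yield different accuracies.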

    Effective Features and Machine Learning Methods for Document Classification

    Document classification has been involved in a variety of applications, such as phishing and fraud detection, news categorisation, and information retrieval. This thesis aims to provide novel solutions to several important problems presented by document classification. First, an improved Principal Components Analysis (PCA), based on similarity and correlation criteria instead of covariance, is proposed, which aims to capture a low-dimensional feature subset that facilitates improved performance in text classification. The experimental results have demonstrated the advantages and usefulness of the proposed method for text classification in high-dimensional feature space in terms of the number of features required to achieve the best classification accuracy. Second, two hybrid feature-subset selection methods are proposed based on the combination (via either union or intersection) of the results of both supervised (in one method) and unsupervised (in the other method) filter approaches prior to the use of a wrapper, leading to a low-dimensional feature subset that achieves both high classification accuracy and good interpretability and requires less processing time than most current methods. The experimental results have demonstrated the effectiveness of the proposed methods for feature subset selection in high-dimensional feature space in terms of the number of selected features and the processing time spent to achieve the best classification accuracy. Third, a class-specific (supervised) pre-trained approach based on a sparse autoencoder is proposed for acquiring a low-dimensional, interesting structure of relevant features, which can be used for high-performance document classification. The experimental results have demonstrated the merit of this proposed method for document classification in high-dimensional feature space, in terms of the limited number of features required to achieve good classification accuracy.
    Finally, deep classifier structures associated with a stacked autoencoder (SAE) for higher-level feature extraction are investigated, aiming to overcome the difficulties experienced in training deep neural networks with limited training data in high-dimensional feature space, such as overfitting and vanishing/exploding gradients. This investigation has resulted in a three-stage learning algorithm for training deep neural networks. In comparison with support vector machines (SVMs) combined with SAE and Deep Multilayer Perceptron (DMLP) with random weight initialisation, the experimental results have shown the advantages and effectiveness of the proposed three-stage learning algorithm.
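
    The union/intersection step of the hybrid feature-subset selection described in this abstract can be sketched roughly as follows. This is an illustration under assumptions, not the thesis's method: the abstract does not name the filter criteria or the wrapper, so the scores, feature names, the top-k cut-off, and the helper names `top_k` and `combine_filters` are all hypothetical, and the wrapper stage is omitted.

```python
def top_k(scores, k):
    """Return the set of the k highest-scoring feature names."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def combine_filters(scores_a, scores_b, k, mode="union"):
    """Combine the top-k features of two filter rankings by union or intersection."""
    selected_a = top_k(scores_a, k)
    selected_b = top_k(scores_b, k)
    if mode == "union":
        return selected_a | selected_b
    if mode == "intersection":
        return selected_a & selected_b
    raise ValueError("mode must be 'union' or 'intersection'")


# Hypothetical relevance scores assigned to four terms by two filter criteria.
filter_a = {"refund": 0.90, "road": 0.70, "the": 0.10, "clinic": 0.60}
filter_b = {"refund": 0.80, "road": 0.20, "the": 0.05, "clinic": 0.75}

union_set = combine_filters(filter_a, filter_b, k=2, mode="union")
inter_set = combine_filters(filter_a, filter_b, k=2, mode="intersection")
```

    The union keeps any feature that either filter ranks highly, giving a larger candidate set, while the intersection keeps only features both filters agree on; in the abstract's pipeline the resulting candidates would then be passed to a wrapper for final selection.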

    pHealth 2021. Proc. of the 18th Internat. Conf. on Wearable Micro and Nano Technologies for Personalised Health, 8-10 November 2021, Genoa, Italy

    Smart mobile systems (microsystems, smart textiles, smart implants, sensor-controlled medical devices), together with related body, local and wide-area networks up to cloud services, have become important enablers for telemedicine and the next generation of healthcare services. The multilateral benefits of pHealth technologies offer enormous potential for all stakeholder communities, not only in terms of improvements in medical quality and industrial competitiveness, but also for the management of healthcare costs and, last but not least, the improvement of patient experience. This book presents the proceedings of pHealth 2021, the 18th in a series of conferences on wearable micro and nano technologies for personalized health with personal health management systems, hosted by the University of Genoa, Italy, and held as an online event from 8-10 November 2021. The conference focused on digital health ecosystems in the transformation of healthcare towards personalized, participative, preventive, predictive precision medicine (5P medicine). The book contains 46 peer-reviewed papers (1 keynote, 5 invited papers, 33 full papers, and 7 poster papers). Subjects covered include the deployment of mobile technologies, micro-nano-bio smart systems, bio-data management and analytics, autonomous and intelligent systems, the Health Internet of Things (HIoT), as well as potential risks for security and privacy, and the motivation and empowerment of patients in care processes. Providing an overview of current advances in personalized health and health management, the book will be of interest to all those working in the field of healthcare today.