
    Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

    With the explosive growth in the amount of available information, tools and methods for searching, filtering and managing resources are indispensable. One of the major problems in text classification is the high dimensionality of the feature space, so reducing that dimensionality is a central step. Many feature selection methods exist, but only a few scale to large text classification problems. In this paper, we propose a new wrapper method based on the Particle Swarm Optimization (PSO) algorithm and a Support Vector Machine (SVM), and combine it with Learning Automata to make it more efficient: the automata's reward and penalty mechanism guides the selection of better features. To evaluate the efficiency of the proposed method, we compare it with a Genetic Algorithm-based feature selection method on the Reuters-21578 dataset. The simulation results show that our proposed algorithm works more efficiently.
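    As a rough illustration of the kind of PSO-plus-SVM wrapper the abstract describes, the sketch below searches a binary feature mask with a plain particle swarm and scores each candidate subset by the cross-validated accuracy of a linear SVM (scikit-learn assumed). The learning-automata reward/penalty refinement is not reproduced here, and all parameter values are illustrative.

```python
# Hypothetical sketch of a binary-PSO wrapper for feature selection.
# Fitness = 3-fold cross-validated accuracy of a linear SVM on the subset.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def pso_feature_selection(X, y, n_particles=20, n_iters=30, seed=0):
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]

    def fitness(mask):
        # Score a candidate subset; an empty subset gets the worst score.
        if not mask.any():
            return 0.0
        clf = LinearSVC(max_iter=5000)
        return cross_val_score(clf, X[:, mask], y, cv=3).mean()

    pos = rng.random((n_particles, n_feat))   # positions in [0, 1]; > 0.5 means "keep feature"
    vel = np.zeros_like(pos)
    scores = np.array([fitness(p > 0.5) for p in pos])
    pbest, pbest_score = pos.copy(), scores.copy()
    gbest, gbest_score = pos[scores.argmax()].copy(), scores.max()

    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, n_feat))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)
        scores = np.array([fitness(p > 0.5) for p in pos])
        better = scores > pbest_score
        pbest[better], pbest_score[better] = pos[better], scores[better]
        if scores.max() > gbest_score:
            gbest, gbest_score = pos[scores.argmax()].copy(), scores.max()

    return gbest > 0.5   # boolean mask of the selected features
```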

    A Survey on Feature Selection Algorithms

    A major component of machine learning is feature analysis, which comprises two main processes: feature selection and feature extraction. Owing to its applications in several areas, including data mining, soft computing and big data analysis, feature selection has gained considerable importance. This paper presents an introduction to feature selection and its main approaches, as illustrated in the sketch that follows. The paper surveys the historical development of feature selection with supervised and unsupervised methods, and summarizes recent developments and the state of the art in current feature selection algorithms, including their hybridizations. DOI: 10.17762/ijritcc2321-8169.16043
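    For readers new to the terminology, the snippet below illustrates the distinction the survey draws between the two processes, using scikit-learn: a mutual-information filter keeps a subset of the original features, while PCA derives new ones. The specific methods and dataset are arbitrary examples, not the survey's recommendations.

```python
# Feature selection keeps a subset of the original columns,
# feature extraction derives new ones; both reduce dimensionality.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)

# Selection: rank original features by mutual information, keep the top 10.
X_selected = SelectKBest(mutual_info_classif, k=10).fit_transform(X, y)

# Extraction: project onto 10 principal components (new, derived features).
X_extracted = PCA(n_components=10).fit_transform(X)

print(X_selected.shape, X_extracted.shape)   # (569, 10) (569, 10)
```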

    An Innovative Approach for Attribute Reduction Using Rough Sets and Flower Pollination Optimisation

    Optimal search is a major challenge for wrapper-based attribute reduction. Rough sets have been used with much success, but current hill-climbing rough set approaches to attribute reduction are insufficient for finding optimal solutions. In this paper, we propose an innovative use of an intelligent optimisation method, the flower search algorithm (FSA), together with rough sets for attribute reduction. FSA is a relatively recent computational intelligence algorithm inspired by the pollination process of flowers. In many applications the attribute space, besides being very large, is also rugged, with many different local minima, which makes it difficult to converge towards an optimal solution. FSA can adaptively search the attribute space for optimal attribute combinations that maximise a given fitness function; the fitness function used in our work is rough set-based classification. Experimental results on various benchmark datasets from the UCI repository confirm that our technique performs well in comparison with competing methods.
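    A minimal sketch of flower-pollination-style search over binary attribute masks is given below, assuming scikit-learn and NumPy. The paper's fitness is rough set-based classification; here a decision tree's cross-validated accuracy stands in for it, and the switch probability, Lévy-flight exponent and subset penalty are illustrative choices only.

```python
# Rough sketch of flower-pollination-style attribute reduction.
import numpy as np
from math import gamma, pi, sin
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def levy(size, rng, beta=1.5):
    # Levy-flight step sizes (Mantegna's algorithm), used for global pollination.
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.normal(0, sigma, size) / np.abs(rng.normal(0, 1, size)) ** (1 / beta)

def fpa_attribute_reduction(X, y, n_flowers=15, n_iters=40, p_switch=0.8, seed=0):
    rng = np.random.default_rng(seed)
    n_attr = X.shape[1]

    def fitness(mask):
        # Stand-in for the rough-set dependency measure: CV accuracy of a tree,
        # with a small reward for using fewer attributes.
        if not mask.any():
            return 0.0
        clf = DecisionTreeClassifier(random_state=0)
        acc = cross_val_score(clf, X[:, mask], y, cv=3).mean()
        return acc - 0.01 * mask.sum() / n_attr

    pop = rng.random((n_flowers, n_attr))   # continuous encoding; > 0.5 keeps an attribute
    scores = np.array([fitness(f > 0.5) for f in pop])
    best, best_score = pop[scores.argmax()].copy(), scores.max()

    for _ in range(n_iters):
        for i in range(n_flowers):
            if rng.random() < p_switch:     # global pollination: Levy flight toward the best flower
                cand = pop[i] + levy(n_attr, rng) * (best - pop[i])
            else:                           # local pollination: mix two random flowers
                j, k = rng.integers(0, n_flowers, 2)
                cand = pop[i] + rng.random() * (pop[j] - pop[k])
            cand = np.clip(cand, 0.0, 1.0)
            s = fitness(cand > 0.5)
            if s > scores[i]:
                pop[i], scores[i] = cand, s
                if s > best_score:
                    best, best_score = cand.copy(), s

    return best > 0.5                       # reduced attribute set as a boolean mask
```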

    Sentiment Polarity Classification of Comments on Korean News Articles Using Feature Reweighting

    Comments on Internet news articles generally contain subjective feelings or opinions about the article, so the body of the article has an important influence on recognizing and classifying the sentiment of its comments. Motivated by this observation, this thesis proposes a feature-reweighting method that uses the article body and a sentiment lexicon, and, based on this reweighting, a binary sentiment-polarity classification method for comments on Korean news articles. The reweighting method uses several feature sets: the sentiment words contained in a comment, features related to the sentiment lexicon and the body of the news article, and the category information of the news article. The sentiment lexicon is a Korean one; since no public Korean sentiment lexicon was available, it was built from an existing English sentiment lexicon. The proposed binary sentiment classification uses machine learning, which requires a training corpus; sentiment classification in particular requires a corpus annotated with positive or negative tags. Because no public Korean sentiment corpus was available either, such a corpus was also built by hand. The machine learning methods used are Naïve Bayes, k-NN and SVM, and the feature selection methods are Document Frequency, the χ² statistic and Information Gain. The results confirm that the sentiment words contained in a comment and the body of the article the comment refers to are highly effective features for sentiment classification.
    Table of contents: Chapter 1, Introduction; Chapter 2, Related Works (Sentiment Classification; Feature Weighting in Vector Space Model; Feature Extraction and Selection; Classifiers; Accuracy Measures); Chapter 3, Feature Reweighting (Feature Extraction in Korean; Feature Reweighting Methods; Examples of Feature Reweighting Methods); Chapter 4, Sentiment Polarity Classification System (Model Generation; Sentiment Polarity Classification); Chapter 5, Data Preparation (Korean Sentiment Corpus; Korean Sentiment Lexicon); Chapter 6, Experiments (Experimental Environment; Experimental Results); Chapter 7, Conclusions and Future Works; Bibliography; Acknowledgments.
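    A hedged sketch of the reweighting step described above: after computing TF-IDF weights for a comment, terms found in the sentiment lexicon or in the body of the corresponding article get their weights boosted before classification. The boost factors, the toy data and the choice of Naïve Bayes here are assumptions for illustration, not the thesis's exact settings.

```python
# Illustrative feature reweighting for comment sentiment classification.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

def reweight(X_tfidf, vocab, sentiment_lexicon, article_terms,
             lex_boost=2.0, body_boost=1.5):
    # Build one multiplier per vocabulary term and scale the TF-IDF columns.
    mult = np.ones(len(vocab))
    for term, col in vocab.items():
        if term in sentiment_lexicon:
            mult[col] *= lex_boost      # term appears in the sentiment lexicon
        if term in article_terms:
            mult[col] *= body_boost     # term appears in the article body
    return X_tfidf.multiply(mult).tocsr()

# Toy usage with made-up data.
comments = ["great insightful article", "terrible biased reporting"]
labels = [1, 0]
lexicon = {"great", "terrible", "biased", "insightful"}
article_terms = {"reporting", "article"}

vec = TfidfVectorizer()
X = vec.fit_transform(comments)
X_rw = reweight(X, vec.vocabulary_, lexicon, article_terms)
clf = MultinomialNB().fit(X_rw, labels)
```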

    Computational Optimizations for Machine Learning

    The present book contains the 10 articles finally accepted for publication in the Special Issue "Computational Optimizations for Machine Learning" of the MDPI journal Mathematics, which cover a wide range of topics connected to the theory and applications of machine learning, neural networks and artificial intelligence. These topics include, among others, various classes of machine learning, such as supervised, unsupervised and reinforcement learning, as well as deep neural networks, convolutional neural networks, GANs, decision trees, linear regression, SVM, k-means clustering, Q-learning, temporal difference learning, deep adversarial networks and more. It is hoped that the book will be interesting and useful to those developing mathematical algorithms and applications in artificial intelligence and machine learning, as well as to readers with the appropriate mathematical background who wish to become familiar with recent advances in the computational optimization mathematics of machine learning, which has by now permeated almost all sectors of human life and activity.