146 research outputs found
Techniques for text classification: Literature review and current trends
Automated classification of text into predefined categories has always been considered as a vital method to manage and process a vast amount of documents in digital forms that are widespread and continuously increasing. This kind of web information, popularly known as the digital/electronic information is in the form of documents, conference material, publications, journals, editorials, web pages, e-mail etc. People largely access information from these online sources rather than being limited to archaic paper sources like books, magazines, newspapers etc. But the main problem is that this enormous information lacks organization which makes it difficult to manage. Text classification is recognized as one of the key techniques used for organizing such kind of digital data. In this paper we have studied the existing work in the area of text classification which will allow us to have a fair evaluation of the progress made in this field till date. We have investigated the papers to the best of our knowledge and have tried to summarize all existing information in a comprehensive and succinct manner. The studies have been summarized in a tabular form according to the publication year considering numerous key perspectives. The main emphasis is laid on various steps involved in text classification process viz. document representation methods, feature selection methods, data mining methods and the evaluation technique used by each study to carry out the results on a particular dataset
Empowering One-vs-One Decomposition with Ensemble Learning for Multi-Class Imbalanced Data
Zhongliang Zhang was supported by the National Science Foundation of China (NSFC Proj. 61273204) and CSC Scholarship Program (CSC NO. 201406080059).
Bartosz Krawczyk was supported by the Polish National Science Center under the grant no. UMO-2015/19/B/ST6/01597.
Salvador Garcia and Francisco Herrera were partially supported by the Spanish Ministry of Education and Science under Project TIN2014-57251-P and the Andalusian Research Plan P10-TIC-6858, P11-TIC-7765.
Alejandro Rosales-Perez was supported by the CONACyT grant 329013.Multi-class imbalance classification problems occur in many real-world applications, which suffer from the quite different distribution of classes. Decomposition strategies are well-known techniques to address the classification problems involving multiple classes. Among them binary approaches using one-vs-one and one-vs-all has gained a significant attention from the research community. They allow to divide multi-class problems into several easier-to-solve two-class sub-problems. In this study we develop an exhaustive empirical analysis to explore the possibility of empowering the one-vs-one scheme for multi-class imbalance classification problems with applying binary ensemble learning approaches. We examine several state-of-the-art ensemble learning methods proposed for addressing the imbalance problems to solve the pairwise tasks derived from the multi-class data set. Then the aggregation strategy is employed to combine the binary ensemble outputs to reconstruct the original multi-class task. We present a detailed experimental study of the proposed approach, supported by the statistical analysis. The results indicate the high effectiveness of ensemble learning with one-vs-one scheme in dealing with the multi-class imbalance classification problems.National Natural Science Foundation of China (NSFC)
61273204CSC Scholarship Program (CSC)
201406080059Polish National Science Center
UMO-2015/19/B/ST6/01597Spanish Government
TIN2014-57251-PAndalusian Research Plan
P10-TIC-6858
P11-TIC-7765Consejo Nacional de Ciencia y Tecnologia (CONACyT)
32901
An empirical study on the various stock market prediction methods
Investment in the stock market is one of the much-admired investment actions. However, prediction of the stock market has remained a hard task because of the non-linearity exhibited. The non-linearity is due to multiple affecting factors such as global economy, political situations, sector performance, economic numbers, foreign institution investment, domestic institution investment, and so on. A proper set of such representative factors must be analyzed to make an efficient prediction model. Marginal improvement of prediction accuracy can be gainful for investors. This review provides a detailed analysis of research papers presenting stock market prediction techniques. These techniques are assessed in the time series analysis and sentiment analysis section. A detailed discussion on research gaps and issues is presented. The reviewed articles are analyzed based on the use of prediction techniques, optimization algorithms, feature selection methods, datasets, toolset, evaluation matrices, and input parameters. The techniques are further investigated to analyze relations of prediction methods with feature selection algorithm, datasets, feature selection methods, and input parameters. In addition, major problems raised in the present techniques are also discussed. This survey will provide researchers with deeper insight into various aspects of current stock market prediction methods
Role of Four-Chamber Heart Ultrasound Images in Automatic Assessment of Fetal Heart: A Systematic Understanding
The fetal echocardiogram is useful for monitoring and diagnosing cardiovascular diseases in the fetus in utero. Importantly, it can be used for assessing prenatal congenital heart disease, for which timely intervention can improve the unborn child's outcomes. In this regard, artificial intelligence (AI) can be used for the automatic analysis of fetal heart ultrasound images. This study reviews nondeep and deep learning approaches for assessing the fetal heart using standard four-chamber ultrasound images. The state-of-the-art techniques in the field are described and discussed. The compendium demonstrates the capability of automatic assessment of the fetal heart using AI technology. This work can serve as a resource for research in the field
Arabic Text Classification Using Learning Vector Quantization
Text classification aims to automatically assign document in predefined category. In our research, we used a model of neural network which is called Learning Vector Quantization (LVQ) for classifying Arabic text. This model has not been addressed before in this area. The model based on Kohonen self organizing map (SOM) that is able to organize vast document collections according to textual similarities. Also, from past experiences, the model requires less training examples and much faster than other classification methods. In this research we first selected Arabic documents from different domains. Then, we selected suitable pre-processing methods such as term weighting schemes, and Arabic morphological analysis (stemming and light stemming), to prepare the data set for achieving the classification by using the selected algorithm. After that, we compared the results obtained from different LVQ improvement version (LVQ2.1, LVQ3, OLVQ1 and OLVQ3). Finally, we compared our work with other most known classification algorithms; decision tree (DT), K Nearest Neighbors (KNN) and Naïve Bayes. The results presented that the LVQ's algorithms especially LVQ2.1 algorithm achieved high accuracy and less time rather than others classification algorithms and other neural networks algorithms
Forecasting Financial Distress With Machine Learning – A Review
Purpose – Evaluate the various academic researches with multiple views on credit risk and artificial intelligence (AI) and their evolution.Theoretical framework – The study is divided as follows: Section 1 introduces the article. Section 2 deals with credit risk and its relationship with computational models and techniques. Section 3 presents the methodology. Section 4 addresses a discussion of the results and challenges on the topic. Finally, section 5 presents the conclusions.Design/methodology/approach – A systematic review of the literature was carried out without defining the time period and using the Web of Science and Scopus database.Findings – The application of computational technology in the scope of credit risk analysis has drawn attention in a unique way. It was found that the demand for identification and introduction of new variables, classifiers and more assertive methods is constant. The effort to improve the interpretation of data and models is intense.Research, Practical & Social implications – It contributes to the verification of the theory, providing information in relation to the most used methods and techniques, it brings a wide analysis to deepen the knowledge of the factors and variables on the theme. It categorizes the lines of research and provides a summary of the literature, which serves as a reference, in addition to suggesting future research.Originality/value – Research in the area of Artificial Intelligence and Machine Learning is recent and requires attention and investigation, thus, this study contributes to the opening of new views in order to deepen the work on this topic
Recommended from our members
Machine Learning Stock Market Prediction Studies: Review and Research Directions
Stock market investment strategies are complex and rely on an evaluation of vast amounts of data. In recent years, machine learning techniques have increasingly been examined to assess whether they can improve market forecasting when compared with traditional approaches. The objective for this study is to identify directions for future machine learning stock market prediction research based upon a review of current literature. A systematic literature review methodology is used to identify relevant peer-reviewed journal articles from the past twenty years and categorize studies that have similar methods and contexts. Four categories emerge: artificial neural network studies, support vector machine studies, studies using genetic algorithms combined with other techniques, and studies using hybrid or other artificial intelligence approaches. Studies in each category are reviewed to identify common findings, unique findings, limitations, and areas that need further investigation. The final section provides overall conclusions and directions for future research
Recommended from our members
Application of Artificial Intelligence in predicting earthquakes: state-of-the-art and future challenges
Predicting the time, location and magnitude of an earthquake is a challenging job as an earthquake does not show specific patterns resulting in inaccurate predictions. Techniques based on Artificial Intelligence (AI) are well known for their capability to find hidden patterns in data. In the case of earthquake prediction, these models also produce a promising outcome. This work systematically explores the contributions made to date in earthquake prediction using AI-based techniques. A total of 84 scientific research papers, which reported the use of AI-based techniques in earthquake prediction, have been selected from different academic databases. These studies include a range of AI techniques including rule-based methods, shallow machine learning and deep learning algorithms. Covering all existing AI-based techniques in earthquake prediction, this paper provides an account of the available methodologies and a comparative analysis of their performances. The performance comparison has been reported from the perspective of used datasets and evaluation metrics. Furthermore, using comparative analysis of performances the paper aims to facilitate the selection of appropriate techniques for earthquake prediction. Towards the end, it outlines some open challenges and potential research directions in the field
- …