53 research outputs found

    Designing multiple classifier combinations a survey

    Get PDF
    Classification accuracy can be improved through multiple classifier approach. It has been proven that multiple classifier combinations can successfully obtain better classification accuracy than using a single classifier. There are two main problems in designing a multiple classifier combination which are determining the classifier ensemble and combiner construction. This paper reviews approaches in constructing the classifier ensemble and combiner. For each approach, methods have been reviewed and their advantages and disadvantages have been highlighted. A random strategy and majority voting are the most commonly used to construct the ensemble and combiner, respectively. The results presented in this review are expected to be a road map in designing multiple classifier combinations

    Applications of Mining Arabic Text: A Review

    Get PDF
    Since the appearance of text mining, the Arabic language gained some interest in applying several text mining tasks over a text written in the Arabic language. There are several challenges faced by the researchers. These tasks include Arabic text summarization, which is one of the challenging open areas for research in natural language processing (NLP) and text mining fields, Arabic text categorization, and Arabic sentiment analysis. This chapter reviews some of the past and current researches and trends in these areas and some future challenges that need to be tackled. It also presents some case studies for two of the reviewed approaches

    Water filtration by using apple and banana peels as activated carbon

    Get PDF
    Water filter is an important devices for reducing the contaminants in raw water. Activated from charcoal is used to absorb the contaminants. Fruit peels are some of the suitable alternative carbon to substitute the charcoal. Determining the role of fruit peels which were apple and banana peels powder as activated carbon in water filter is the main goal. Drying and blending the peels till they become powder is the way to allow them to absorb the contaminants. Comparing the results for raw water before and after filtering is the observation. After filtering the raw water, the reading for pH was 6.8 which is in normal pH and turbidity reading recorded was 658 NTU. As for the colour, the water becomes more clear compared to the raw water. This study has found that fruit peels such as banana and apple are an effective substitute to charcoal as natural absorbent

    Multiple proportion case-basing driven CBRE and its application in the evaluation of possible failure of firms

    Get PDF
    Case-based reasoning (CBR) is a unique tool for the evaluation of possible failure of firms (EOPFOF) for its eases of interpretation and implementation. Ensemble computing, a variation of group decision in society, provides a potential means of improving predictive performance of CBR-based EOPFOF. This research aims to integrate bagging and proportion case-basing with CBR to generate a method of proportion bagging CBR for EOPFOF. Diverse multiple case bases are first produced by multiple case-basing, in which a volume parameter is introduced to control the size of each case base. Then, the classic case retrieval algorithm is implemented to generate diverse member CBR predictors. Majority voting, the most frequently used mechanism in ensemble computing, is finally used to aggregate outputs of member CBR predictors in order to produce final prediction of the CBR ensemble. In an empirical experiment, we statistically validated the results of the CBR ensemble from multiple case bases by comparing them with those of multivariate discriminant analysis, logistic regression, classic CBR, the best member CBR predictor and bagging CBR ensemble. The results from Chinese EOPFOF prior to 3 years indicate that the new CBR ensemble, which significantly improved CBRs predictive ability, outperformed all the comparative methods

    Text Data Mining: Theory and Methods

    Full text link
    This paper provides the reader with a very brief introduction to some of the theory and methods of text data mining. The intent of this article is to introduce the reader to some of the current methodologies that are employed within this discipline area while at the same time making the reader aware of some of the interesting challenges that remain to be solved within the area. Finally, the articles serves as a very rudimentary tutorial on some of techniques while also providing the reader with a list of references for additional study.Comment: Published in at http://dx.doi.org/10.1214/07-SS016 the Statistics Surveys (http://www.i-journals.org/ss/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Rough set based ensemble classifier for web page classification

    Get PDF
    Combining the results of a number of individually trained classification systems to obtain a more accurate classifier is a widely used technique in pattern recognition. In this article, we have introduced a rough set based meta classifier to classify web pages. The proposed method consists of two parts. In the first part, the output of every individual classifier is considered for constructing a decision table. In the second part, rough set attribute reduction and rule generation processes are used on the decision table to construct a meta classifier. It has been shown that (1) the performance of the meta classifier is better than the performance of every constituent classifier and, (2) the meta classifier is optimal with respect to a quality measure defined in the article. Experimental studies show that the meta classifier improves accuracy of classification uniformly over some benchmark corpora and beats other ensemble approaches in accuracy by a decisive margin, thus demonstrating the theoretical results. Apart from this, it reduces the CPU load compared to other ensemble classification techniques by removing redundant classifiers from the combination

    Delineating Knowledge Domains in Scientific Domains in Scientific Literature using Machine Learning (ML)

    Get PDF
    The recent years have witnessed an upsurge in the number of published documents. Organizations are showing an increased interest in text classification for effective use of the information. Manual procedures for text classification can be fruitful for a handful of documents, but the same lack in credibility when the number of documents increases besides being laborious and time-consuming. Text mining techniques facilitate assigning text strings to categories rendering the process of classification fast, accurate, and hence reliable. This paper classifies chemistry documents using machine learning and statistical methods. The procedure of text classification has been described in chronological order like data preparation followed by processing, transformation, and application of classification techniques culminating in the validation of the results

    Arabic Text Mining

    Full text link
    The rapid growth of the internet has increased the number of online texts. This led to the rapid growth of the number of online texts in the Arabic language. The enormous amount of text must be organized into classes to make the analysis process and text retrieval easier. Text classification is, therefore, a key component of text mining. There are numerous systems and approaches for categorizing literature in English, European (French, German, Spanish), and Asian (Chinese, Japanese). In contrast, there are relatively few studies on categorizing Arabic literature due to the difficulty of the Arabic language. In this work, a brief explanation of key ideas relevant to Arabic text mining are introduced then a new classification system for the Arabic language is presented using light stemming and Classifier Na\"ive Bayesian (CNB). Texts from two classes: politics and sports, are included in our corpus. Some texts are added to the system, and the system correctly classified them, demonstrating the effectiveness of the system
    • …
    corecore