475 research outputs found

    Predicting Diabetes Onset: an Ensemble Supervised Learning Approach

    An exploratory study is presented to gauge the impact of feature selection on heterogeneous ensembles. The task is to predict diabetes onset using healthcare data obtained from the UC Irvine (UCI) database. Evidence suggests that accuracy and diversity are the two vital requirements for good ensembles. Therefore, the research presented in this paper exploits the diversity of heterogeneous base classifiers and the optimisation effect of feature subset selection in order to improve accuracy. Five widely used classifiers are employed for the ensembles and a meta-classifier is used to aggregate their outputs. The results are presented and compared with similar studies in the literature that used the same dataset. It is shown that, with the proposed method, diabetes onset can be predicted with higher accuracy.
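
    A minimal sketch of the aggregation idea in the abstract above: heterogeneous base classifiers each emit a prediction, and a meta-classifier learns how to map the pattern of base predictions to a final label. The three threshold rules, the feature names, the toy data, and the lookup-table meta-classifier are all hypothetical stand-ins chosen to keep the example dependency-free; they are not the five classifiers or the meta-classifier used in the paper.

```python
from collections import Counter, defaultdict

# Hypothetical base classifiers: each maps a feature dict to 0 or 1.
def glucose_rule(x):
    return 1 if x["glucose"] >= 140 else 0

def bmi_rule(x):
    return 1 if x["bmi"] >= 30 else 0

def age_rule(x):
    return 1 if x["age"] >= 50 else 0

BASE_CLASSIFIERS = [glucose_rule, bmi_rule, age_rule]

def base_outputs(x):
    """Collect the base predictions for one sample as a tuple."""
    return tuple(clf(x) for clf in BASE_CLASSIFIERS)

def fit_meta(samples, labels):
    """For each pattern of base outputs, remember the most frequent true label."""
    table = defaultdict(Counter)
    for x, y in zip(samples, labels):
        table[base_outputs(x)][y] += 1
    return {pattern: counts.most_common(1)[0][0] for pattern, counts in table.items()}

def predict(meta, x):
    """Look up the learned label; fall back to a majority vote for unseen patterns."""
    pattern = base_outputs(x)
    if pattern in meta:
        return meta[pattern]
    return Counter(pattern).most_common(1)[0][0]

# Made-up training data, for illustration only.
train_x = [
    {"glucose": 150, "bmi": 33, "age": 55},
    {"glucose": 150, "bmi": 25, "age": 30},
    {"glucose": 100, "bmi": 35, "age": 60},
    {"glucose": 100, "bmi": 22, "age": 25},
]
train_y = [1, 1, 0, 0]
meta = fit_meta(train_x, train_y)
```

    In this toy data the meta-classifier learns that the glucose rule is the reliable one, so it can overrule a two-to-one vote of the other rules. In practice the meta-classifier would normally be fitted on held-out base predictions rather than on the data the base classifiers were tuned on.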

    Proposing a new method of image classification based on the AdaBoost deep belief network hybrid method

    Image classification has many applications, and various algorithms have been proposed for it, each with its own strengths and weaknesses. Reducing the error rate is an issue on which much research has been carried out. This research addresses the problem with hybrid methods and deep learning. Hybrid methods were developed to improve on the results of single-component methods. A deep belief network (DBN), on the other hand, is a generative probabilistic model with multiple layers of latent variables and is used to solve problems with unlabeled data. It is an unsupervised method in which all layers are one-way directed layers except for the last layer. The goal of this research project was to use a combination of the AdaBoost method and the deep belief network method to classify images, and to obtain better results than previous work. A combination of the deep belief network and the AdaBoost method was used to boost learning, and the network's potential was enhanced by making the entire network recursive. The method was tested on the MNIST dataset, and the results indicated a decrease in the error rate with the proposed method compared to the AdaBoost and deep belief network methods.

    Designing multiple classifier combinations: a survey

    Classification accuracy can be improved through a multiple classifier approach. Multiple classifier combinations have been shown to achieve better classification accuracy than a single classifier. There are two main problems in designing a multiple classifier combination: determining the classifier ensemble and constructing the combiner. This paper reviews approaches to constructing the classifier ensemble and the combiner. For each approach, methods are reviewed and their advantages and disadvantages highlighted. A random strategy and majority voting are the most commonly used methods for constructing the ensemble and the combiner, respectively. The results presented in this review are expected to serve as a road map for designing multiple classifier combinations.
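
    The two design steps named in the abstract above can be sketched as follows. The random strategy is illustrated here as drawing a random feature subset for each ensemble member, which is only one possible reading of "random strategy"; the function names and parameters are hypothetical and do not come from the survey.

```python
import random
from collections import Counter

def random_feature_subsets(n_features, n_members, subset_size, seed=0):
    """Ensemble construction: draw one random feature subset per member."""
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(n_features), subset_size))
        for _ in range(n_members)
    ]

def majority_vote(predictions):
    """Combiner: return the label predicted by the most classifiers.

    Ties go to the label that appears first in `predictions`.
    """
    return Counter(predictions).most_common(1)[0][0]

# Each member would be trained on its own feature subset; the votes of the
# trained members are then merged with majority_vote.
subsets = random_feature_subsets(n_features=10, n_members=5, subset_size=3)
final_label = majority_vote(["cat", "dog", "cat"])
```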

    Investigating the use of an ensemble of evolutionary algorithms for letter identification in tremulous medieval handwriting

    Ensemble classifiers are known to generalize well by combining simpler and less accurate classifiers. Ensembles can use the variety in the classification patterns of the smaller classifiers to make better predictions. However, to create an ensemble it is necessary to determine how the component classifiers should be combined to generate the final predictions. One way to do this is to search over different combinations of classifiers with evolutionary algorithms, which are widely employed when the objective is to find a structure that serves some purpose. This work investigates the use of ensembles obtained via an evolutionary algorithm for identifying individual letters in tremulous medieval writing and for differentiating between scribes. The aim of this research is to use this process as the first step towards classifying the tremor type with more accuracy. The ensembles are obtained through an evolutionary search of trees that aggregate the output of base classifiers, which are neural networks trained prior to the ensemble search. The misclassification patterns of the base classifiers are analysed in order to determine how much better an ensemble of those classifiers can be than its components. The misclassification patterns of the best ensembles are compared to those of their component classifiers. The results obtained suggest interesting methods for letter classification (up to 96% accuracy) and user classification (up to 88% accuracy) in an offline scenario.

    Static and dynamic selection of ensemble of classifiers

    In this thesis we present several novel solutions to three fundamental problems in the design of classifier ensembles: classifier generation, selection, and fusion. The contributions are: a new fusion function (Compound Diversity Function, CDF) that takes into account both the individual performance of the classifiers and the diversity between pairs of classifiers; a new fusion function based on pairwise confusion matrices (PFM), better suited to fusing classifiers when there is a large number of classes; a new method for generating ensembles of Hidden Markov Models (EoHMM) for handwritten character recognition; a novel solution based on the concept of Oracles associated with the data of the validation set (KNORA); and a new approach for selecting representation subspaces based on a diversity measure evaluated between pairs of partitions.
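
    A rough sketch of dynamic selection in the spirit of KNORA, assuming the "Eliminate" variant as it is commonly described: for each query, keep only the classifiers that are correct on all of its k nearest validation samples, shrinking the neighbourhood if no classifier qualifies. The function name, the Euclidean distance, and the fall-back to the full pool are assumptions made for illustration, not details taken from the thesis.

```python
import math
from collections import Counter

def knora_eliminate(classifiers, val_x, val_y, query, k=3):
    """Pick locally perfect classifiers around `query`, then majority-vote them."""
    # Validation indices ordered by Euclidean distance to the query.
    order = sorted(range(len(val_x)), key=lambda i: math.dist(val_x[i], query))
    for size in range(min(k, len(order)), 0, -1):
        neighbours = order[:size]
        # A classifier is a local "oracle" if it is right on every neighbour.
        selected = [
            clf for clf in classifiers
            if all(clf(val_x[i]) == val_y[i] for i in neighbours)
        ]
        if selected:
            break
    else:
        # No classifier is correct even on the single nearest sample:
        # fall back to the whole pool (an assumption of this sketch).
        selected = list(classifiers)
    votes = Counter(clf(query) for clf in selected)
    return votes.most_common(1)[0][0]
```

    Here each classifier is any callable that maps a feature tuple to a label, and `val_x` / `val_y` are the validation samples and their true labels.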

    Competitive Learning Neural Network Ensemble Weighted by Predicted Performance

    Ensemble approaches have been shown to enhance classification by combining the outputs from a set of voting classifiers. Diversity in error patterns among base classifiers promotes ensemble performance. Multi-task learning is an important characteristic of neural network classifiers. Introducing a secondary output unit that receives different training signals for the base networks in an ensemble can effectively promote diversity and improve ensemble performance. Here a Competitive Learning Neural Network Ensemble is proposed in which a secondary output unit predicts the classification performance of the primary output unit in each base network. The networks compete with each other on the basis of classification performance and partition the stimulus space. The secondary units adaptively receive different training signals depending on the competition. As a result, each base network develops a "preference" for different regions of the stimulus space, as indicated by its secondary unit output. To form an ensemble decision, the primary unit outputs of all base networks are combined and weighted according to the secondary unit outputs. The effectiveness of the proposed approach is demonstrated with experiments on one real-world and four artificial classification problems.
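
    The final combination step described above can be sketched as a weighted sum. This assumes that each base network returns a vector of class scores (its primary output) and a single non-negative number predicting how well it performs on the current input (its secondary output); normalising the weights so that they sum to one is an assumption of this sketch, and the function names are hypothetical.

```python
def combine(primary_outputs, secondary_outputs):
    """Weight each network's class scores by its predicted performance and sum."""
    total = sum(secondary_outputs)
    if total <= 0:
        raise ValueError("secondary outputs must have a positive sum")
    n_classes = len(primary_outputs[0])
    scores = [0.0] * n_classes
    for class_scores, weight in zip(primary_outputs, secondary_outputs):
        for c, score in enumerate(class_scores):
            scores[c] += (weight / total) * score
    return scores

def decide(primary_outputs, secondary_outputs):
    """Return the index of the class with the highest weighted score."""
    scores = combine(primary_outputs, secondary_outputs)
    return max(range(len(scores)), key=scores.__getitem__)

# Three base networks, two classes. The first network claims high competence
# for this input, so its opinion dominates the two less confident networks.
primary = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]]
secondary = [0.9, 0.1, 0.1]
winner = decide(primary, secondary)
```

    With equal secondary outputs the same function reduces to plain averaging, which makes the effect of the weighting easy to compare.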

    Design for novel enhanced weightless neural network and multi-classifier.

    Weightless neural systems have often struggled with speed, performance, and memory issues. There is also a lack of sufficient interfacing between weightless neural systems and other systems. Addressing these issues motivates and forms the aims and objectives of this thesis. To address them, algorithms are formulated, classifiers and multi-classifiers are designed, and the hardware design of a classifier is reported. Specifically, the purpose of this thesis is to report on the algorithms and designs of weightless neural systems. The background for the research is a weightless neural network known as the Probabilistic Convergent Network (PCN). Two new and different interfacing methods are introduced, and the resulting network is named the Enhanced Probabilistic Convergent Network (EPCN). To solve the problem of speed and performance when large-class databases are employed in data analysis, multi-classifiers are designed whose composition varies with problem complexity. This also leads to the introduction of a novel gating function, with EPCN applied as an intelligent combiner. For databases that are not very large, a single classifier suffices. Speed and ease of application in adverse conditions were considered as improvements, which led to the design of EPCN in hardware. A novel hashing function is implemented and tested on the hardware-based EPCN. The results indicate the utility of employing weightless neural systems and point to significant new possible areas of application.

    Misogyny Detection in Social Media on the Twitter Platform

    The thesis is devoted to the problem of misogyny detection in social media. We analyse the difference between offensive language in general and misogynistic language in social media, and review the best existing approaches to detecting offensive and misogynistic language, which are based on classical machine learning and neural networks. We also review recent shared tasks aimed at detecting misogyny in social media, in several of which we participated. We propose an approach to the detection and classification of misogyny in texts, based on an ensemble of classical machine learning models: Logistic Regression, Naive Bayes, and Support Vector Machines. At the preprocessing stage we also used some linguistic features and novel approaches that allow us to improve the quality of classification. We tested the model on real datasets, both English and multilingual corpora. The results achieved with our model are highly competitive in this area and demonstrate the potential for future improvement.

    Two-Level Text Classification Using Hybrid Machine Learning Techniques

    Nowadays, documents are increasingly being associated with multi-level category hierarchies rather than a flat category scheme. To access these documents in real time, we need fast automatic methods to navigate these hierarchies. Today's vast data repositories such as the web also contain many broad domains of data which are quite distinct from each other, e.g. medicine, education, sports and politics. Each domain constitutes a subspace of the data within which the documents are similar to each other but quite distinct from the documents in another subspace. The data within these domains is frequently further divided into many subcategories. Subspace learning is a technique popular in non-text domains such as image recognition, where it is used to increase speed and accuracy. Subspace analysis lends itself naturally to the idea of hybrid classifiers: each subspace can be processed by the classifier best suited to the characteristics of that particular subspace. Instead of using the complete set of full-space feature dimensions, classifier performance can be boosted by using only a subset of the dimensions. This thesis presents a novel hybrid parallel architecture using separate classifiers trained on separate subspaces to improve two-level text classification. The classifier to be used on a particular input and the relevant feature subset to be extracted are determined dynamically using a novel method based on the maximum significance value. A novel vector representation which enhances the distinction between classes within the subspace is also developed. This novel system, the Hybrid Parallel Classifier, was compared against the baselines of several single classifiers such as the Multilayer Perceptron and was found to be faster and to have higher two-level classification accuracies. The improvement in performance was even higher when dealing with more complex category hierarchies.

    Towards robust real-world historical handwriting recognition

    In this thesis, we build a bridge from the past to the future by using artificial-intelligence methods for text recognition in a historical Dutch collection of the Natuurkundige Commissie that explored Indonesia (1820-1850). In spite of the successes of systems like 'ChatGPT', reading historical handwriting is still quite challenging for AI. Whereas GPT-like methods work on digital texts, historical manuscripts are only available as extremely diverse collections of (pixel) images. Despite their great results, current deep learning (DL) methods are very data greedy and time consuming, depend heavily on human experts from the humanities for labeling, and require machine-learning experts to design the models. Ideally, the use of deep learning methods should require minimal human effort, have an algorithm observe the evolution of the training process, and avoid inefficient use of the already sparse labeled data. We present several approaches to dealing with these problems, aiming to improve the robustness of current methods and the autonomy of training. We applied our novel word and line text recognition approaches to nine data sets differing in time period, language, and difficulty: three locally collected historical Latin-based data sets from Naturalis, Leiden; four public Latin-based benchmark data sets for comparability with other approaches; and two Arabic data sets. Using ensemble voting of just five neural networks, we achieved a level of accuracy that required hundreds of neural networks in earlier studies. Moreover, we increased the speed of evaluation of each training epoch without the need for labeled data.