3 research outputs found

    An evolutionary approach to optimising neural network predictors for passive sonar target tracking

    Object tracking is important in autonomous robotics, military applications, financial time-series forecasting, and mobile systems. To track correctly through clutter, algorithms that predict the next value in a time series are essential. This work examines how well standard machine learning techniques can produce bearing prediction estimates. The results show that the classification-based algorithms produce more accurate estimates than the state-of-the-art statistical models. Artificial Neural Networks (ANNs) and K-Nearest Neighbour were both used, demonstrating that the technique is not specific to a single classifier. [Continues.
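The idea of treating next-value prediction as a classification task can be sketched as follows. This is only an illustrative sketch, not the paper's method: the sliding-window features, the 5-degree bins, the value of k, and the toy bearing track are all assumptions, since the abstract does not give those details.

```python
from collections import Counter

def make_windows(series, width):
    """Turn a series into (window, next_value) training pairs."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]

def to_bin(value, bin_size):
    """Map a bearing in degrees to a discrete class label (bin index)."""
    return int(value % 360 // bin_size)

def knn_predict_bin(train, query, k, bin_size):
    """Vote among the k nearest windows for the class of the next bearing."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda pair: sq_dist(pair[0], query))[:k]
    votes = Counter(to_bin(nxt, bin_size) for _, nxt in nearest)
    return votes.most_common(1)[0][0]

# Toy oscillating bearing track in degrees (made-up values, not data from the paper).
bearings = [40, 42, 45, 47, 45, 42, 40, 42, 45, 47, 45, 42, 40, 42, 45]
width, bin_size = 3, 5                        # assumed window length and bin width
train = make_windows(bearings[:-1], width)    # hold out the final value
query = bearings[-1 - width:-1]               # the last three observed bearings
predicted_bin = knn_predict_bin(train, query, k=3, bin_size=bin_size)
true_bin = to_bin(bearings[-1], bin_size)
```

Because the classifier only sees labelled windows, K-Nearest Neighbour could be swapped for any other classifier without changing the framing, which is the point the abstract makes.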

    Ensemble diversity for class imbalance learning

    This thesis studies the diversity of classification ensembles in class imbalance learning problems. Class imbalance learning refers to learning from imbalanced data sets, in which some classes of examples (minority) are highly under-represented compared to other classes (majority). The very skewed class distribution degrades the learning ability of many traditional machine learning methods, especially in recognizing examples from the minority classes, which are often deemed more important and interesting. Although quite a few ensemble learning approaches have been proposed to handle the problem, no in-depth research exists to explain why and when they can be helpful. Our objectives are to understand how ensemble diversity affects classification performance on a class imbalance problem according to single-class and overall performance measures, and to make the best use of diversity to improve that performance. As the first stage, we study the relationship between ensemble diversity and generalization performance for class imbalance problems. We investigate mathematical links between single-class performance and ensemble diversity, and find that the way the single-class measures change with diversity falls into six different situations. These findings are then verified in class imbalance scenarios through empirical studies. The impact of diversity on overall performance is also investigated empirically. Strong correlations between diversity and the performance measures are found. Diversity has a positive impact on the recognition of the minority class and benefits the overall performance of ensembles in class imbalance learning. Our results help to explain if and why ensemble diversity can help to deal with class imbalance problems.
Encouraged by the positive role of diversity in class imbalance learning, we then focus on a specific ensemble learning technique, the negative correlation learning (NCL) algorithm, which considers diversity explicitly when creating ensembles and has achieved great empirical success. We propose a new learning algorithm for classification problems based on the idea of NCL, named AdaBoost.NC. An "ambiguity" term decomposed from the 0-1 error function is introduced into the training framework of AdaBoost. The algorithm demonstrates superiority in both effectiveness and efficiency, and its good generalization performance is explained by theoretical and empirical evidence. It can be viewed as the first NCL algorithm specializing in classification problems. Most existing ensemble methods for class imbalance problems suffer from overfitting and over-generalization. To improve this situation, we address the class imbalance issue by making use of ensemble diversity. We investigate the generalization ability of NCL algorithms, including AdaBoost.NC, on two-class imbalance problems. We find that NCL methods integrated with random oversampling, especially the AdaBoost.NC tree ensemble, are effective in recognizing minority class examples without losing overall performance. This is achieved by providing smoother and less overfitted classification boundaries for the minority class. The results show the usefulness of diversity and open up a novel way to deal with class imbalance problems. Since two-class imbalance is not the only scenario in real-world applications, multi-class imbalance problems deserve equal attention. To understand what problems multiple classes can cause and how they affect classification performance, we study the multi-class difficulty by analyzing the multi-minority and multi-majority cases separately. Both lead to a significant performance reduction, and the multi-majority case appears to be more harmful.
The results reveal possible issues that a class imbalance learning technique could have when dealing with multi-class tasks. Following this analysis and the promising results of AdaBoost.NC on two-class imbalance problems, we apply AdaBoost.NC to a set of multi-class imbalance domains with the aim of solving them effectively and directly. Our method shows good generalization on minority classes and balances the performance across different classes well without using any class decomposition schemes. Finally, we conclude this thesis with how the study has contributed to class imbalance learning and ensemble learning, and propose several possible directions for future research that may improve and extend this work.
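Two ingredients mentioned in this abstract, random oversampling and single-class performance measures, can be illustrated with a minimal sketch. It does not reproduce AdaBoost.NC, whose update rule is not given in the abstract; the function names, the toy data, and the choice of recall as the single-class measure are assumptions made for illustration.

```python
import random

def random_oversample(examples, labels, seed=0):
    """Duplicate randomly chosen examples of smaller classes until every
    class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

def per_class_recall(true_labels, predicted_labels):
    """Single-class measure: fraction of each class's examples recognised."""
    hits, totals = {}, {}
    for t, p in zip(true_labels, predicted_labels):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (t == p)
    return {c: hits[c] / totals[c] for c in totals}

# Toy imbalanced data set: 8 majority examples, 2 minority examples (made up).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [2.0], [2.1]]
y = ["maj"] * 8 + ["min"] * 2
X_bal, y_bal = random_oversample(X, y)
counts = {c: y_bal.count(c) for c in set(y_bal)}

# A classifier that always predicts the majority class reaches 80% overall
# accuracy on the original data but recognises no minority example at all.
always_majority = ["maj"] * len(y)
recalls = per_class_recall(y, always_majority)
```

The always-majority baseline shows why overall accuracy alone is misleading on skewed data and why the thesis reports single-class measures alongside overall ones.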

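    Optimisation multi-objective des problèmes combinatoires : application à la génération des horaires d'examens finaux [Multi-objective optimisation of combinatorial problems: application to the generation of final exam timetables]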

    Discrete combinatorial optimisation problems (POCD, from the French acronym) are very hard to solve. The discrete nature of the variables forms a non-differentiable space, which renders gradient-based techniques useless. The timetabling problem is representative of one family of POCD. It involves a set of conflicting objectives, a set of nonlinear constraints, and a very large number of potential combinations. Moreover, a number of academic institutions still produce timetables manually or semi-automatically, so automation can eliminate the unpleasant aspects of this task. This work deals with combinatorial optimisation by evolutionary algorithms and, more precisely, with final exam timetabling problems (PCHE, from the French acronym). First, single-criterion and multi-criteria POCD are described formally in order to establish their main characteristics. The methods that have been proposed for solving a PCHE, such as simulated annealing, Tabu search, and genetic algorithms, are covered in a literature review. To make the link with multi-criteria solution methods, it is shown that a PCHE is itself a multi-criteria POCD. Until now, mainly single-criterion models have been used to solve this type of problem. The study undertaken in this work therefore concentrates on the multi-criteria optimisation techniques that could be considered for solving a PCHE. Among the most popular multi-criteria evolutionary algorithms, those of Deb's NSGA family and Zitzler's SPEA were considered likely to obtain good results. These techniques, examined in detail in this work, were implemented on a final exam timetabling problem.
Following a first series of experiments, the NSGA-II and SPEA-II algorithms proved unsuitable for solving a PCHE because of issues with constraint handling and solution diversity. The problems encountered when using these algorithms led to the design of a new hybrid approach. This hybrid evolutionary algorithm has a structure similar to that of Zitzler's SPEA-II algorithm. The major difference is the replacement of the crossover operator by two Tabu searches. The first search is applied to reduce constraint violations, while the second optimises the temporal spread of the students' exams. The results obtained on the full set of public data sets showed that, on average, the developed algorithm works very well and shows good potential for future work.
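As an illustration of the kind of Tabu search used to reduce constraint violations in exam timetabling, here is a minimal sketch on a toy instance. It is not the thesis's hybrid algorithm: the single-exam move neighbourhood, the tabu tenure, the aspiration rule, and the toy clash list are all assumptions, and the second search (spreading exams over time) is not covered.

```python
import itertools

def count_conflicts(assignment, clashes):
    """Number of exam pairs that share students and sit in the same slot."""
    return sum(1 for a, b in clashes if assignment[a] == assignment[b])

def tabu_search(n_exams, n_slots, clashes, iterations=100, tenure=5):
    """Reduce clashes by moving one exam per iteration, using a short-term
    tabu memory to avoid undoing recent moves."""
    current = [0] * n_exams                  # start with every exam in slot 0
    best, best_cost = current[:], count_conflicts(current, clashes)
    tabu_until = {}                          # (exam, slot) -> iteration bound
    for it in range(iterations):
        if best_cost == 0:
            break
        candidate, candidate_cost = None, None
        for exam, slot in itertools.product(range(n_exams), range(n_slots)):
            if slot == current[exam]:
                continue
            trial = current[:]
            trial[exam] = slot
            cost = count_conflicts(trial, clashes)
            is_tabu = tabu_until.get((exam, slot), -1) > it
            # Aspiration: a tabu move is allowed only if it beats the best so far.
            if is_tabu and cost >= best_cost:
                continue
            if candidate_cost is None or cost < candidate_cost:
                candidate, candidate_cost = (exam, slot), cost
        if candidate is None:
            break                            # every move is currently tabu
        exam, slot = candidate
        # Forbid moving this exam back to the slot it just left for a while.
        tabu_until[(exam, current[exam])] = it + tenure
        current[exam] = slot                 # accept the move even if it worsens cost
        if candidate_cost < best_cost:
            best, best_cost = current[:], candidate_cost
    return best, best_cost

# Toy instance (made up): 5 exams, 3 slots, pairs of exams sharing students.
clashes = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (1, 4)]
timetable, remaining = tabu_search(n_exams=5, n_slots=3, clashes=clashes)
```

Unlike plain hill climbing, the search always takes the best non-tabu move even when it increases the number of clashes, which is what lets it leave local minima.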