182 research outputs found

    On Aggregation in Ensembles of Multilabel Classifiers

    While a variety of ensemble methods for multilabel classification have been proposed in the literature, the question of how to aggregate the predictions of the individual members of the ensemble has received little attention so far. In this paper, we introduce a formal framework of ensemble multilabel classification, in which we distinguish two principal approaches: "predict then combine" (PTC), where the ensemble members first make loss-minimizing predictions which are subsequently combined, and "combine then predict" (CTP), which first aggregates information such as marginal label probabilities from the individual ensemble members, and then derives a prediction from this aggregation. While both approaches generalize voting techniques commonly used for multilabel ensembles, they allow the target performance measure to be taken into account explicitly. Therefore, concrete instantiations of CTP and PTC can be tailored to concrete loss functions. Experimentally, we show that standard voting techniques are indeed outperformed by suitable instantiations of CTP and PTC, and provide some evidence that CTP performs well for decomposable loss functions, whereas PTC is the better choice for non-decomposable losses. (Comment: 14 pages, 2 figures)
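    To make the PTC/CTP distinction concrete, the sketch below contrasts the two approaches for Hamming loss, where the loss-minimizing prediction thresholds each marginal label probability at 0.5. The function names and the simple averaging aggregation are illustrative assumptions, not the paper's exact instantiations.

```python
import numpy as np

def ptc_hamming(probs):
    """'Predict then combine': each member first makes its own Hamming-loss-
    minimizing prediction (threshold marginals at 0.5), then the member
    predictions are combined by per-label majority vote.
    probs: (n_members, n_labels) marginal label probabilities."""
    member_preds = (probs > 0.5).astype(int)               # predict first
    return (member_preds.mean(axis=0) > 0.5).astype(int)   # then combine

def ctp_hamming(probs):
    """'Combine then predict': first aggregate the members' marginal label
    probabilities (here by averaging), then derive a single loss-minimizing
    prediction from the aggregate."""
    aggregated = probs.mean(axis=0)            # combine information first
    return (aggregated > 0.5).astype(int)      # then predict

# Toy ensemble of 3 members over 4 labels
probs = np.array([[0.9, 0.4, 0.6, 0.2],
                  [0.8, 0.6, 0.4, 0.1],
                  [0.7, 0.4, 0.7, 0.3]])
print(ptc_hamming(probs), ctp_hamming(probs))
```

    For Hamming loss the two routes often agree, as in this toy case; the paper's point is that for other target losses the appropriate combination and prediction steps differ, which is what tailoring to the loss function means.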

    Otimização multi-objetivo em aprendizado de máquina (Multi-objective optimization in machine learning)

    Advisor: Fernando José Von Zuben. Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: Regularized multinomial logistic regression, multi-label classification, and multi-task learning are examples of machine learning problems in which conflicting objectives, such as losses and regularization penalties, should be simultaneously minimized.
    Therefore, the narrow perspective of looking for the learning model with the best performance should be replaced by the proposition and further exploration of multiple efficient learning models, each one characterized by a distinct trade-off among the conflicting objectives. Committee machines and a posteriori preferences of the decision-maker may be implemented to properly explore this diverse set of efficient learning models toward performance improvement. The whole multi-objective framework for machine learning is supported by three stages: (1) The multi-objective modelling of each learning problem, explicitly highlighting the conflicting objectives involved; (2) Given the multi-objective formulation of the learning problem, for instance, considering loss functions and penalty terms as conflicting objective functions, efficient solutions well-distributed along the Pareto front are obtained by a deterministic and exact solver named NISE (Non-Inferior Set Estimation); (3) Those efficient learning models are then subject to a posteriori model selection, or to ensemble filtering and aggregation. Given that NISE is restricted to two objective functions, an extension for many objectives, named MONISE (Many-Objective NISE), is also proposed here, being an additional contribution that expands the applicability of the proposed framework. To properly assess the merit of our multi-objective approach, more specific investigations were conducted, restricted to regularized linear learning models: (1) What is the relative merit of the a posteriori selection of a single learning model, among the ones produced by our proposal, when compared with other single-model approaches in the literature? (2) Is the diversity level of the learning models produced by our proposal higher than that achieved by alternative approaches devoted to generating multiple learning models? (3) What about the prediction quality of ensemble filtering and aggregation of the learning models produced by our proposal on: (i) multi-class classification, (ii) unbalanced classification, (iii) multi-label classification, (iv) multi-task learning, and (v) multi-view learning? The deterministic nature of NISE and MONISE, their ability to properly deal with the shape of the Pareto front in each learning problem, and the guarantee of always obtaining efficient learning models are advocated here as being responsible for the promising results achieved in all three specific investigations.
    Doctorate in Computer Engineering; degree: Doctor of Electrical Engineering; grant 2014/13533-0, FAPES.
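    As a rough illustration of stage (2), the sketch below traces a two-objective trade-off (logistic loss versus L2 penalty) by sweeping a scalarization weight over a fixed grid. NISE instead selects the weights adaptively and deterministically from the geometry of the front, so the uniform grid here is only a stand-in; the dataset and grid are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Sweep the trade-off between the two conflicting objectives: empirical
# loss and the L2 regularization penalty. Each fitted model is one
# efficient (Pareto) candidate for its weight.
front = []
for C in np.logspace(-3, 3, 13):    # C acts as an inverse penalty weight
    model = LogisticRegression(C=C, max_iter=1000).fit(X, y)
    loss = log_loss(y, model.predict_proba(X))
    penalty = float(np.sum(model.coef_ ** 2))
    front.append((loss, penalty, model))

# Stage (3), a posteriori model selection or ensemble filtering and
# aggregation, then operates on the models collected in `front`.
```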

    Continual Learning for Multi-Label Drifting Data Streams Using Homogeneous Ensemble of Self-Adjusting Nearest Neighbors

    Multi-label data streams are sequences of multi-label instances arriving over time to a multi-label classifier. The properties of the data stream may continuously change due to concept drift. Therefore, algorithms must adapt constantly to the new data distributions. In this paper, we propose a novel ensemble method for multi-label drifting streams named Homogeneous Ensemble of Self-Adjusting Nearest Neighbors (HESAkNN). It leverages a self-adjusting kNN as a base classifier, with the advantages of ensembles, to adapt to concept drift in the multi-label environment. To promote diverse knowledge within the ensemble, each base classifier is given a unique subset of features and samples to train on. These samples are distributed to classifiers in a probabilistic manner that follows a Poisson distribution, as in online bagging. Accompanying these mechanisms, a collection of ADWIN detectors monitors each classifier for the occurrence of concept drift. Upon detection, the algorithm automatically trains additional classifiers in the background to attempt to capture new concepts. After a pre-determined number of instances, active and background classifiers are compared, and only the most accurate classifiers are selected to populate the new active ensemble. The experimental study compares the proposed approach with 30 other classifiers, including problem transformation, algorithm adaptation, kNNs, and ensembles, on 30 diverse multi-label datasets and 11 performance metrics. Results validated using non-parametric statistical analysis support the better performance of the proposed ensemble and highlight the contribution of feature and instance diversity in improving its performance.
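    The sketch below illustrates only the diversity mechanisms described above: a fixed random feature subset per base learner and Poisson(1) instance replay, as in online bagging. A windowed kNN stands in for the self-adjusting base learner, the problem is simplified to a single binary label, and the ADWIN detectors and background classifiers of HESAkNN are omitted; all class and method names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

class WindowedKNN:
    """Toy stand-in for the self-adjusting kNN base learner: a kNN over a
    fixed-size sliding window of recent instances (the real base learner
    adjusts its window in response to drift)."""
    def __init__(self, k=3, window=200):
        self.k, self.window = k, window
        self.X, self.y = [], []

    def learn_one(self, x, y):
        self.X.append(x)
        self.y.append(y)
        if len(self.X) > self.window:   # forget the oldest instance
            self.X.pop(0)
            self.y.pop(0)

    def predict_one(self, x):
        if not self.X:
            return 0
        d = np.linalg.norm(np.asarray(self.X) - x, axis=1)
        nearest = np.argsort(d)[: self.k]
        return int(round(np.asarray(self.y)[nearest].mean()))  # majority vote

class PoissonBaggedEnsemble:
    """Feature and instance diversity: each learner sees a fixed random
    feature subset and replays each arriving instance k ~ Poisson(1) times,
    as in online bagging. Drift detection is omitted from this sketch."""
    def __init__(self, n_learners, n_features, subset_size):
        self.learners = [WindowedKNN() for _ in range(n_learners)]
        self.subsets = [rng.choice(n_features, subset_size, replace=False)
                        for _ in range(n_learners)]

    def learn_one(self, x, y):
        for learner, subset in zip(self.learners, self.subsets):
            for _ in range(rng.poisson(1.0)):   # online-bagging replay count
                learner.learn_one(x[subset], y)

    def predict_one(self, x):
        preds = [l.predict_one(x[s]) for l, s in zip(self.learners, self.subsets)]
        return int(round(np.mean(preds)))       # ensemble majority vote

# Toy stream: the label depends on the first two features.
ens = PoissonBaggedEnsemble(n_learners=5, n_features=8, subset_size=4)
for _ in range(300):
    x = rng.normal(size=8)
    ens.learn_one(x, int(x[0] + x[1] > 0))
print(ens.predict_one(np.ones(8)))   # should usually predict 1
```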

    Study and implementation of quantum-inspired boosting algorithms for AI-powered Financial Asset Management

    Ensemble Learning (EL) is a machine learning technique that involves combining multiple individual models, called weak learners, in order to produce more accurate predictions.
    The idea behind EL is that by aggregating the predictions of multiple models, the final prediction can be more robust, accurate, and generalizable than that of any single weak learner alone. Boosting is a powerful EL method in which the ensemble of models is constructed iteratively, so that at each iteration the training of new learners focuses on the training examples for which the previously selected models perform poorly. Boosting algorithms have been successfully applied to various domains, including image and object recognition, text mining, finance, and a number of other fields. They are particularly effective in scenarios where high accuracy and stability are crucial, making them a valuable tool in the field of machine learning. Qboost is a boosting algorithm, first introduced by Neven et al. in 2008, that casts the problem of EL into a hard combinatorial optimization problem taking the form of a QUBO (Quadratic Unconstrained Binary Optimization) problem or, equivalently, an Ising model optimization. This kind of optimization problem is NP-complete and therefore difficult to tackle with classical digital computing methods and algorithms like simulated annealing (SA). Hence, alternative computational methods, such as those developed within the framework of quantum computing, are of high interest for this class of problems. In particular, adiabatic quantum annealing (AQA) has recently been used for multiple demonstrations in particle detection, aerial imaging, and financial applications. Its implementation on neutral atom processors, a type of adiabatic quantum hardware, has yielded promising results in terms of practical usefulness and scalability. This thesis aims to develop, test, and benchmark a Qboost-based algorithm in the context of multilabel classification problems. The study and the implementation take into account several quantum-hybrid, quantum-inspired, and traditional optimization algorithms, as well as different hardware solutions, including quantum computers with neutral atom processors. The project matured during an internship at Axyon AI, a FinTech company that serves quantitative asset managers through its proprietary machine learning software platform. Axyon AI exploits ensemble learning and boosting in its machine learning pipeline. The aim of this project is to provide a proof of concept for improving the performance of the ensemble-building step of the pipeline relative to the currently employed EL algorithm. The proposed techniques facilitate a broader exploration of the configuration space of the weak learners, aiming to maximise performance and capture untapped potential.
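    As a sketch of the QUBO construction, the code below builds a Qboost-style matrix from weak-learner outputs in {-1, +1} under the commonly cited formulation (squared loss of the averaged ensemble plus a sparsity penalty on the binary selection weights) and minimizes it exactly by enumeration; an annealer, quantum or simulated, replaces that last step at realistic ensemble sizes. The helper names and the `lam` penalty value are illustrative assumptions.

```python
import itertools
import numpy as np

def qboost_qubo(H, y, lam):
    """Build the QUBO matrix for weak-learner selection.
    H: (n_learners, n_samples) weak-learner outputs in {-1, +1}.
    y: (n_samples,) true labels in {-1, +1}.  lam: sparsity penalty.
    Minimizing w^T Q w over binary w equals, up to a constant, minimizing
    sum_s ((1/n) * sum_i w_i * H[i, s] - y_s)^2 + lam * sum_i w_i."""
    n = H.shape[0]
    Q = (H @ H.T) / n**2                                 # pairwise terms
    Q[np.diag_indices(n)] += lam - (2.0 / n) * (H @ y)   # linear terms (w_i^2 = w_i)
    return Q

def brute_force_qubo(Q):
    """Exact minimization by enumerating all binary vectors; feasible only
    for small ensembles."""
    n = Q.shape[0]
    best_w, best_e = None, np.inf
    for bits in itertools.product([0, 1], repeat=n):
        w = np.array(bits)
        e = float(w @ Q @ w)
        if e < best_e:
            best_w, best_e = w, e
    return best_w, best_e

# Toy example: 6 weak learners that agree with y about 70% of the time.
rng = np.random.default_rng(0)
y = rng.choice([-1, 1], size=40)
H = np.where(rng.random((6, 40)) < 0.7, y, -y)
w, e = brute_force_qubo(qboost_qubo(H, y, lam=0.01))
print("selected learners:", w, "energy:", e)
```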

    Smart Bagged Tree-based Classifier optimized by Random Forests (SBT-RF) to Classify Brain-Machine Interface Data

    Brain-Computer Interface (BCI) is a new technology that uses electrodes and sensors to connect machines and computers with the human brain to improve a person's mental performance. Human intentions and thoughts, captured as Electroencephalogram (EEG) signals, are analyzed and recognized using BCI. However, certain brain signals may contain redundant information, making classification ineffective. Therefore, selecting relevant characteristics is essential for enhancing classification performance. Thus, feature selection has been employed to eliminate redundant data before classification and to reduce computation time. BCI Competition III Dataset IVa was used to investigate the efficacy of the proposed system. A Smart Bagged Tree-based Classifier (SBT-RF) technique is presented to determine the importance of the features for selecting and classifying the data. As a result, SBT-RF improves the mean accuracy on the dataset. It also decreases computation cost and training time and increases prediction speed. Furthermore, fewer features mean fewer electrodes, thus lowering the risk of damage to the brain. The proposed algorithm achieves the highest average accuracy, ~98%, compared to other relevant algorithms in the literature. SBT-RF is compared to state-of-the-art algorithms based on the following performance metrics: confusion matrix, ROC-AUC, F1-score, training time, prediction speed, and accuracy.
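    The abstract does not spell out the SBT-RF pipeline, but a minimal version of the stated idea, ranking features by random-forest importance and then bagging decision trees on the reduced feature set, might look like the following sketch; the dataset, the number of retained features, and all hyperparameters are placeholders, not the paper's settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the EEG feature matrix.
X, y = make_classification(n_samples=500, n_features=50, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Rank features with random-forest importances and keep the top ones;
# fewer features would correspond to fewer electrodes.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
top = np.argsort(rf.feature_importances_)[::-1][:10]

# Bagged decision trees (scikit-learn's default base estimator) on the
# reduced feature set.
bagged = BaggingClassifier(n_estimators=50, random_state=0)
bagged.fit(X_tr[:, top], y_tr)
print("test accuracy:", bagged.score(X_te[:, top], y_te))
```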