66 research outputs found

    The use of vicinal-risk minimization for training decision trees

    We propose the use of Vapnik's vicinal risk minimization (VRM) for training decision trees to approximately maximize decision margins. We implement VRM by propagating uncertainties in the input attributes into the labeling decisions. In this way, we perform a global regularization over the decision tree structure. During a training phase, a decision tree is constructed to minimize the total probability of misclassifying the labeled training examples, a process which approximately maximizes the margins of the resulting classifier. We perform the necessary minimization using an appropriate meta-heuristic (genetic programming) and present results over a range of synthetic and benchmark real datasets. On these datasets, we demonstrate the statistical superiority of VRM training over conventional empirical risk minimization (ERM) and the well-known C4.5 algorithm. We also conclude that there is no statistical difference between trees trained by ERM and those trained using C4.5. Training with VRM is shown to be more stable and repeatable than training with ERM.
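As a rough illustration of the idea in this abstract, the sketch below compares empirical and vicinal risk for a single one-dimensional decision stump. It is not the authors' method: the Gaussian vicinity, the noise level `sigma`, the stump rule and the toy data are all assumptions made for the example, and the paper optimizes full trees with genetic programming rather than scanning thresholds.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def empirical_risk(xs, ys, threshold):
    """Fraction of training points misclassified by the stump 'x > threshold -> class 1'."""
    errors = sum(1 for x, y in zip(xs, ys) if (1 if x > threshold else 0) != y)
    return errors / len(xs)

def vicinal_risk(xs, ys, threshold, sigma):
    """Mean misclassification probability when each x is blurred by N(x, sigma^2).

    The Gaussian vicinity is an assumption of this sketch.
    """
    total = 0.0
    for x, y in zip(xs, ys):
        # Probability that the blurred point lands on the class-1 side of the threshold.
        p_class1 = 1.0 - normal_cdf((threshold - x) / sigma)
        total += (1.0 - p_class1) if y == 1 else p_class1
    return total / len(xs)

# Hypothetical toy data: class 0 on the left, class 1 on the right.
xs = [0.0, 1.0, 3.0, 4.0]
ys = [0, 0, 1, 1]
near_edge = 1.1  # separates the classes but sits close to a class-0 point
centered = 2.0   # separates the classes midway between them
```

Both thresholds have zero empirical risk, so ERM cannot tell them apart. The vicinal risk is lower for the centered threshold, which is the sense in which minimizing the probability of misclassification favors larger margins.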

    Tikhonov Regularization as a Complexity Measure in Multiobjective Genetic Programming

    In this paper, we propose the use of Tikhonov regularization in conjunction with node count as a general complexity measure in multiobjective genetic programming. We demonstrate that employing this general complexity measure yields mean squared test errors over a range of regression problems that are typically superior to those from conventional node count (and never statistically worse). We also analyze why our new method outperforms the conventional complexity measure and conclude that it forms a decision mechanism that balances both syntactic and semantic information.

    Training genetic programming classifiers by vicinal-risk minimization

    We propose and motivate the use of vicinal-risk minimization (VRM) for training genetic programming classifiers. We show that VRM has a number of attractive properties and that it correlates better with generalization error than empirical risk minimization (ERM), so it is more likely to lead to better generalization performance. From the results of statistical tests over a range of real and synthetic datasets, we further demonstrate that VRM yields consistently superior generalization errors compared to conventional ERM.

    Improve Deep Forest with Learnable Layerwise Augmentation Policy Schedule

    As a modern ensemble technique, Deep Forest (DF) employs a cascading structure to construct deep models, providing stronger representational power compared to traditional decision forests. However, its greedy multi-layer learning procedure is prone to overfitting, limiting model effectiveness and generalizability. This paper presents an optimized Deep Forest, featuring learnable, layerwise data augmentation policy schedules. Specifically, we introduce the Cut Mix for Tabular data (CMT) augmentation technique to mitigate overfitting and develop a population-based search algorithm to tailor augmentation intensity for each layer. Additionally, we propose to incorporate outputs from intermediate layers into a checkpoint ensemble for more stable performance. Experimental results show that our method sets new state-of-the-art (SOTA) benchmarks in various tabular classification tasks, outperforming shallow tree ensembles, deep forests, deep neural networks, and AutoML competitors. The learned policies also transfer effectively to Deep Forest variants, underscoring the method's potential for enhancing non-differentiable deep learning modules in tabular signal processing.

    MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

    Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. Based on this new insight, we propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy. Comment: 16 pages, Best Student Paper Award at UAI 202
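The interpolation described in the first sentence of this abstract can be sketched in a few lines. This shows vanilla Mixup only, not the improved variant the paper proposes. The function names, the toy vectors and the choice of drawing the mixing weight from a Beta(alpha, alpha) distribution with `alpha = 0.2` are assumptions made for illustration.

```python
import random

def mixup_pair(x_a, y_a, x_b, y_b, lam):
    """Linearly interpolate two inputs and their one-hot labels with weight lam."""
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix

def sample_lambda(alpha, rng):
    """Draw the mixing weight from Beta(alpha, alpha); this choice is an assumption here."""
    return rng.betavariate(alpha, alpha)

rng = random.Random(0)  # seeded only to make the sketch repeatable
lam = sample_lambda(0.2, rng)

# Two hypothetical samples with one-hot labels for a two-class problem.
x_mix, y_mix = mixup_pair([1.0, 2.0], [1.0, 0.0], [3.0, 6.0], [0.0, 1.0], lam)
```

Because the same weight is applied to inputs and labels, the mixed label remains a valid probability vector whose entries sum to one.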

    Modeling Uncertainty for Reliable Probabilistic Modeling in Deep Learning and Beyond

    This thesis is framed at the intersection between modern Machine Learning techniques, such as Deep Neural Networks, and reliable probabilistic modeling. In many machine learning applications, we care not only about the prediction made by a model (e.g. this lung image presents cancer) but also about how confident the model is in making this prediction (e.g. this lung image presents cancer with 67% probability). In such applications, the model assists the decision-maker (in this case a doctor) in making the final decision. As a consequence, the probabilities provided by a model must reflect the true proportions of outcomes in the set to which those probabilities are assigned; otherwise the model is useless in practice. When the probabilities do reflect these proportions, we say that a model is perfectly calibrated. This thesis explores three ways to obtain better-calibrated models. First, it shows how to implicitly calibrate models that have been decalibrated by data augmentation techniques, introducing a cost function that resolves this decalibration, taking as a starting point ideas derived from decision making with Bayes' rule. Second, it shows how to calibrate models using a post-calibration stage implemented with a Bayesian neural network. Finally, based on the limitations studied in the Bayesian neural network, which we hypothesize stem from a misspecified prior, a new stochastic process is introduced that serves as a prior distribution in a Bayesian inference problem. Maroñas Molano, J. (2022). Modeling Uncertainty for Reliable Probabilistic Modeling in Deep Learning and Beyond [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/181582
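The notion of calibration used in this abstract (predicted probabilities matching observed proportions) can be made concrete with a small check. The sketch below computes a binned expected calibration error for binary predictions. This metric is a common choice and is an assumption here, since the abstract does not say how the thesis measures calibration. The bin count and the toy predictions are also illustrative.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean gap between predicted probability and observed frequency per bin.

    probs are predicted probabilities of the positive class, labels are 0 or 1.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((p, y))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        frequency = sum(y for _, y in members) / len(members)
        ece += (len(members) / len(probs)) * abs(mean_prob - frequency)
    return ece

# Predictions of 0.75 where three of four cases are positive: well calibrated.
calibrated = expected_calibration_error([0.75] * 4, [1, 1, 1, 0])
# Predictions of 0.95 where only half the cases are positive: overconfident.
overconfident = expected_calibration_error([0.95] * 4, [1, 1, 0, 0])
```

A value near zero means the stated probabilities match the observed proportions in every bin, while the overconfident example shows a gap of roughly 0.45.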

    Book reports


    GPML: an XML-based standard for the interchange of genetic programming trees

    We propose a Genetic Programming Markup Language (GPML), an XML-based standard for the interchange of genetic programming trees, and outline the benefits such a format would bring in allowing the deployment of trained genetic programming (GP) models in applications, as well as the subsidiary benefit of allowing GP researchers to directly share trained trees. We present a formal definition of this standard and describe details of an implementation. In addition, we present a case study where GPML is used to implement a model predictive controller for the control of a building heating plant.

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin