34 research outputs found

    Data-Driven Bayesian Network Learning: A Bi-Objective Approach to Address the Bias-Variance Decomposition

    We present a novel bi-objective approach to the data-driven learning of Bayesian networks. Both the log-likelihood and the complexity of each candidate Bayesian network are treated as objectives to be optimized by our proposed algorithm, the Nondominated Sorting Genetic Algorithm for learning Bayesian networks (NS2BN), which is based on the well-known NSGA-II algorithm. The core idea is to reduce the implicit selection bias in the bias-variance decomposition while identifying a set of competitive models under both objectives. Numerical results suggest that, in stark contrast to the single-objective approach, our bi-objective approach is useful for finding competitive Bayesian networks, especially in terms of complexity. Furthermore, our approach presents the end user with a set of solutions, showing different Bayesian networks along with their respective MDL and classification-accuracy results.
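The selection step of an NSGA-II-style bi-objective search can be illustrated with a Pareto-dominance filter over (negative log-likelihood, complexity) pairs. This is a minimal sketch under assumed names; the function and the tuple representation are illustrative, not taken from the paper's implementation.

```python
def pareto_front(models):
    """Return the nondominated subset of candidate models.

    models: list of (neg_log_likelihood, complexity) tuples, where
    both objectives are minimized, as in NSGA-II-style selection.
    A model is dominated if some other model is no worse in every
    objective and differs in at least one.
    """
    front = []
    for i, m in enumerate(models):
        dominated = any(
            all(o <= v for o, v in zip(other, m)) and other != m
            for j, other in enumerate(models)
            if j != i
        )
        if not dominated:
            front.append(m)
    return front


# A cheap-but-poor fit, a balanced model, a costly-but-good fit,
# and a model dominated by the balanced one:
candidates = [(1.0, 5), (2.0, 2), (3.0, 1), (4.0, 4)]
print(pareto_front(candidates))  # → [(1.0, 5), (2.0, 2), (3.0, 1)]
```

Presenting this whole front, rather than a single score-optimal network, is what lets the end user trade goodness of fit against complexity explicitly.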

    How good is crude MDL for solving the bias-variance dilemma? An empirical investigation based on Bayesian networks.

    The bias-variance dilemma is a well-known and important problem in Machine Learning. It basically relates the generalization capability (goodness of fit) of a learning method to its corresponding complexity. When we have enough data at hand, it is possible to use these data in such a way as to minimize overfitting (the risk of selecting a complex model that generalizes poorly). Unfortunately, there are many situations where we simply do not have this required amount of data. Thus, we need to find methods capable of efficiently exploiting the available data while avoiding overfitting. Different metrics have been proposed to achieve this goal: the Minimum Description Length principle (MDL), Akaike's Information Criterion (AIC) and the Bayesian Information Criterion (BIC), among others. In this paper, we focus on crude MDL and empirically evaluate its performance in selecting models with a good balance between goodness of fit and complexity: the so-called bias-variance dilemma, decomposition or tradeoff. Although the graphical interaction between these dimensions (bias and variance) is ubiquitous in the Machine Learning literature, few works present experimental evidence to recover such interaction. In our experiments, we argue that the resulting graphs allow us to gain insights that are difficult to unveil otherwise: that crude MDL naturally selects balanced models in terms of bias-variance, which need not necessarily be the gold-standard ones. We carry out these experiments using a specific model: a Bayesian network. In spite of these motivating results, we should also not overlook three other components that may significantly affect the final model selection: the search procedure, the noise rate and the sample size.
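The crude (two-part) MDL score referred to above can be sketched for a fully discrete Bayesian network as the sum of a fit term and a complexity penalty, MDL = -log2 P(D | G) + (k/2) log2 N, where k is the number of free parameters of graph G. The following is an illustrative sketch, not the paper's implementation; the data representation and function name are assumptions.

```python
import math
from collections import Counter

def crude_mdl(data, parents):
    """Two-part crude MDL score for a discrete Bayesian network.

    data    : list of dicts mapping variable name -> observed value
    parents : dict mapping each variable to a tuple of its parents
              (the network structure). Lower scores are better.
    """
    n = len(data)
    log_lik = 0.0
    k = 0
    for var, pars in parents.items():
        # Frequencies of (parent configuration, value) and of parent configs.
        joint = Counter((tuple(row[p] for p in pars), row[var]) for row in data)
        parent_counts = Counter(tuple(row[p] for p in pars) for row in data)
        states = {row[var] for row in data}
        # Free parameters: (|states| - 1) per observed parent configuration.
        k += (len(states) - 1) * max(len(parent_counts), 1)
        # Maximum-likelihood log-likelihood contribution of this family.
        for (pa, x), c in joint.items():
            log_lik += c * math.log2(c / parent_counts[pa])
    return -log_lik + 0.5 * k * math.log2(n)


# Perfectly correlated binary data: the A -> B structure pays one extra
# parameter but explains B for free, so it scores below the empty graph.
data = [{"A": 0, "B": 0}] * 5 + [{"A": 1, "B": 1}] * 5
print(crude_mdl(data, {"A": (), "B": ("A",)}))  # ≈ 14.98
print(crude_mdl(data, {"A": (), "B": ()}))      # ≈ 23.32
```

This illustrates the balance the paper studies: the penalty term discourages complexity, but a structural edge is retained when its likelihood gain outweighs its parameter cost.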

    Minimum MDL2 values (random distribution).

    <p>The red dot indicates the BN structure of <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0092866#pone-0092866-g022" target="_blank">Figure 22</a>, whereas the green dot indicates the MDL2 value of the gold-standard network (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0092866#pone-0092866-g009" target="_blank">Figure 9</a>). The distance between these two networks is 0.00018701910455 (computed as the log<sub>2</sub> of the ratio of the gold-standard network's score to the minimum network's score). A value greater than 0 means that the minimum network has a better MDL2 than the gold-standard.</p>
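The distance in the caption above reduces to a one-line computation; this sketch (with hypothetical names) makes the sign convention explicit.

```python
import math

def mdl2_distance(gold_score, min_score):
    """log2 of the ratio gold-standard score / minimum score.

    Positive when the minimum network found by the search has a lower
    (i.e. better) MDL2 value than the gold-standard network; zero when
    the two scores coincide.
    """
    return math.log2(gold_score / min_score)


print(mdl2_distance(2.0, 1.0))  # → 1.0
print(mdl2_distance(1.0, 1.0))  # → 0.0
```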