Learning Bayesian networks from data by the incremental compilation of new network polynomials
Probabilistic graphical models are a major research field in artificial intelligence. The scope of this work is the study of directed graphical models for the representation of discrete distributions. Two of the main research topics in this area are performing inference over graphical models and learning graphical models from data. Traditionally, the inference process and the learning process have been treated separately, but given that the structure of the learned model determines the inference complexity, such strategies can produce very inefficient models. With the aim of learning thinner models, in this master's thesis we propose a new representation of network polynomials, which we call polynomial trees. Polynomial trees are a complementary representation for Bayesian networks that allows an efficient evaluation of the inference complexity and provides a framework for exact inference. We also propose a set of methods for the incremental compilation of polynomial trees and an algorithm for learning polynomial trees from data using a greedy score+search method that includes the inference complexity as a penalization in the scoring function.
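As an illustration of the score+search idea above, the following is a minimal sketch of a scoring function that penalizes inference complexity, assuming a BIC-style fit term; the cost estimate, the penalty weight and the candidate encoding are illustrative assumptions, not the thesis' exact formulation.

```python
import math

def penalized_score(loglik, n_params, n_samples, inference_cost, penalty=1.0):
    """BIC-style fit term minus a penalization on inference complexity.
    `inference_cost` is a placeholder for, e.g., the size of the
    polynomial-tree representation of the candidate network."""
    bic = loglik - 0.5 * n_params * math.log(n_samples)
    return bic - penalty * inference_cost

def greedy_step(candidates, current_score):
    """One greedy step: keep the neighbour (loglik, n_params, n, cost)
    that most improves the penalized score, or None if none improves it."""
    best = None
    for loglik, n_params, n, cost in candidates:
        s = penalized_score(loglik, n_params, n, cost)
        if s > current_score:
            best, current_score = (loglik, n_params, n, cost), s
    return best, current_score
```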
Learning tractable multidimensional Bayesian network classifiers
Multidimensional classification has become one of the most relevant topics in view of the many domains that require a vector of class values to be assigned to a vector of given features. The popularity of multidimensional Bayesian network classifiers has increased in the last few years due to their expressive power and the existence of methods for learning different families of these models. The problem with this approach is that the computational cost of using the learned models is usually high, especially if there are many class variables. Class-bridge decomposability means that, for these models, the multidimensional classification problem can be divided into multiple subproblems. In this paper, we prove that class-bridge decomposability can also be used to guarantee the tractability of the models. We also propose a strategy for efficiently bounding their inference complexity, providing a simple learning method with an order-based search that obtains tractable multidimensional Bayesian network classifiers. Experimental results show that our approach is competitive with other state-of-the-art methods and ensures the tractability of the learned models.
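To make the decomposition concrete, here is a hedged sketch of class-bridge decomposability: class variables that are connected, directly in the class subgraph or through shared feature children (bridges), must be classified jointly, and the connected components give independent subproblems. The graph encoding is an assumption for illustration, not the paper's exact construction.

```python
from collections import defaultdict, deque

def class_components(classes, class_edges, bridges):
    """classes: iterable of class variables.
    class_edges: pairs of classes adjacent in the class subgraph.
    bridges: pairs of classes sharing a feature child.
    Returns the connected components, i.e., independent subproblems."""
    adj = defaultdict(set)
    for u, v in list(class_edges) + list(bridges):
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for c in classes:
        if c in seen:
            continue
        seen.add(c)
        component, queue = [], deque([c])
        while queue:
            u = queue.popleft()
            component.append(u)
            for w in adj[u] - seen:   # breadth-first expansion
                seen.add(w)
                queue.append(w)
        components.append(component)
    return components

# A model stays tractable if, e.g., every component remains small:
# max(len(comp) for comp in class_components(...)) <= k
```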
Learning low inference complexity Bayesian networks
One of the main research topics in machine learning nowadays is the improvement of the inference and learning processes in probabilistic graphical models. Traditionally, inference and learning have been treated separately, but given that the structure of the model conditions the inference complexity, most learning methods will sometimes produce inefficient inference models. In this paper we propose a new representation for discrete probability distributions that allows efficiently evaluating the inference complexity of the models during the learning process. We use this representation to create procedures for learning low inference complexity Bayesian networks. Experimental results show that the proposed methods obtain tractable models that improve the accuracy of the predictions provided by approximate inference in models obtained with a well-known Bayesian network learner.
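The paper's representation is not reproduced here, but a common proxy for the inference complexity of a candidate network can be sketched as follows: moralize the DAG and run a greedy min-degree elimination, which yields an upper bound on the treewidth. This is only an illustrative stand-in for the representation proposed above.

```python
from itertools import combinations

def moralize(parents):
    """parents: dict mapping each node to a list of its parents.
    Returns the undirected moral graph: arcs undirected, parents married."""
    nodes = set(parents) | {p for ps in parents.values() for p in ps}
    adj = {v: set() for v in nodes}
    for child, ps in parents.items():
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(ps, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def width_upper_bound(parents):
    """Greedy min-degree elimination on the moral graph; the largest
    neighbourhood eliminated is an upper bound on the treewidth."""
    adj = moralize(parents)
    width = 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))  # min-degree heuristic
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        for a, b in combinations(nbrs, 2):       # connect v's neighbours
            adj[a].add(b)
            adj[b].add(a)
        for u in nbrs:
            adj[u].discard(v)
    return width
```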
Learning Bayesian networks with low inference complexity
One of the main research topics in machine learning nowadays is the improvement of the inference and learning processes in probabilistic graphical models. Traditionally, inference and learning have been treated separately, but given that the structure of the model conditions the inference complexity, most learning methods will sometimes produce inefficient inference models. In this paper we propose a framework for learning low inference complexity Bayesian networks. For that, we use a representation of the network factorization that allows efficiently evaluating an upper bound on the inference complexity of each model during the learning process. Experimental results show that the proposed methods obtain tractable models that improve the accuracy of the predictions provided by approximate inference in models obtained with a well-known Bayesian network learner.
Learning Tractable Bayesian Networks
We are in the era of machine learning, and the automatic discovery of knowledge from data is increasingly used to solve problems in our daily life. A key to successfully designing useful intelligent algorithms is being able to model the uncertainty that is present in the real world. Bayesian networks are a powerful tool that models uncertainty according to probability theory. Although the literature contains approaches that learn Bayesian networks from high-dimensional datasets, traditional methods do not bound the inference complexity of the learned models, often producing models where exact inference is intractable. This thesis focuses on learning tractable Bayesian networks from data, and contains the following five contributions.
First, we propose strategies for learning Bayesian networks in the space of elimination orders (i). In this manner, we can efficiently bound the inference complexity of the networks during the learning process. This is especially useful in problems where data is incomplete. In these cases, the most common approach to learning Bayesian networks is to apply the structural expectation-maximization algorithm, which requires performing inference at each iteration of the learning process. Taking advantage of the reduced inference complexity of the models, we propose a new method with the purpose of guaranteeing the efficiency of the learning process while improving the performance of the original algorithm (ii). Additionally, we study the complexity of multidimensional classification, a generalization of multi-label classification, in Bayesian networks. We provide upper bounds for the complexity of the most probable explanations and of the marginals of the class variables conditioned on an instantiation of all predictor variables (iii). We use these bounds to propose efficient strategies for limiting the complexity of multidimensional Bayesian network classifiers during the learning process. With the objective of improving the performance of these models, we also propose methods for the discriminative learning of multidimensional Bayesian network classifiers (iv). Finally, we address the problem of predicting seizure freedom in patients who have undergone temporal lobe epilepsy surgery (v). For that, we use a multidimensional Bayesian network classifier, which is especially well suited to model the relationships among the class variables, i.e., seizure freedom at one, two and five years after surgery.
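Contribution (iv) is the only one without a dedicated abstract below, so here is a hedged sketch of the discriminative idea: score a candidate classifier by the conditional log-likelihood of the class vector given the features, rather than by the joint log-likelihood. The `log_joint` and `class_space` hooks are hypothetical interfaces, not the thesis' API.

```python
import math

def conditional_log_likelihood(model, data):
    """Discriminative score: sum over the data of log P(classes | features).
    `model.log_joint(x, c)` and `model.class_space()` are hypothetical hooks
    standing in for Bayesian network inference over the class variables."""
    total = 0.0
    for x, c in data:
        # log P(c | x) = log P(x, c) - log sum_{c'} P(x, c')
        log_evidence = math.log(sum(math.exp(model.log_joint(x, cp))
                                    for cp in model.class_space()))
        total += model.log_joint(x, c) - log_evidence
    return total
```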
Learning tractable Bayesian networks in the space of elimination orders
The computational complexity of inference is now one of the most relevant topics in the field of Bayesian networks. Although the literature contains approaches that learn Bayesian networks from high dimensional datasets, traditional methods do not bound the inference complexity of the learned models, often producing models where exact inference is intractable. This paper focuses on learning tractable Bayesian networks from data. To address this problem, we propose strategies for learning Bayesian networks in the space of elimination orders. In this manner, we can efficiently bound the inference complexity of the networks during the learning process. Searching in the combined space of directed acyclic graphs and elimination orders can be extremely computationally demanding. We demonstrate that one type of elimination trees, which we define as valid, can be used as an equivalence class of directed acyclic graphs and elimination orders, removing redundancy. We propose methods for incrementally compiling local changes made to directed acyclic graphs in elimination trees and for searching for elimination trees of low width. Using these methods, we can move through the space of valid elimination trees in polynomial time with respect to the number of network variables and in linear time with respect to treewidth. Experimental results show that our approach successfully bounds the inference complexity of the learned models, while it is competitive with other state-of-the-art methods in terms of fit to the data.
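As a rough illustration of searching in the space of elimination orders, the sketch below evaluates the induced width of an order on an undirected (moral) graph and tries adjacent swaps that reduce it. The paper's valid elimination trees make such moves incremental; recomputing the width from scratch, as here, is only for clarity.

```python
from itertools import combinations

def induced_width(adj, order):
    """adj: undirected adjacency dict (e.g., of the moral graph).
    Eliminates variables in `order`; returns the largest number of
    neighbours seen at elimination time, the exact width of this order."""
    g = {v: set(ns) for v, ns in adj.items()}
    width = 0
    for v in order:
        nbrs = g.pop(v)
        width = max(width, len(nbrs))
        for a, b in combinations(nbrs, 2):  # eliminating v marries its neighbours
            g[a].add(b)
            g[b].add(a)
        for u in nbrs:
            g[u].discard(v)
    return width

def swap_search(adj, order):
    """One greedy pass of adjacent swaps, keeping any swap that lowers the width."""
    order, best = list(order), induced_width(adj, order)
    for i in range(len(order) - 1):
        order[i], order[i + 1] = order[i + 1], order[i]
        w = induced_width(adj, order)
        if w < best:
            best = w                                          # keep the swap
        else:
            order[i], order[i + 1] = order[i + 1], order[i]   # revert
    return order, best
```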
Tractability of most probable explanations in multidimensional Bayesian network classifiers
Multidimensional Bayesian network classifiers have gained popularity over the last few years due to their expressive power and their intuitive graphical representation. A drawback of this approach is that their use to perform multidimensional classification, a generalization of multi-label classification, can be very computationally demanding when there are a large number of class variables. Thus, a key challenge in this field is to ensure the tractability of these models during the learning process. In this paper, we show how information about the most common queries of multidimensional Bayesian network classifiers affects the complexity of these models. We provide upper bounds for the complexity of the most probable explanations and of the marginals of the class variables conditioned on an instantiation of all feature variables. We use these bounds to propose efficient strategies for bounding the complexity of multidimensional Bayesian network classifiers during the learning process, and provide a simple learning method with an order-based search that guarantees the tractability of the returned models. Experimental results show that our approach is competitive with other state-of-the-art methods and also ensures the tractability of the learned models.
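The following toy sketch shows why bounding complexity matters: a brute-force MPE solver over the class variables enumerates every joint class configuration, so its cost grows exponentially with the number of classes. The `log_joint` argument is a hypothetical scoring hook standing in for Bayesian network inference.

```python
from itertools import product

def brute_force_mpe(class_domains, features, log_joint):
    """class_domains: list of per-class-variable value lists.
    Enumerates all prod(|dom_i|) joint class configurations; the cost is
    exponential in the number of class variables. log_joint(classes, features)
    is a hypothetical hook returning log P(classes, features)."""
    best, best_score = None, float("-inf")
    for assignment in product(*class_domains):
        s = log_joint(assignment, features)
        if s > best_score:
            best, best_score = assignment, s
    return best, best_score
```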
Bounding the complexity of structural expectation-maximization
Structural expectation-maximization is the most common approach to address the problem of learning Bayesian networks from incomplete datasets. Its main limitation is that its computational cost is usually extremely demanding when the number of variables or the number of instances is not small. The bottleneck of this algorithm is the inference complexity of the model candidates. Thus, bounding the inference complexity of each Bayesian network during the learning process is key to making structural expectation-maximization efficient. In this paper, we propose a tractable adaptation of structural expectation-maximization and perform experiments to analyze its performance.
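A skeleton of the kind of adaptation described above might look as follows, under assumed interfaces: candidate structures whose treewidth bound exceeds a threshold are discarded before any (expensive) E-step inference is run on them. This is a sketch of the general pattern, not the paper's exact algorithm.

```python
def tractable_sem(data, initial_model, neighbours, treewidth_bound,
                  expected_score, max_tw=5, max_iters=50):
    """Hill-climbing structural EM skeleton with a tractability gate.
    `neighbours`, `treewidth_bound` and `expected_score` are assumed hooks:
    expected_score(cand, model, data) evaluates a candidate with expected
    sufficient statistics computed by inference in the current model."""
    model, score = initial_model, float("-inf")
    for _ in range(max_iters):
        improved = False
        for cand in neighbours(model):
            if treewidth_bound(cand) > max_tw:   # skip intractable candidates
                continue
            s = expected_score(cand, model, data)
            if s > score:
                model, score, improved = cand, s, True
        if not improved:
            break
    return model
```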
Tractable learning of Bayesian networks from partially observed data
The majority of real-world problems require addressing incomplete data. The use of the structural expectation-maximization algorithm is the most common approach to learning Bayesian networks from incomplete datasets. However, its main limitation is its demanding computational cost, caused mainly by the need to perform inference at each iteration of the algorithm. In this paper, we propose a new method with the purpose of guaranteeing the efficiency of the learning process while improving the performance of the structural expectation-maximization algorithm. We address the first objective by imposing an upper bound on the treewidth of the models to limit the complexity of the inference. To achieve this, we use an efficient heuristic to search the space of the elimination orders. For the second objective, we study the advantages of directly computing the score with respect to the observed data rather than an expectation of the score, and provide a strategy to efficiently perform these computations in the proposed method. We perform exhaustive experiments on synthetic and real-world datasets of varied dimensionalities, including datasets with thousands of variables and hundreds of thousands of instances. The experimental results support our claims empirically.
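To illustrate the scoring idea, here is a minimal sketch of a BIC-style score computed directly on the observed data, with the missing values marginalized out by inference in the (treewidth-bounded, hence tractable) candidate model. `log_marginal` is an assumed inference hook, not the paper's interface.

```python
import math

def observed_data_score(rows, log_marginal, n_params):
    """rows: list of dicts {variable: value}; unobserved variables are absent.
    log_marginal(evidence) is an assumed hook returning log P(evidence) by
    exact inference, summing out the unobserved variables (tractable because
    the candidate's treewidth is bounded). BIC-style penalty on parameters."""
    loglik = sum(log_marginal(row) for row in rows)
    return loglik - 0.5 * n_params * math.log(len(rows))
```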