10 research outputs found

    Influence modelling and learning between dynamic bayesian networks using score-based structure learning

    A Ph.D. thesis submitted to the Faculty of Science, University of the Witwatersrand, in fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science, May 2018. Although partially observable stochastic processes are ubiquitous in many fields of science, little work has been devoted to discovering and analysing the means by which several such processes may interact to influence each other. In this thesis we extend probabilistic structure learning between random variables to the context of temporal models which represent partially observable stochastic processes. Learning an influence structure and distribution between processes can be useful for density estimation and knowledge discovery. A common approach to structure learning in observable data is score-based structure learning, where we search for the most suitable structure by using a scoring metric to value structural configurations relative to the data. Most popular structure scores are variations on the likelihood score, which calculates the probability of the data given a potential structure. In observable data, the decomposability of the likelihood score, which is the ability to represent the score as a sum of family scores, allows for efficient learning procedures and significant computational savings. However, in incomplete data (due to either latent variables or missing samples), the likelihood score is not decomposable and we have to perform inference to evaluate it. This forces us to use non-linear optimisation techniques to optimise the likelihood function. Furthermore, local changes to the network can affect other parts of the network, which makes learning with incomplete data all the more difficult. We define two general types of influence scenarios, direct influence and delayed influence, which can be used to define influence around richly structured spaces consisting of multiple processes that are interrelated in various ways. 
We will see that although it is possible to capture both types of influence in a single complex model by using a setting of the parameters, complex representations run into fragmentation issues. This is handled by extending the language of dynamic Bayesian networks to allow us to construct single compact models that capture the properties of a system’s dynamics, and produce influence distributions dynamically. The novelty and intuition of our approach is to learn the optimal influence structure in layers. We first learn a set of independent temporal models, and thereafter optimise a structure score over possible structural configurations between these temporal models. Since the search for the optimal structure is done using complete data, we can take advantage of efficient learning procedures from the structure learning literature. We provide the following contributions: we (a) introduce the notion of influence between temporal models; (b) extend traditional structure scores for random variables to structure scores for temporal models; (c) provide a complete algorithm to recover the influence structure between temporal models; (d) provide a notion of structural assembles to relate temporal models for types of influence; and finally, (e) provide empirical evidence for the effectiveness of our method with respect to generative ground-truth distributions. The presented results emphasise the trade-off between the likelihood of an influence structure with respect to the ground truth and the computational complexity required to express it. Depending on the availability of samples, we might choose different learning methods to express influence relations between processes. On the one hand, when given too few samples, we may choose to learn a sparse structure using tree-based structure learning, or even use no influence structure at all. 
On the other hand, when given an abundant number of samples, we can use penalty-based procedures that achieve rich meaningful representations using local search techniques. Once we consider high-level representations of dynamic influence between temporal models, we open the door to very rich and expressive representations which emphasise the importance of knowledge discovery and density estimation in the temporal setting.
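
The decomposability described in this abstract can be illustrated with a minimal sketch for fully observed discrete data. It is not taken from the thesis: the function names `family_score` and `likelihood_score`, the dictionary encoding of a structure (child mapped to a tuple of parents), and the toy data are all assumptions made for illustration. The score is the maximum-likelihood log-likelihood, written as a sum of one term per family.

```python
import math
from collections import Counter

def family_score(data, child, parents):
    """Log-likelihood contribution of one family (child given its parents),
    using maximum-likelihood estimates from complete data."""
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    # Each (parent config, child value) cell contributes n * log(n / n_parent).
    return sum(n * math.log(n / parent_counts[pa]) for (pa, _), n in joint.items())

def likelihood_score(data, structure):
    """Decomposable score: the sum of the family scores of all variables."""
    return sum(family_score(data, child, parents)
               for child, parents in structure.items())

# Hypothetical toy data in which B always copies A.
data = [{"A": 0, "B": 0}, {"A": 0, "B": 0}, {"A": 1, "B": 1}, {"A": 1, "B": 1}]
empty = {"A": (), "B": ()}      # no edges
edge = {"A": (), "B": ("A",)}   # single edge A -> B

print(likelihood_score(data, empty))
print(likelihood_score(data, edge))
```

Because only the family of B differs between the two structures, a search procedure would need to recompute just that one term when adding the edge. This locality is what is lost with incomplete data.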

    Biomedical applications of belief networks

    Biomedicine is an area in which computers have long been expected to play a significant role. Although many of the early claims have proved unrealistic, computers are gradually becoming accepted in the biomedical, clinical and research environment. Within these application areas, expert systems appear to have met with the most resistance, especially when applied to image interpretation. In order to improve the acceptance of computerised decision support systems it is necessary to provide the information needed to make rational judgements concerning the inferences the system has made. This entails an explanation of what inferences were made, how the inferences were made and how the results of the inference are to be interpreted. Furthermore, there must be a consistent approach to the combining of information from low level computational processes through to high level expert analyses. Until recently ad hoc formalisms were seen as the only tractable approach to reasoning under uncertainty. A review of some of these formalisms suggests that they are less than ideal for the purposes of decision making. Belief networks provide a tractable way of utilising probability theory as an inference formalism by combining the theoretical consistency of probability for inference and decision making with the ability to use the knowledge of domain experts. The potential of belief networks in biomedical applications has already been recognised and there has been substantial research into the use of belief networks for medical diagnosis and methods for handling large, interconnected networks. In this thesis the use of belief networks is extended to include detailed image model matching to show how, in principle, feature measurement can be undertaken in a fully probabilistic way. 
The belief networks employed are usually cyclic and have strong influences between adjacent nodes, so new techniques for probabilistic updating based on a model of the matching process have been developed. An object-orientated inference shell called FLAPNet has been implemented and used to apply the belief network formalism to two application domains. The first application is model-based matching in fetal ultrasound images. The imaging modality and biological variation in the subject make model matching a highly uncertain process. A dynamic, deformable model, similar to active contour models, is used. A belief network combines constraints derived from local evidence in the image with global constraints derived from trained models to control the iterative refinement of an initial model cue. In the second application a belief network is used for the incremental aggregation of evidence occurring during the classification of objects on a cervical smear slide as part of an automated pre-screening system. A belief network provides both an explicit domain model and a mechanism for the incremental aggregation of evidence, two attributes important in pre-screening systems. Overall it is argued that belief networks combine the necessary quantitative features required of a decision support system with desirable qualitative features that will lead to improved acceptability of expert systems in the biomedical domain.

    Learning Bayesian networks based on optimization approaches

    Learning accurate classifiers from preclassified data is a very active research topic in machine learning and artificial intelligence. There are numerous classifier paradigms, among which Bayesian Networks are very effective and well known in domains with uncertainty. Bayesian Networks are widely used representation frameworks for reasoning with probabilistic information. These models use graphs to capture dependence and independence relationships between feature variables, allowing a concise representation of the knowledge as well as efficient graph-based query processing algorithms. This representation is defined by two components: structure learning and parameter learning. The structure of this model represents a directed acyclic graph. The nodes in the graph correspond to the feature variables in the domain, and the arcs (edges) show the causal relationships between feature variables. A directed edge relates the variables so that the variable corresponding to the terminal node (child) will be conditioned on the variable corresponding to the initial node (parent). The parameter learning represents probabilities and conditional probabilities based on prior information or past experience. The set of probabilities is represented in the conditional probability table. Once the network structure is constructed, the probabilistic inferences are readily calculated, and can be performed to predict the outcome of some variables based on the observations of others. However, the problem of structure learning is a complex problem since the number of candidate structures grows exponentially as the number of feature variables increases. This thesis is devoted to the development of learning structures and parameters in Bayesian Networks. Different models based on optimization techniques are introduced to construct an optimal structure of a Bayesian Network. 
These models also consider the improvement of the Naive Bayes structure by developing new algorithms to alleviate the independence assumptions. We present various models to learn parameters of Bayesian Networks; in particular we propose optimization models for the Naive Bayes and the Tree Augmented Naive Bayes by considering different objective functions. To solve the corresponding optimization problems in Bayesian Networks, we develop new optimization algorithms. Local optimization methods are introduced based on the combination of the gradient and Newton methods. It is proved that the proposed methods are globally convergent and have superlinear convergence rates. As a global search we use the global optimization method, AGOP, implemented in the open software library GANSO. We apply the proposed local methods in combination with AGOP. Therefore, the main contributions of this thesis include (a) new algorithms for learning an optimal structure of a Bayesian Network; (b) new models for learning the parameters of Bayesian Networks with the given structures; and finally (c) new optimization algorithms for optimizing the proposed models in (a) and (b). To validate the proposed methods, we conduct experiments across a number of real world problems. Print version is available at: http://library.federation.edu.au/record=b1804607~S4 Doctor of Philosophy
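
The growth in the number of candidate structures mentioned in this abstract can be made concrete by counting labelled directed acyclic graphs with Robinson's recurrence. The sketch below is illustrative only and is not taken from the thesis; the function name `count_dags` is a hypothetical choice.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled DAGs on n nodes, via Robinson's recurrence:
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k(n-k)) * a(n-k), a(0) = 1."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

for n in range(1, 6):
    print(n, count_dags(n))  # 1, 3, 25, 543, 29281
```

Five variables already admit 29281 candidate structures, which suggests why exhaustive enumeration is impractical and why optimization-based search is used instead.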

    Model-Based Influence Diagrams For Machine Vision
