4 research outputs found

    Quantitative analysis of breast cancer diagnosis using a probabilistic modelling approach

    Background: Breast cancer is the most prevalent cancer in women in most countries of the world. Many computer-aided diagnostic methods have been proposed, but few studies address the quantitative discovery of probabilistic dependencies among breast cancer data features or identify the contribution of each feature to breast cancer diagnosis. Methods: This study aims to fill this gap using a Bayesian network (BN) modelling approach. A K2 learning algorithm and statistical computation methods are used to construct the BN structure and assess the resulting BN model. The data used in this study come from a clinical ultrasound dataset collected at a local Chinese hospital and a fine-needle aspiration cytology (FNAC) dataset from the UCI Machine Learning Repository. Results: Our study suggests that, for the ultrasound data, cell shape is the most significant feature for breast cancer diagnosis, and the resistance index shows a strong probabilistic dependency on blood signals. For the FNAC data, bare nuclei are the most important feature for discriminating malignant from benign breast tumours, and uniformity of cell size and uniformity of cell shape are tightly interdependent. Contributions: The BN modelling approach can support clinicians in making diagnostic decisions based on the significant features identified by the model, especially when some other features are missing for specific patients. The approach is also applicable to other healthcare data analytics and to data modelling for disease diagnosis.

    Probabilistic Inference Using Partitioned Bayesian Networks:Introducing a Compositional Framework

    Probability theory offers an intuitive and formally sound way to reason in situations that involve uncertainty. The automation of probabilistic reasoning has many applications, such as predicting future events (prognostics), providing decision support, planning actions under uncertainty, dealing with multiple uncertain measurements, and making a diagnosis. Bayesian networks in particular have been used to represent the probability distributions that model these applications of reasoning under uncertainty. However, present-day automated reasoning approaches involving uncertainty struggle when models increase in size and complexity to fit real-world applications. In this thesis, we explore and extend a state-of-the-art automated reasoning method, called inference by Weighted Model Counting (WMC), when applied to increasingly complex Bayesian network models. WMC comprises two distinct phases: compilation and inference. The computational cost of compilation has limited the applicability of WMC. To overcome this limitation, we have proposed theoretical and practical solutions that have been tested extensively in empirical studies using real-world Bayesian network models. We have proposed a weighted variant of OBDDs, called Weighted Positive Binary Decision Diagrams (WPBDD), which in turn is based on the new notion of positive Shannon decomposition. WPBDDs are particularly well suited to representing discrete probabilistic models. The conciseness of WPBDDs leads to a reduction in the cost of probabilistic inference. We have introduced Compositional Weighted Model Counting (CWMC), a language-agnostic framework for probabilistic inference that partitions a Bayesian network into subproblems. These subproblems are then compiled and subsequently composed in order to perform inference. This approach significantly reduces the cost of compilation, yet increases the cost of inference. The best results are obtained by seeking a partitioning that allows compilation to (barely) become feasible, but no more, as compilation cost can be amortized over multiple inference queries. Theoretical concepts have been implemented in a readily available open-source tool called ParaGnosis. Further implementation improvements have been found through parallelism, by exploiting independencies that are introduced by CWMC. Combined, the proposed methods push the boundaries of WMC, allowing this state-of-the-art method to be used on much larger models than before.
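
As a rough illustration of the idea behind inference by weighted model counting, the sketch below sums the weights of all variable assignments (models) consistent with some evidence, where a model's weight is the product of the conditional probability entries it selects. It is a brute-force enumeration and not the compilation-based approach (WPBDD, CWMC) that the abstract describes. The two-node network, its variable names, and all probabilities are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical two-node network Rain -> WetGrass; all numbers are illustrative.
P_RAIN = {True: 0.2, False: 0.8}
P_WET_GIVEN_RAIN = {
    True: {True: 0.9, False: 0.1},   # P(wet | rain)
    False: {True: 0.1, False: 0.9},  # P(wet | no rain)
}


def weight(rain, wet):
    """Weight of one model: the product of the CPT entries it selects."""
    return P_RAIN[rain] * P_WET_GIVEN_RAIN[rain][wet]


def wmc(evidence):
    """Sum the weights of all models that agree with the evidence."""
    total = 0.0
    for rain, wet in product([True, False], repeat=2):
        model = {"rain": rain, "wet": wet}
        if all(model[name] == value for name, value in evidence.items()):
            total += weight(rain, wet)
    return total


def query(target, evidence):
    """Conditional probability as a ratio of two weighted model counts."""
    return wmc({**evidence, **target}) / wmc(evidence)


if __name__ == "__main__":
    print(query({"rain": True}, {"wet": True}))
```

Enumeration is exponential in the number of variables, which is why practical WMC systems first compile the model into a compact representation and then reuse it across queries.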

    Understanding disease processes by partitioned dynamic Bayesian networks

    Contains fulltext: 158122.pdf (publisher's version) (Closed access)

    Understanding disease processes by partitioned dynamic Bayesian networks

    For many clinical problems, the underlying pathophysiological process changes over time as a result of medical interventions. In model building for such problems, the typical scarcity of data in a clinical setting has often been compensated for by using time-homogeneous models, such as dynamic Bayesian networks. As a consequence, the specificities of the underlying process are lost in the resulting models. In the current work, we propose the new concept of partitioned dynamic Bayesian networks to capture distribution regime changes, i.e. time non-homogeneity, benefiting from an intuitive and compact representation with the solid theoretical foundation of Bayesian network models. To balance specificity and simplicity in real-world scenarios, we propose a heuristic algorithm to search for and learn these non-homogeneous models, taking into account a preference for less complex models. An extensive set of experiments was run, in which simulation experiments show that the heuristic algorithm was capable of constructing well-suited solutions, in terms of goodness of fit and statistical distance to the original distributions, in consonance with the underlying processes that generated the data, whether they were homogeneous or non-homogeneous. Finally, a case study on psychotic depression was conducted using non-homogeneous models learned by the heuristic, leading to insightful answers to clinically relevant questions concerning the dynamics of this mental disorder. © 2016 Elsevier Inc. All rights reserved.