3 research outputs found

    Modeling and analysis of disease and risk factors through learning Bayesian networks from observational data

    Full text link
    This paper focuses on identification of the relationships between a disease and its potential risk factors using Bayesian networks in an epidemiologic study, with the emphasis on integrating medical domain knowledge and statistical data analysis. An integrated approach is developed to identify the risk factors associated with patients' occupational histories and is demonstrated using real-world data. This approach includes several steps. First, raw data are preprocessed into a format that is acceptable to the learning algorithms of Bayesian networks. Some important considerations are discussed to address the uniqueness of the data and the challenges of the learning. Second, a Bayesian network is learned from the preprocessed data set by integrating medical domain knowledge and generic learning algorithms. Third, the relationships revealed by the Bayesian network are used for risk factor analysis, including identification of a group of people who share certain common characteristics and have a relatively high probability of developing the disease, and prediction of a person's risk of developing the disease given information on his/her occupational history. Copyright © 2007 John Wiley & Sons, Ltd.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/58076/1/893_ftp.pd

    Discovering Causal Dependencies in Mobile Context-Aware Recommenders

    Get PDF

    A Study of Causal Discovery With Weak Links and Small Samples

    No full text
    Weak causal relationships and small sample size pose two significant difficulties to the automatic discovery of causal models from observational data. This paper examines the influence of weak causal links and varying sample sizes on the discovery of causal models. The experimental results illustrate the effect of larger sample sizes for discovering causal models reliably and the relevance of the strength of causal links and the complexity of the original causal model. We present indicative evidence of the superior robustness of MML (Minimum Message Length) methods to standard significance tests in the recovery of causal links. The comparative results show that the MML-CI (the MML Causal Inducer) causal discovery system finds better models than TETRAD II given small samples from linear causal models. The experimental results also reveal that MML-CI finds weak links with smaller sample sizes than can TETRAD II
    corecore