31,009 research outputs found

    Probabilistic Graphical Model Representation in Phylogenetics

    Get PDF
    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (1) reproducibility of an analysis, (2) model development and (3) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and non-specialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis-Hastings or Gibbs sampling of the posterior distribution

    Link Prediction in Complex Networks: A Survey

    Full text link
    Link prediction in complex networks has attracted increasing attention from both physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summaries recent progress about link prediction algorithms, emphasizing on the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanism and classification of partially labelled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.Comment: 44 pages, 5 figure

    The Complexity of Human Walking: A Knee Osteoarthritis Study

    Get PDF
    This study proposes a framework for deconstructing complex walking patterns to create a simple principal component space before checking whether the projection to this space is suitable for identifying changes from the normality. We focus on knee osteoarthritis, the most common knee joint disease and the second leading cause of disability. Knee osteoarthritis affects over 250 million people worldwide. The motivation for projecting the highly dimensional movements to a lower dimensional and simpler space is our belief that motor behaviour can be understood by identifying a simplicity via projection to a low principal component space, which may reflect upon the underlying mechanism. To study this, we recruited 180 subjects, 47 of which reported that they had knee osteoarthritis. They were asked to walk several times along a walkway equipped with two force plates that capture their ground reaction forces along 3 axes, namely vertical, anterior-posterior, and medio-lateral, at 1000 Hz. Data when the subject does not clearly strike the force plate were excluded, leaving 1–3 gait cycles per subject. To examine the complexity of human walking, we applied dimensionality reduction via Probabilistic Principal Component Analysis. The first principal component explains 34% of the variance in the data, whereas over 80% of the variance is explained by 8 principal components or more. This proves the complexity of the underlying structure of the ground reaction forces. To examine if our musculoskeletal system generates movements that are distinguishable between normal and pathological subjects in a low dimensional principal component space, we applied a Bayes classifier. For the tested cross-validated, subject-independent experimental protocol, the classification accuracy equals 82.62%. Also, a novel complexity measure is proposed, which can be used as an objective index to facilitate clinical decision making. This measure proves that knee osteoarthritis subjects exhibit more variability in the two-dimensional principal component space

    Calculating and understanding the value of any type of match evidence when there are potential testing errors

    Get PDF
    It is well known that Bayes’ theorem (with likelihood ratios) can be used to calculate the impact of evidence, such as a ‘match’ of some feature of a person. Typically the feature of interest is the DNA profile, but the method applies in principle to any feature of a person or object, including not just DNA, fingerprints, or footprints, but also more basic features such as skin colour, height, hair colour or even name. Notwithstanding concerns about the extensiveness of databases of such features, a serious challenge to the use of Bayes in such legal contexts is that its standard formulaic representations are not readily understandable to non-statisticians. Attempts to get round this problem usually involve representations based around some variation of an event tree. While this approach works well in explaining the most trivial instance of Bayes’ theorem (involving a single hypothesis and a single piece of evidence) it does not scale up to realistic situations. In particular, even with a single piece of match evidence, if we wish to incorporate the possibility that there are potential errors (both false positives and false negatives) introduced at any stage in the investigative process, matters become very complex. As a result we have observed expert witnesses (in different areas of speciality) routinely ignore the possibility of errors when presenting their evidence. To counter this, we produce what we believe is the first full probabilistic solution of the simple case of generic match evidence incorporating both classes of testing errors. Unfortunately, the resultant event tree solution is too complex for intuitive comprehension. And, crucially, the event tree also fails to represent the causal information that underpins the argument. In contrast, we also present a simple-to-construct graphical Bayesian Network (BN) solution that automatically performs the calculations and may also be intuitively simpler to understand. Although there have been multiple previous applications of BNs for analysing forensic evidence—including very detailed models for the DNA matching problem, these models have not widely penetrated the expert witness community. Nor have they addressed the basic generic match problem incorporating the two types of testing error. Hence we believe our basic BN solution provides an important mechanism for convincing experts—and eventually the legal community—that it is possible to rigorously analyse and communicate the full impact of match evidence on a case, in the presence of possible error
    • 

    corecore