
    Simulations and cosmological inference: A statistical model for power spectra means and covariances

    We describe an approximate statistical model for the sample variance distribution of the non-linear matter power spectrum that can be calibrated from limited numbers of simulations. Our model retains the common assumption of a multivariate Normal distribution for the power spectrum band powers, but takes full account of the (parameter-dependent) power spectrum covariance. The model is calibrated using an extension of the framework in Habib et al. (2007) to train Gaussian processes for the power spectrum mean and covariance given a set of simulation runs over a hypercube in parameter space. We demonstrate the performance of this machinery by estimating the parameters of a power-law model for the power spectrum. Within this framework, our calibrated sample variance distribution is robust to errors in the estimated covariance and shows rapid convergence of the posterior parameter constraints with the number of training simulations. Comment: 14 pages, 3 figures, matches final version published in PR
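
As a reading aid, the multivariate Normal assumption with a parameter-dependent covariance can be written out as follows. The symbols are assumptions introduced here and are not taken from the paper: \hat{P} is the vector of N_b measured band powers, \mu(\theta) and C(\theta) are the model mean and covariance at parameters \theta. The point of the sketch is that the determinant term can no longer be dropped as a constant once C depends on \theta.

```latex
-2 \ln \mathcal{L}(\theta)
  = \bigl[\hat{P} - \mu(\theta)\bigr]^{\mathsf{T}}
    C(\theta)^{-1}
    \bigl[\hat{P} - \mu(\theta)\bigr]
  + \ln \det C(\theta)
  + N_b \ln 2\pi
```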

    PhyloNet: a software package for analyzing and reconstructing reticulate evolutionary relationships

    Background: Phylogenies, i.e., the evolutionary histories of groups of taxa, play a major role in representing the interrelationships among biological entities. Many software tools for reconstructing and evaluating such phylogenies have been proposed, almost all of which assume the underlying evolutionary history to be a tree. While trees give a satisfactory first-order approximation for many families of organisms, other families exhibit evolutionary mechanisms that cannot be represented by trees. Processes such as horizontal gene transfer (HGT), hybrid speciation, and interspecific recombination, collectively referred to as reticulate evolutionary events, result in networks, rather than trees, of relationships. Various software tools have been recently developed to analyze reticulate evolutionary relationships, which include SplitsTree4, LatTrans, EEEP, HorizStory, and T-REX. Results: In this paper, we report on the PhyloNet software package, which is a suite of tools for analyzing reticulate evolutionary relationships, or evolutionary networks, which are rooted, directed, acyclic graphs, leaf-labeled by a set of taxa. These tools can be classified into four categories: (1) evolutionary network representation: reading/writing evolutionary networks in a newly devised compact form; (2) evolutionary network characterization: analyzing evolutionary networks in terms of three basic building blocks – trees, clusters, and tripartitions; (3) evolutionary network comparison: comparing two evolutionary networks in terms of topological dissimilarities, as well as fitness to sequence evolution under a maximum parsimony criterion; and (4) evolutionary network reconstruction: reconstructing an evolutionary network from a species tree and a set of gene trees. Conclusion: The software package, PhyloNet, offers an array of utilities to allow for efficient and accurate analysis of evolutionary networks. The software package will help significantly in analyzing large data sets, as well as in studying the performance of evolutionary network reconstruction methods. Further, the software package supports the proposed eNewick format for compact representation of evolutionary networks, a feature that allows for efficient interoperability of evolutionary network software tools. Currently, all utilities in PhyloNet are invoked on the command line.

    Circular Networks from Distorted Metrics

    Trees have long been used as a graphical representation of species relationships. However, complex evolutionary events, such as genetic reassortments or hybrid speciations, which occur commonly in viruses, bacteria and plants, do not fit into this elementary framework. As an alternative, various network representations have been developed. Circular networks are a natural generalization of leaf-labeled trees interpreted as split systems, that is, collections of bipartitions over leaf labels corresponding to current species. Although such networks do not explicitly model specific evolutionary events of interest, their straightforward visualization and fast reconstruction have made them a popular exploratory tool to detect network-like evolution in genetic datasets. Standard reconstruction methods for circular networks, such as Neighbor-Net, rely on an associated metric on the species set. Such a metric is first estimated from DNA sequences, which leads to a key difficulty: distantly related sequences produce statistically unreliable estimates. This is problematic for Neighbor-Net as it is based on the popular tree reconstruction method Neighbor-Joining, whose sensitivity to distance estimation errors is well established theoretically. In the tree case, more robust reconstruction methods have been developed using the notion of a distorted metric, which captures the dependence of the error in the distance through a radius of accuracy. Here we design the first circular network reconstruction method based on distorted metrics. Our method is computationally efficient. Moreover, the analysis of its radius of accuracy highlights the important role played by the maximum incompatibility, a measure of the extent to which the network differs from a tree. Comment: Submitte
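
To make the notion of a split system concrete, here is a minimal Python sketch of the standard pairwise compatibility test for bipartitions of a leaf set: two splits are compatible when at least one of the four pairwise intersections of their sides is empty, and a split system corresponds to a tree exactly when all pairs are compatible. The function names and the four-taxon example are hypothetical, and this is not the paper's reconstruction method or its maximum incompatibility measure.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Tuple

# A split is a bipartition of the taxon set, stored as its two sides.
Split = Tuple[FrozenSet[str], FrozenSet[str]]


def make_split(side: Iterable[str], taxa: Iterable[str]) -> Split:
    """Build the split side | (taxa minus side)."""
    one = frozenset(side)
    return (one, frozenset(taxa) - one)


def compatible(s1: Split, s2: Split) -> bool:
    """Two splits are compatible if some pair of sides does not intersect."""
    return any(not (a & b) for a in s1 for b in s2)


def is_tree_like(splits: Iterable[Split]) -> bool:
    """A split system fits on a tree iff its splits are pairwise compatible."""
    return all(compatible(s, t) for s, t in combinations(list(splits), 2))


# Hypothetical four-taxon example.
TAXA = {"a", "b", "c", "d"}
S_AB = make_split({"a", "b"}, TAXA)  # ab | cd
S_AC = make_split({"a", "c"}, TAXA)  # ac | bd
S_A = make_split({"a"}, TAXA)        # a | bcd
```

In this example, `S_AB` and `S_A` can sit on the same tree, while `S_AB` and `S_AC` conflict, which is the kind of signal a network representation is meant to display.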

    Phylogenetic networks: modeling, reconstructibility, and accuracy

    Phylogenetic networks model the evolutionary history of sets of organisms when events such as hybrid speciation and horizontal gene transfer occur. In spite of their widely acknowledged importance in evolutionary biology, phylogenetic networks have so far been studied mostly for specific data sets. We present a general definition of phylogenetic networks in terms of directed acyclic graphs (DAGs) and a set of conditions. Further, we distinguish between model networks and reconstructible ones and characterize the effect of extinction and taxon sampling on the reconstructibility of the network. Simulation studies are a standard technique for assessing the performance of phylogenetic methods. A main step in such studies entails quantifying the topological error between the model and inferred phylogenies. While many measures of tree topological accuracy have been proposed, none exist for phylogenetic networks. Previously, we proposed the first such measure, which applied only to a restricted class of networks. In this paper, we extend that measure to apply to all networks, and prove that it is a metric on the space of phylogenetic networks. Our results allow for the systematic study of existing network methods, and for the design of new accurate ones.

    An Evaluation of Methods for Inferring Boolean Networks from Time-Series Data

    Regulatory networks play a central role in cellular behavior and decision making. Learning these regulatory networks is a major task in biology, and devising computational methods and mathematical models for this task is a major endeavor in bioinformatics. Boolean networks have been used extensively for modeling regulatory networks. In this model, the state of each gene can be either ‘on’ or ‘off’, and the next state of a gene is updated, synchronously or asynchronously, according to a Boolean rule that is applied to the current state of the entire system. Inferring a Boolean network from a set of experimental data entails two main steps: first, the experimental time-series data are discretized into Boolean trajectories, and then, a Boolean network is learned from these Boolean trajectories. In this paper, we consider three methods for data discretization, including a new one we propose, and three methods for learning Boolean networks, and study the performance of all nine possible combinations on four regulatory systems of varying dynamical complexity. We find that employing the right combination of methods for data discretization and network learning results in Boolean networks that capture the dynamics well and provide predictive power. Our findings are in contrast to a recent survey that placed Boolean networks on the low end of the “faithfulness to biological reality” and “ability to model dynamics” spectra. Further, contrary to the common argument in favor of Boolean networks, we find that a relatively large number of time points in the time-series data is required to learn good Boolean networks for certain data sets. Last but not least, while methods have been proposed for inferring Boolean networks, as discussed above, publicly available implementations thereof are still missing. Here, we make our implementation of the methods available publicly in open source at http://bioinfo.cs.rice.edu/
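
The synchronous update described in this abstract can be illustrated with a minimal Python sketch. The three-gene network, its gene names, and its rules are hypothetical and chosen only for illustration; this is not the authors' implementation, and it covers neither discretization nor network learning.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]          # gene name -> on (True) or off (False)
Rule = Callable[[State], bool]   # Boolean rule reading the whole current state

# Hypothetical three-gene network; names and rules are illustrative only.
RULES: Dict[str, Rule] = {
    "a": lambda s: not s["c"],
    "b": lambda s: s["a"],
    "c": lambda s: s["a"] and s["b"],
}


def step_synchronous(state: State, rules: Dict[str, Rule]) -> State:
    """Evaluate every rule on the same current state, then update all genes at once."""
    return {gene: rule(state) for gene, rule in rules.items()}


def trajectory(start: State, rules: Dict[str, Rule], n_steps: int) -> List[State]:
    """Return the start state followed by n_steps synchronous updates."""
    states = [start]
    for _ in range(n_steps):
        states.append(step_synchronous(states[-1], rules))
    return states


START: State = {"a": True, "b": False, "c": False}
```

Calling `trajectory(START, RULES, 5)` yields a Boolean trajectory of the kind that, according to the abstract, the learning step takes as input. An asynchronous scheme would instead update one gene at a time.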

    High School Students' Proficiency and Confidence Levels in Displaying Their Understanding of Basic Electrolysis Concepts

    This study was conducted with 330 Form 4 (grade 10) students (aged 15–16 years) who were involved in a course of instruction on electrolysis concepts. The main purposes of this study were (1) to assess high school chemistry students’ understanding of 19 major principles of electrolysis using a recently developed 2-tier multiple-choice diagnostic instrument, the Electrolysis Diagnostic Instrument (EDI), and (2) to assess students’ confidence levels in displaying their knowledge and understanding of these electrolysis concepts. Analysis of students’ responses to the EDI showed that they displayed very limited understanding of the electrolytic processes involving molten compounds and aqueous solutions of compounds, with a mean score of 6.82 (out of a possible maximum of 17). Students were found to possess content knowledge about several electrolysis processes but did not provide suitable explanations for the changes that had occurred, with less than 45% of students displaying scientifically acceptable understandings about electrolysis. In addition, students displayed limited confidence about making the correct selections for the items; yet, in 16 of the 17 items, the percentage of students who were confident that they had selected the correct answer to an item was higher than the actual percentage of students who correctly answered the corresponding item. The findings suggest several implications for classroom instruction on the electrolysis topic that need to be addressed in order to facilitate better understanding by students of electrolysis concepts.

    Beyond representing orthology relations by trees

    Reconstructing the evolutionary past of a family of genes is an important aspect of many genomic studies. To help with this, simple relations on a set of sequences called orthology relations may be employed. In addition to being interesting from a practical point of view, they are also attractive from a theoretical perspective in that, e.g., a characterization is known for when such a relation is representable by a certain type of phylogenetic tree. For an orthology relation inferred from real biological data it is, however, generally too much to hope that it satisfies that characterization. Rather than trying to correct the data in some way, which has its own drawbacks, we propose to represent an orthology relation δ in terms of a structure more general than a phylogenetic tree, called a phylogenetic network. To compute such a network in the form of a level-1 representation for δ, we formalize an orthology relation in terms of the novel concept of a symbolic 3-dissimilarity, which is motivated by the biological concept of a “cluster of orthologous groups”, or COG for short. For such maps, which assign symbols rather than real values to elements, we introduce the novel Network-Popping algorithm, which has several attractive properties. In addition, we characterize an orthology relation δ on some set X that has a level-1 representation in terms of eight natural properties for δ as well as in terms of level-1 representations of orthology relations on certain subsets of X.

    Using Interviews in CER Projects: Options, Considerations, and Limitations

    Interviews can be a powerful chemistry education research tool. Unlike an assessment score or a Likert-scale survey number, interviews give the researcher a way to examine and describe what cannot be seen directly, such as feelings, thoughts, or explanations of thinking or behavior. Most people have no doubt seen countless interviews on TV news and talk shows. These sessions might convey interviewing as a spontaneous, easy, and straightforward process. However, using interviews as a meaningful research tool requires considerable thought, preparation, and practice. This chapter provides a general introduction to the use of interviews as a research tool within a chemistry education research context, including how to plan, conduct, and analyze interviews. It highlights important considerations for designing and conducting fruitful interviews, provides examples of different ways in which interviews have been used effectively in chemistry education research, and supplies additional references for the reader who wants to delve more deeply into particular topics.