86 research outputs found

    Improved gene tree error correction in the presence of horizontal gene transfer

    Motivation: The accurate inference of gene trees is a necessary step in many evolutionary studies. Although the problem of accurate gene tree inference has received considerable attention, most existing methods are only applicable to gene families unaffected by horizontal gene transfer. As a result, the accurate inference of gene trees affected by horizontal gene transfer remains a largely unaddressed problem. Results: In this study, we introduce a new and highly effective method for gene tree error correction in the presence of horizontal gene transfer. Our method efficiently models horizontal gene transfers, gene duplications and losses, and uses a statistical hypothesis testing framework [Shimodaira–Hasegawa (SH) test] to balance sequence likelihood with topological information from a known species tree. Using a thorough simulation study, we show that existing phylogenetic methods yield inaccurate gene trees when applied to horizontally transferred gene families and that our method dramatically improves gene tree accuracy. We apply our method to a dataset of 11 cyanobacterial species and demonstrate the large impact of gene tree accuracy on downstream evolutionary analyses. Availability and implementation: An implementation of our method is available at http://compbio.mit.edu/treefix-dtl/. Funding: National Science Foundation (U.S.) (CAREER Award 0644282); National Institutes of Health (U.S.) (RC2 HG005639); National Science Foundation (U.S.), Assembling the Tree of Life (Program) (0936234); University of Connecticu

    Auto Calibration and Optimization of Large-Scale Water Resources Systems

    Modelling water resource systems has long been a challenge. While methodological advances have kept pace with developments in computer science, researchers now confront larger and more complex water resource systems because of increased water demand, climate change, human interventions, socio-economic concerns, and the need for environmental protection and sustainability. In this research, an automatic calibration scheme based on mathematical programming is applied to a large-scale water resource model of Gilan. The calibration estimates unknown return flows from demand sites in the complex Sefidroud irrigation network and other related areas. The calibration procedure is validated by comparing several historical gauged river outflows from the system with model results. The calibration results are reasonable and give a plausible picture of the system. Subsequently, the calibrated parameters were used in a basin-scale linear optimization model to evaluate the system's performance under a future reduced-inflow scenario. Results showed an acceptable match between predicted and observed outflows from the system at selected hydrometric stations. Moreover, an efficient operating policy was determined for the Sefidroud dam, leading to minimum water shortage in the reduced-inflow scenario

    Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss

    Motivation: Gene family evolution is driven by evolutionary events such as speciation, gene duplication, horizontal gene transfer and gene loss, and inferring these events in the evolutionary history of a given gene family is a fundamental problem in comparative and evolutionary genomics with numerous important applications. Solving this problem requires the use of a reconciliation framework, where the input consists of a gene family phylogeny and the corresponding species phylogeny, and the goal is to reconcile the two by postulating speciation, gene duplication, horizontal gene transfer and gene loss events. This reconciliation problem is referred to as duplication-transfer-loss (DTL) reconciliation and has been extensively studied in the literature. Yet, even the fastest existing algorithms for DTL reconciliation are too slow for reconciling large gene families and for use in more sophisticated applications such as gene tree or species tree reconstruction

    Beyond representing orthology relations by trees

    Reconstructing the evolutionary past of a family of genes is an important aspect of many genomic studies. To help with this, simple relations on a set of sequences called orthology relations may be employed. In addition to being interesting from a practical point of view, they are also attractive from a theoretical perspective in that, e.g., a characterization is known for when such a relation is representable by a certain type of phylogenetic tree. For an orthology relation inferred from real biological data it is, however, generally too much to hope that it satisfies that characterization. Rather than trying to correct the data in some way or another, which has its own drawbacks, we propose to represent an orthology relation δ in terms of a structure more general than a phylogenetic tree, called a phylogenetic network. To compute such a network in the form of a level-1 representation for δ, we formalize an orthology relation in terms of the novel concept of a symbolic 3-dissimilarity, which is motivated by the biological concept of a "cluster of orthologous groups", or COG for short. For such maps, which assign symbols rather than real values to elements, we introduce the novel Network-Popping algorithm, which has several attractive properties. In addition, we characterize an orthology relation δ on some set X that has a level-1 representation in terms of eight natural properties for δ as well as in terms of level-1 representations of orthology relations on certain subsets of X

    Evolution through segmental duplications and losses : A Super-Reconciliation approach

    The classical gene and species tree reconciliation, used to infer the history of gene gain and loss explaining the evolution of gene families, assumes that each family evolves independently. While this assumption is reasonable for genes that are far apart in the genome, it is not appropriate for genes grouped into syntenic blocks, which are more plausibly the result of concerted evolution. Here, we introduce the Super-Reconciliation problem, which consists of inferring a history of segmental duplication and loss events (involving a set of neighboring genes) leading to a set of present-day syntenies from a single ancestral one. In other words, we extend the traditional Duplication-Loss reconciliation problem of a single gene tree to a set of trees, accounting for segmental duplications and losses. The existence of a Super-Reconciliation depends on individual gene tree consistency. In addition, ignoring rearrangements implies that existence also depends on gene order consistency. We first show that the problem of reconstructing a most parsimonious Super-Reconciliation, if any, is NP-hard and give an exact exponential-time algorithm to solve it. Alternatively, we show that accounting for rearrangements in the evolutionary model, but still only minimizing segmental duplication and loss events, leads to an exact polynomial-time algorithm. We finally assess the time efficiency of the former exponential-time algorithm for the Duplication-Loss model on simulated datasets, and give a proof of concept on the opioid receptor genes
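
    For orientation, the traditional Duplication-Loss reconciliation of a single gene tree, which the abstract above takes as its starting point, can be sketched with the standard lowest-common-ancestor (LCA) mapping. This is only an illustrative sketch of the classical single-tree case, not the Super-Reconciliation algorithm of the paper. The nested-tuple tree encoding, the function names and the `species_of` dictionary are assumptions made for this example, and both trees are assumed to be binary with unique leaf labels.

```python
def index_species_tree(tree):
    """Return (parent, depth) dicts for a nested-tuple binary species tree."""
    parent, depth = {tree: None}, {tree: 0}
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):  # internal node: record its children
            for child in node:
                parent[child] = node
                depth[child] = depth[node] + 1
                stack.append(child)
    return parent, depth


def lca(a, b, parent, depth):
    """Lowest common ancestor of species nodes a and b."""
    while depth[a] > depth[b]:
        a = parent[a]
    while depth[b] > depth[a]:
        b = parent[b]
    while a != b:
        a, b = parent[a], parent[b]
    return a


def reconcile(gene_tree, species_tree, species_of):
    """Count duplications and losses under the LCA mapping.

    species_of (hypothetical helper input) maps each gene leaf to its species.
    """
    parent, depth = index_species_tree(species_tree)
    dups = losses = 0

    def visit(g):
        nonlocal dups, losses
        if not isinstance(g, tuple):  # gene leaf maps to its own species
            return species_of[g]
        left, right = (visit(c) for c in g)
        m = lca(left, right, parent, depth)
        # A node is a duplication when it maps to the same species as a child.
        is_dup = m in (left, right)
        dups += is_dup
        for child_map in (left, right):
            # Each species node skipped on the way down implies one loss.
            losses += depth[child_map] - depth[m] - (0 if is_dup else 1)
        return m

    visit(gene_tree)
    return dups, losses


species_tree = (("A", "B"), "C")
species_of = {"a1": "A", "b1": "B", "c1": "C"}
print(reconcile((("a1", "b1"), "c1"), species_tree, species_of))  # (0, 0)
print(reconcile((("a1", "c1"), "b1"), species_tree, species_of))  # (1, 3)
```

    The first gene tree agrees with the species tree and needs no events. The second groups the A and C genes together, which the LCA mapping explains with one duplication at the root and three losses.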

    Predictors of Chemosensitivity in Triple Negative Breast Cancer: An Integrated Genomic Analysis

    Background: Triple negative breast cancer (TNBC) is a highly heterogeneous and aggressive disease, and although no effective targeted therapies are available to date, about one-third of patients with TNBC achieve pathologic complete response (pCR) from standard-of-care anthracycline/taxane (ACT) chemotherapy. The heterogeneity of these tumors, however, has hindered the discovery of effective biomarkers to identify such patients. Methods and Findings: We performed whole exome sequencing on 29 TNBC cases from the MD Anderson Cancer Center (MDACC) selected because they had either pCR (n = 18) or extensive residual disease (n = 11) after neoadjuvant chemotherapy, with cases from The Cancer Genome Atlas (TCGA; n = 144) and METABRIC (n = 278) cohorts serving as validation cohorts. Our analysis revealed that mutations in the AR- and FOXA1-regulated networks, in which BRCA1 plays a key role, are associated with significantly higher sensitivity to ACT chemotherapy in the MDACC cohort (pCR rate of 94.1% compared to 16.6% in tumors without mutations in AR/FOXA1 pathway, adjusted p = 0.02) and significantly better survival outcome in the TCGA TNBC cohort (log-rank test, p = 0.05). Combined analysis of DNA sequencing, DNA methylation, and RNA sequencing identified tumors of a distinct BRCA-deficient (BRCA-D) TNBC subtype characterized by low levels of wild-type BRCA1/2 expression. Patients with functionally BRCA-D tumors had significantly better survival with standard-of-care chemotherapy than patients whose tumors were not BRCA-D (log-rank test, p = 0.021), and they had significantly higher mutation burden (p < 0.001) and presented clonal neoantigens that were associated with increased immune cell activity. A transcriptional signature of BRCA-D TNBC tumors was independently validated to be significantly associated with improved survival in the METABRIC dataset (log-rank test, p = 0.009). As a retrospective study, limitations include the small size and potential selection bias in the discovery cohort. Conclusions: The comprehensive molecular analysis presented in this study directly links BRCA deficiency with increased clonal mutation burden and significantly enhanced chemosensitivity in TNBC and suggests that functional RNA-based BRCA deficiency needs to be further examined in TNBC. © 2016 Jiang et al