108 research outputs found

    A prognostic model of all-cause mortality at 30 days in patients with cancer and COVID-19

    Get PDF
    Background: Patients with cancer are at higher risk of dying of COVID-19. Known risk factors for 30-day all-cause mortality (ACM-30) in patients with cancer are older age, sex, smoking status, performance status, obesity, and co-morbidities. We hypothesized that common clinical and laboratory parameters would be predictive of a higher risk of 30-day ACM, and that a machine learning approach (random forest) could produce high accuracy. Methods: In this multi-institutional COVID-19 and Cancer Consortium (CCC19) registry study, 12,661 patients enrolled between March 17, 2020 and December 31, 2021 were utilized to develop and validate a model of ACM-30. ACM-30 was defined as death from any cause within 30 days of COVID-19 diagnosis. Pre-specified variables were: age, sex, race, smoking status, ECOG performance status (PS), timing of cancer treatment relative to COVID19 diagnosis, severity of COVID19, type of cancer, and other laboratory measurements. Missing variables were imputed using random forest proximity. Random forest was utilized to model ACM-30. The area under the curve (AUC) was computed as a measure of predictive accuracy with out-of-bag prediction. One hundred bootstrapped samples were used to obtain the standard error of the AUC. Results: The median age at COVID-19 diagnosis was 65 years, 53% were female, 18% were Hispanic, and 16.7% were Black. Over half were never smokers and the median body mass index was 28.2. Random forest with under sampling selected 20 factors prognostic of ACM-30. The AUC was 88.9 (95% CI 88.5-89.2). Highly informative parameters included: COVID-19 severity at presentation, cancer status, age, troponin level, ECOG PS and body mass index. Conclusions: This prognostic model based on readily available clinical and laboratory values can be used to estimate individual survival probability within 30-days for COVID-19. In addition, this model can be used to select or classify patients with cancer and COVID-19 into risk groups based on validated cut points, for treatment selection, prophylaxis prioritization, and/or enrollment in clinical trials. Future work includes external validation using other large datasets of patients with COVID-19 and cancer

    Networks link antigenic and receptor-binding sites of influenza hemagglutinin: Mechanistic insight into fitter strain propagation

    Get PDF
    Influenza viral passaging through pre-vaccinated mice shows that emergent antigenic site mutations on the viral hemagglutinin (HA) impact host receptor-binding affinity and, therefore, the evolution of fitter influenza strains. To understand this phenomenon, we computed the Significant Interactions Network (SIN) for each residue and mapped the networks of antigenic site residues on a representative H1N1 HA. Specific antigenic site residues are β€˜linked’ to receptor-binding site (RBS) residues via their SIN and mutations within β€œRBS-linked” antigenic residues can significantly influence receptor-binding affinity by impacting the SIN of key RBS residues. In contrast, other antigenic site residues do not have such β€œRBS-links” and do not impact receptor-binding affinity upon mutation. Thus, a potential mechanism emerges for how immunologic pressure on RBS-linked antigenic residues can contribute to evolution of fitter influenza strains by modulating the host receptor-binding affinity

    Validation of Coevolving Residue Algorithms via Pipeline Sensitivity Analysis: ELSC and OMES and ZNMI, Oh My!

    Get PDF
    Correlated amino acid substitution algorithms attempt to discover groups of residues that co-fluctuate due to either structural or functional constraints. Although these algorithms could inform both ab initio protein folding calculations and evolutionary studies, their utility for these purposes has been hindered by a lack of confidence in their predictions due to hard to control sources of error. To complicate matters further, naive users are confronted with a multitude of methods to choose from, in addition to the mechanics of assembling and pruning a dataset. We first introduce a new pair scoring method, called ZNMI (Z-scored-product Normalized Mutual Information), which drastically improves the performance of mutual information for co-fluctuating residue prediction. Second and more important, we recast the process of finding coevolving residues in proteins as a data-processing pipeline inspired by the medical imaging literature. We construct an ensemble of alignment partitions that can be used in a cross-validation scheme to assess the effects of choices made during the procedure on the resulting predictions. This pipeline sensitivity study gives a measure of reproducibility (how similar are the predictions given perturbations to the pipeline?) and accuracy (are residue pairs with large couplings on average close in tertiary structure?). We choose a handful of published methods, along with ZNMI, and compare their reproducibility and accuracy on three diverse protein families. We find that (i) of the algorithms tested, while none appear to be both highly reproducible and accurate, ZNMI is one of the most accurate by far and (ii) while users should be wary of predictions drawn from a single alignment, considering an ensemble of sub-alignments can help to determine both highly accurate and reproducible couplings. Our cross-validation approach should be of interest both to developers and end users of algorithms that try to detect correlated amino acid substitutions

    Inference of Co-Evolving Site Pairs: an Excellent Predictor of Contact Residue Pairs in Protein 3D structures

    Get PDF
    Residue-residue interactions that fold a protein into a unique three-dimensional structure and make it play a specific function impose structural and functional constraints on each residue site. Selective constraints on residue sites are recorded in amino acid orders in homologous sequences and also in the evolutionary trace of amino acid substitutions. A challenge is to extract direct dependences between residue sites by removing indirect dependences through other residues within a protein or even through other molecules. Recent attempts of disentangling direct from indirect dependences of amino acid types between residue positions in multiple sequence alignments have revealed that the strength of inferred residue pair couplings is an excellent predictor of residue-residue proximity in folded structures. Here, we report an alternative attempt of inferring co-evolving site pairs from concurrent and compensatory substitutions between sites in each branch of a phylogenetic tree. First, branch lengths of a phylogenetic tree inferred by the neighbor-joining method are optimized as well as other parameters by maximizing a likelihood of the tree in a mechanistic codon substitution model. Mean changes of quantities, which are characteristic of concurrent and compensatory substitutions, accompanied by substitutions at each site in each branch of the tree are estimated with the likelihood of each substitution. Partial correlation coefficients of the characteristic changes along branches between sites are calculated and used to rank co-evolving site pairs. Accuracy of contact prediction based on the present co-evolution score is comparable to that achieved by a maximum entropy model of protein sequences for 15 protein families taken from the Pfam release 26.0. Besides, this excellent accuracy indicates that compensatory substitutions are significant in protein evolution.Comment: 17 pages, 4 figures, and 4 tables with supplementary information of 5 figure

    Bandwidth is Political: Reachability in the Public Internet

    Full text link

    Protein 3D Structure Computed from Evolutionary Sequence Variation

    Get PDF
    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing

    ePlant and the 3D Data Display Initiative: Integrative Systems Biology on the World Wide Web

    Get PDF
    Visualization tools for biological data are often limited in their ability to interactively integrate data at multiple scales. These computational tools are also typically limited by two-dimensional displays and programmatic implementations that require separate configurations for each of the user's computing devices and recompilation for functional expansion. Towards overcoming these limitations we have developed β€œePlant” (http://bar.utoronto.ca/eplant) – a suite of open-source world wide web-based tools for the visualization of large-scale data sets from the model organism Arabidopsis thaliana. These tools display data spanning multiple biological scales on interactive three-dimensional models. Currently, ePlant consists of the following modules: a sequence conservation explorer that includes homology relationships and single nucleotide polymorphism data, a protein structure model explorer, a molecular interaction network explorer, a gene product subcellular localization explorer, and a gene expression pattern explorer. The ePlant's protein structure explorer module represents experimentally determined and theoretical structures covering >70% of the Arabidopsis proteome. The ePlant framework is accessed entirely through a web browser, and is therefore platform-independent. It can be applied to any model organism. To facilitate the development of three-dimensional displays of biological data on the world wide web we have established the β€œ3D Data Display Initiative” (http://3ddi.org)

    Integrated Analysis of Residue Coevolution and Protein Structure in ABC Transporters

    Get PDF
    Intraprotein side chain contacts can couple the evolutionary process of amino acid substitution at one position to that at another. This coupling, known as residue coevolution, may vary in strength. Conserved contacts thus not only define 3-dimensional protein structure, but also indicate which residue-residue interactions are crucial to a protein’s function. Therefore, prediction of strongly coevolving residue-pairs helps clarify molecular mechanisms underlying function. Previously, various coevolution detectors have been employed separately to predict these pairs purely from multiple sequence alignments, while disregarding available structural information. This study introduces an integrative framework that improves the accuracy of such predictions, relative to previous approaches, by combining multiple coevolution detectors and incorporating structural contact information. This framework is applied to the ABC-B and ABC-C transporter families, which include the drug exporter P-glycoprotein involved in multidrug resistance of cancer cells, as well as the CFTR chloride channel linked to cystic fibrosis disease. The predicted coevolving pairs are further analyzed based on conformational changes inferred from outward- and inward-facing transporter structures. The analysis suggests that some pairs coevolved to directly regulate conformational changes of the alternating-access transport mechanism, while others to stabilize rigid-body-like components of the protein structure. Moreover, some identified pairs correspond to residues previously implicated in cystic fibrosis

    Role of Hsp70 ATPase Domain Intrinsic Dynamics and Sequence Evolution in Enabling its Functional Interactions with NEFs

    Get PDF
    Catalysis of ADP-ATP exchange by nucleotide exchange factors (NEFs) is central to the activity of Hsp70 molecular chaperones. Yet, the mechanism of interaction of this family of chaperones with NEFs is not well understood in the context of the sequence evolution and structural dynamics of Hsp70 ATPase domains. We studied the interactions of Hsp70 ATPase domains with four different NEFs on the basis of the evolutionary trace and co-evolution of the ATPase domain sequence, combined with elastic network modeling of the collective dynamics of the complexes. Our study reveals a subtle balance between the intrinsic (to the ATPase domain) and specific (to interactions with NEFs) mechanisms shared by the four complexes. Two classes of key residues are distinguished in the Hsp70 ATPase domain: (i) highly conserved residues, involved in nucleotide binding, which mediate, via a global hinge-bending, the ATPase domain opening irrespective of NEF binding, and (ii) not-conserved but co-evolved and highly mobile residues, engaged in specific interactions with NEFs (e.g., N57, R258, R262, E283, D285). The observed interplay between these respective intrinsic (pre-existing, structure-encoded) and specific (co-evolved, sequence-dependent) interactions provides us with insights into the allosteric dynamics and functional evolution of the modular Hsp70 ATPase domain

    The Contribution of Coevolving Residues to the Stability of KDO8P Synthase

    Get PDF
    The evolutionary tree of 3-deoxy-D-manno-octulosonate 8-phosphate (KDO8P) synthase (KDO8PS), a bacterial enzyme that catalyzes a key step in the biosynthesis of bacterial endotoxin, is evenly divided between metal and non-metal forms, both having similar structures, but diverging in various degrees in amino acid sequence. Mutagenesis, crystallographic and computational studies have established that only a few residues determine whether or not KDO8PS requires a metal for function. The remaining divergence in the amino acid sequence of KDO8PSs is apparently unrelated to the underlying catalytic mechanism.The multiple alignment of all known KDO8PS sequences reveals that several residue pairs coevolved, an indication of their possible linkage to a structural constraint. In this study we investigated by computational means the contribution of coevolving residues to the stability of KDO8PS. We found that about 1/4 of all strongly coevolving pairs probably originated from cycles of mutation (decreasing stability) and suppression (restoring it), while the remaining pairs are best explained by a succession of neutral or nearly neutral covarions.Both sequence conservation and coevolution are involved in the preservation of the core structure of KDO8PS, but the contribution of coevolving residues is, in proportion, smaller. This is because small stability gains or losses associated with selection of certain residues in some regions of the stability landscape of KDO8PS are easily offset by a large number of possible changes in other regions. While this effect increases the tolerance of KDO8PS to deleterious mutations, it also decreases the probability that specific pairs of residues could have a strong contribution to the thermodynamic stability of the protein
    • …
    corecore