25 research outputs found

    Coevolution Network Models Predict the Impact of Multiple Mutations on Protein Function

    Proteins often evolve new functions by acquiring a small number of mutations in an ancestral sequence that lacks the phenotype. Modeling the functional effect of a mutation is, however, a nontrivial task, due to strong functional interdependencies. Here, I used the recent evolution of the bacterial enzyme TEM β-lactamase under antibiotic selection as a model for genetic adaptation. I compiled a database of TEM β-lactamase sequences evolved under antibiotic resistance selective pressure and identified functional interactions between individual mutations/mutated residues. I built network models of coevolving residues (possible functional interactions), in which nodes are mutations and edges represent coevolution between two mutations. I reconstructed both the alignment-based and phylogeny-based mutation coevolution networks and assessed the utility of network-theoretical tools to derive information regarding the role of individual mutations in the observed resistance. Coevolution network analysis reveals key properties of mutations in the evolution of antibiotic resistance, many of which were confirmed through extensive fitness measurements in the lab and by previous experimental studies of TEM β-lactamase function. One finding is that mutations form densely connected clusters in the network, corresponding to selection by different main classes of antibiotics or to different adaptive strategies within the same antibiotic class. Mutations that are central in the network tend to be either adaptive or compensate for effects of many other mutations. By extending node centrality metrics to paths of mutations (connected nodes in the network), I was able to study properties of adaptive evolutionary trajectories in TEM. I found that central paths are enriched in non-negative functional interactions. Specifically, paths corresponding to triple mutants were experimentally shown to increase fitness from all or most of their constituent single and double mutants. It was also shown that relative rankings of central paths and their constituent shorter paths can be used to predict the direction of fitness change in an evolutionary trajectory. In this way, this predictor of the effect of an evolutionary trajectory can be useful in anticipating evolution of antibiotic resistance. In summary, my analysis of the combined functional effects of mutations in producing new biological activities should help anticipate evolution driven by a variety of clinically relevant selections such as drug resistance, virulence, and immunity.
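
    The idea of scoring nodes by centrality and then extending the score to paths can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the edge list is a toy example with placeholder labels, degree centrality stands in for whichever centrality metrics the author used, and averaging node scores along a path is only one possible way to extend a node metric to paths.

```python
from collections import defaultdict

# Toy edge list with placeholder mutation labels (not data from the study).
EDGES = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "E")]


def degree_centrality(edges):
    """Normalized degree centrality: degree divided by (node count - 1)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def path_centrality(path, centrality):
    """Assumed path score: mean centrality of the nodes on the path."""
    return sum(centrality[m] for m in path) / len(path)


centrality = degree_centrality(EDGES)
triple_score = path_centrality(("A", "B", "C"), centrality)
pair_score = path_centrality(("D", "E"), centrality)
```

    Under these assumptions, the densely connected triple scores higher than the isolated pair, which is the kind of relative ranking the abstract describes using to compare a path with its shorter constituent paths.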

    Genomic characterization of malignant progression in neoplastic pancreatic cysts

    Intraductal papillary mucinous neoplasms (IPMNs) and mucinous cystic neoplasms (MCNs) are non-invasive neoplasms that are often observed in association with invasive pancreatic cancers, but their origins and evolutionary relationships are poorly understood. In this study, we analyze 148 samples from IPMNs, MCNs, and small associated invasive carcinomas from 18 patients using whole exome or targeted sequencing. Using evolutionary analyses, we establish that both IPMNs and MCNs are direct precursors to pancreatic cancer. Mutations in SMAD4 and TGFBR2 are frequently restricted to invasive carcinoma, while RNF43 alterations are largely in non-invasive lesions. Genomic analyses suggest an average window of over three years between the development of high-grade dysplasia and pancreatic cancer. Taken together, these data establish non-invasive IPMNs and MCNs as origins of invasive pancreatic cancer, identifying potential drivers of invasion, highlighting the complex clonal dynamics prior to malignant transformation, and providing opportunities for early detection and intervention.

    Network Models of TEM β-Lactamase Mutations Coevolving under Antibiotic Selection Show Modular Structure and Anticipate Evolutionary Trajectories

    Understanding how novel functions evolve (genetic adaptation) is a critical goal of evolutionary biology. Among asexual organisms, genetic adaptation involves multiple mutations that frequently interact in a non-linear fashion (epistasis). Non-linear interactions pose a formidable challenge for the computational prediction of mutation effects. Here we use the recent evolution of β-lactamase under antibiotic selection as a model for genetic adaptation. We build a network of coevolving residues (possible functional interactions), in which nodes are mutant residue positions and links represent two positions found mutated together in the same sequence. Most often these pairs occur in the setting of more complex mutants. Focusing on extended-spectrum resistant sequences, we use network-theoretical tools to identify triple mutant trajectories of likely special significance for adaptation. We extrapolate evolutionary paths (n = 3) that increase resistance and that are longer than the units used to build the network (n = 2). These paths consist of a limited number of residue positions and are enriched for known triple mutant combinations that increase cefotaxime resistance. We find that the pairs of residues used to build the network frequently decrease resistance compared to their corresponding singlets. This is a surprising result, given that their coevolution suggests a selective advantage. Thus, β-lactamase adaptation is highly epistatic. Our method can identify triplets that increase resistance despite the underlying rugged fitness landscape and has the unique ability to make predictions by placing each mutant residue position in its functional context. Our approach requires only sequence information, sufficient genetic diversity, and discrete selective pressures. Thus, it can be used to analyze recent evolutionary events, where coevolution analysis methods that use phylogeny or statistical coupling are not possible. Improving our ability to assess evolutionary trajectories will help predict the evolution of clinically relevant genes and aid in protein design.
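
    The network construction described above (nodes are mutated positions, links join positions mutated together in the same sequence) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' pipeline: the input variants and position numbers are toy values, and treating fully connected triples as candidate n = 3 paths is a simplifying assumption, since the abstract does not specify how triplets are selected or ranked.

```python
from collections import Counter
from itertools import combinations

# Toy input: each variant is the set of residue positions mutated in one
# sequence. The position numbers are placeholders, not real data.
VARIANTS = [{10, 30}, {10, 20, 30}, {40, 50}, {20, 30}]


def cooccurrence_edges(variants):
    """Count how often each pair of positions is mutated in the same sequence."""
    counts = Counter()
    for positions in variants:
        for pair in combinations(sorted(positions), 2):
            counts[pair] += 1
    return counts


def candidate_triplets(edges):
    """Assumed criterion: triples whose three constituent pairs are all linked."""
    nodes = sorted({p for pair in edges for p in pair})
    return [
        triple
        for triple in combinations(nodes, 3)
        if all(pair in edges for pair in combinations(triple, 2))
    ]


edges = cooccurrence_edges(VARIANTS)
triplets = candidate_triplets(edges)
```

    Note that a triple can emerge as a candidate from pairwise links alone; here one of the toy variants happens to carry all three positions, but the criterion would also accept a triple that was never observed as a complete triple mutant.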

    SubClonal Hierarchy Inference from Somatic Mutations: Automatic Reconstruction of Cancer Evolutionary Trees from Multi-region Next Generation Sequencing

    Recent improvements in next-generation sequencing of tumor samples and the ability to identify somatic mutations at low allelic fractions have opened the way for new approaches to model the evolution of individual cancers. The power and utility of these models are increased when tumor samples from multiple sites are sequenced. Temporal ordering of the samples may provide insight into the etiology of both primary and metastatic lesions and rationalizations for tumor recurrence and therapeutic failures. Additional insights may be provided by temporal ordering of evolving subclones (cellular subpopulations with unique mutational profiles). Current methods for subclone hierarchy inference tightly couple the problem of temporal ordering with that of estimating the fraction of cancer cells harboring each mutation. We present a new framework that includes a rigorous statistical hypothesis test and a collection of tools that make it possible to decouple these problems, which we believe will enable substantial progress in the field of subclone hierarchy inference. The methods presented here can be flexibly combined with methods developed by others addressing either of these problems. We provide tools to interpret hypothesis test results, which inform phylogenetic tree construction, and we introduce the first genetic algorithm designed for this purpose. The utility of our framework is systematically demonstrated in simulations. For most tested combinations of tumor purity, sequencing coverage, and tree complexity, good power (≥ 0.8) can be achieved and Type 1 error is well controlled when at least three tumor samples are available from a patient. Using data from three published multi-region tumor sequencing studies of (murine) small cell lung cancer, acute myeloid leukemia, and chronic lymphocytic leukemia, in which the authors reconstructed subclonal phylogenetic trees by manual expert curation, we show how different configurations of our tools can identify either a single tree in agreement with the authors, or a small set of trees, which include the authors’ preferred tree. Our results have implications for improved modeling of tumor evolution and the importance of multi-region tumor sequencing.

    Overview of SCHISM framework.

    The framework decouples estimation of somatic mutation cellularities and reconstruction of subclone phylogenies. Given somatic mutation read counts from next generation sequencing data and somatic copy number calls if available, any tools for mutation cellularity estimation and mutation clustering can be applied. Their output is used to estimate the statistical support for temporal ordering of mutation or mutation cluster pairs, using a generalized likelihood ratio test (GLRT). Other approaches to tree reconstruction can be applied, by using the fitness function as the objective for optimization. GA = genetic algorithm, WGS = whole genome sequencing, WES = whole exome sequencing, DS = (targeted) deep sequencing. KDE = kernel density estimation. POV = precedence order violation.
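
    To make the notion of a precedence order violation (POV) concrete, here is a small hypothetical sketch of how pairwise test outcomes could be used to penalize a candidate tree. It is not the SCHISM implementation: the GLRT itself is not reproduced, the set of rejected orderings is a toy stand-in for its output, and counting violations over all ancestor-descendant pairs is an assumption about how such a penalty might be defined.

```python
# Toy candidate trees given as child -> parent maps (root has parent None).
# Cluster names are placeholders.
TREE_A = {"c0": None, "c1": "c0", "c2": "c1", "c3": "c0"}
TREE_B = {"c0": None, "c1": "c0", "c2": "c0", "c3": "c0"}

# Assumed test output: ordered pairs (x, y) for which a pairwise test
# rejected the hypothesis that x could precede y.
REJECTED = {("c1", "c2")}


def ancestors(node, parent):
    """Return all ancestors of a node, nearest first (assumes a valid tree)."""
    chain = []
    while parent.get(node) is not None:
        node = parent[node]
        chain.append(node)
    return chain


def count_violations(parent, rejected):
    """Count ancestor-descendant pairs whose ordering the test rejected."""
    return sum(
        (anc, node) in rejected
        for node in parent
        for anc in ancestors(node, parent)
    )


violations_a = count_violations(TREE_A, REJECTED)
violations_b = count_violations(TREE_B, REJECTED)
```

    In this toy case, the tree that places c2 below c1 incurs one violation and the tree that does not incurs none, so an optimizer minimizing such a penalty would prefer the second topology.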

    Crossover operation.

    A reproductive crossover operation involving a pair of parental trees is used to generate diversity among topologies in members of each generation produced by the genetic algorithm.

    Performance of the genetic algorithm evaluated in two stages.

    Stage 1: Fraction of simulation runs where the genetic algorithm’s fitness function identified either a single maximum fitness tree (A1) or two maximum fitness trees (B1). Stage 2: Given success in stage 1, fraction of simulation runs where the correct tree was either the single maximum fitness tree (A2) or one of the top two maximum fitness trees (B2). For each combination of coverage and purity, results are shown for trees with node counts from three to eight. Simulations where (sample count) ≥ (node count) are marked by a double circle.