45 research outputs found

    Of Bits and Bugs — On the Use of Bioinformatics and a Bacterial Crystal Structure to Solve a Eukaryotic Repeat-Protein Structure

    Pur-α is a nucleic acid-binding protein involved in cell cycle control, transcription, and neuronal function. Initially no prediction of the three-dimensional structure of Pur-α was possible. However, recently we solved the X-ray structure of Pur-α from the fruit fly Drosophila melanogaster and showed that it contains a so-called PUR domain. Here we explain how we exploited bioinformatics tools in combination with X-ray structure determination of a bacterial homolog to obtain diffracting crystals and the high-resolution structure of Drosophila Pur-α. First, we used sensitive methods for remote-homology detection to find three repetitive regions in Pur-α. We realized that our lack of understanding of how these repeats interact to form a globular domain was a major problem for crystallization and structure determination. With our information on the repeat motifs we then identified a distant bacterial homolog that contains only one repeat. We determined the bacterial crystal structure and found that two of the repeats interact to form a globular domain. Based on this bacterial structure, we calculated a computational model of the eukaryotic protein. The model allowed us to design a crystallizable fragment and to determine the structure of Drosophila Pur-α. Key to success was the fact that single repeats of the bacterial protein self-assembled into a globular domain, instructing us on the number and boundaries of repeats to be included for crystallization trials with the eukaryotic protein. This study demonstrates that the simpler structural domain arrangement of a distant prokaryotic protein can guide the design of eukaryotic crystallization constructs. Since many eukaryotic proteins contain multiple repeats or repeating domains, this approach might be instructive for structural studies of a range of proteins.

    Discovery of Ongoing Selective Sweeps within Anopheles Mosquito Populations Using Deep Learning

    Identification of partial sweeps, which include both hard and soft sweeps that have not yet reached fixation, provides crucial information about ongoing evolutionary responses. To this end, we introduce partialS/HIC, a deep learning method to discover selective sweeps from population genomic data. partialS/HIC uses a convolutional neural network for image processing, which is trained with a large suite of summary statistics derived from coalescent simulations incorporating population-specific history, to distinguish between completed versus partial sweeps, hard versus soft sweeps, and regions directly affected by selection versus those merely linked to nearby selective sweeps. We perform several simulation experiments under various demographic scenarios to demonstrate partialS/HIC's performance, which exhibits excellent resolution for detecting partial sweeps. We also apply our classifier to whole genomes from eight mosquito populations sampled across sub-Saharan Africa by the Anopheles gambiae 1000 Genomes Consortium, elucidating both continent-wide patterns as well as sweeps unique to specific geographic regions. These populations have experienced intense insecticide exposure over the past two decades, and we observe a strong overrepresentation of sweeps at insecticide resistance loci. Our analysis thus provides a list of candidate adaptive loci that may be relevant to mosquito control efforts. More broadly, our supervised machine learning approach introduces a method to distinguish between completed and partial sweeps, as well as between hard and soft sweeps, under a variety of demographic scenarios. As whole-genome data rapidly accumulate for a greater diversity of organisms, partialS/HIC addresses an increasing demand for useful selection scan tools that can track in-progress evolutionary dynamics.

    Absolute risk and predictors of the growth of acute spontaneous intracerebral haemorrhage: a systematic review and meta-analysis of individual patient data.

    Background: Intracerebral haemorrhage growth is associated with poor clinical outcome and is a therapeutic target for improving outcome. We aimed to determine the absolute risk and predictors of intracerebral haemorrhage growth, develop and validate prediction models, and evaluate the added value of CT angiography.
    Methods: In a systematic review of OVID MEDLINE (with additional hand-searching of relevant studies' bibliographies) from Jan 1, 1970, to Dec 31, 2015, we identified observational cohorts and randomised trials with repeat scanning protocols that included at least ten patients with acute intracerebral haemorrhage. We sought individual patient-level data from corresponding authors for patients aged 18 years or older with data available from brain imaging initially done 0·5–24 h and repeated fewer than 6 days after symptom onset, who had baseline intracerebral haemorrhage volume of less than 150 mL, and did not undergo acute treatment that might reduce intracerebral haemorrhage volume. We estimated the absolute risk and predictors of the primary outcome of intracerebral haemorrhage growth (defined as >6 mL increase in intracerebral haemorrhage volume on repeat imaging) using multivariable logistic regression models in development and validation cohorts in four subgroups of patients, using a hierarchical approach: patients not taking anticoagulant therapy at intracerebral haemorrhage onset (who constituted the largest subgroup), patients taking anticoagulant therapy at intracerebral haemorrhage onset, patients from cohorts that included at least some patients taking anticoagulant therapy at intracerebral haemorrhage onset, and patients for whom both information about anticoagulant therapy at intracerebral haemorrhage onset and spot sign on acute CT angiography were known.
    Findings: Of 4191 studies identified, 77 were eligible for inclusion. Overall, 36 (47%) cohorts provided data on 5435 eligible patients. 5076 of these patients were not taking anticoagulant therapy at symptom onset (median age 67 years, IQR 56–76), of whom 1009 (20%) had intracerebral haemorrhage growth. Multivariable models of patients with data on antiplatelet therapy use, data on anticoagulant therapy use, and assessment of CT angiography spot sign at symptom onset showed that time from symptom onset to baseline imaging (odds ratio 0·50, 95% CI 0·36–0·70; p<0·0001), intracerebral haemorrhage volume on baseline imaging (7·18, 4·46–11·60; p<0·0001), antiplatelet use (1·68, 1·06–2·66; p=0·026), and anticoagulant use (3·48, 1·96–6·16; p<0·0001) were independent predictors of intracerebral haemorrhage growth (C-index 0·78, 95% CI 0·75–0·82). Addition of CT angiography spot sign (odds ratio 4·46, 95% CI 2·95–6·75; p<0·0001) to the model increased the C-index by 0·05 (95% CI 0·03–0·07).
    Interpretation: In this large patient-level meta-analysis, models using four or five predictors had acceptable to good discrimination. These models could inform the location and frequency of observations on patients in clinical practice, explain treatment effects in prior randomised trials, and guide the design of future trials.

    pabu_Evol2017poster.ai

    Evolution 2017 poster

    The roads from phenotypic variation to gene discovery: mutagenesis versus QTLs.

    In model organisms, chemical mutagenesis provides a powerful alternative to natural, polygenic variation (for example, quantitative trait loci (QTLs)) for identifying functional pathways and complex disease genes. Despite recent progress in QTLs, we expect that mutagenesis will ultimately prove more effective because the prospects of gene identification are high and every gene affecting a trait is potentially a target.
