
    The evolutionary dynamics of variant antigen genes in Babesia reveal a history of genomic innovation underlying host-parasite interaction

    Babesia spp. are tick-borne, intraerythrocytic hemoparasites that use antigenic variation to resist host immunity, through sequential modification of the parasite-derived variant erythrocyte surface antigen (VESA) expressed on the infected red blood cell surface. We identified the genomic processes driving antigenic diversity in genes encoding VESA (ves1) through comparative analysis within and between three Babesia species (B. bigemina, B. divergens and B. bovis). Ves1 structure diverges rapidly after speciation, notably through the evolution of shortened forms (ves2) from 5′ ends of canonical ves1 genes. Phylogenetic analyses show that ves1 genes are transposed between loci routinely, whereas ves2 genes are not. Similarly, analysis of sequence mosaicism shows that recombination drives variation in ves1 sequences, but less so for ves2, indicating the adoption of different mechanisms for variation of the two families. Proteomic analysis of the B. bigemina PR isolate shows that two dominant VESA1 proteins are expressed in the population, whereas numerous VESA2 proteins are co-expressed, consistent with differential transcriptional regulation of each family. Hence, VESA2 proteins are abundant and previously unrecognized elements of Babesia biology, with evolutionary dynamics consistently different to those of VESA1, suggesting that their functions are distinct.

    Regional perturbation of gene transcription is associated with intrachromosomal rearrangements and gene fusion transcripts in high grade ovarian cancer.

    Genomic rearrangements are a hallmark of cancer biology and progression, allowing cells to rapidly transform through alterations in regulatory structures, changes in expression patterns, reprogramming of signaling pathways, and creation of novel transcripts via gene fusion events. Though functional gene fusions encoding oncogenic proteins are the most dramatic outcomes of genomic rearrangements, we investigated the relationship between rearrangements evidenced by fusion transcripts and local expression changes in cancer using transcriptome data alone. We analyzed 9,953 gene fusion predictions from 418 primary serous ovarian cancer tumors, identifying depletions of gene fusion breakpoints within coding regions of fused genes as well as an N-terminal enrichment of breakpoints within fused genes. We identified 48 genes with significant fusion-associated upregulation and furthermore demonstrate that significant regional overexpression of intact genes in patient transcriptomes occurs within 1 megabase of 78 novel gene fusions that function as central markers of these regions. We reveal that cancer transcriptomes select for gene fusions that preserve protein and protein domain coding potential. The association of gene fusion transcripts with neighboring gene overexpression supports rearrangements as a mechanism through which cancer cells remodel their transcriptomes and identifies a new way to use gene fusions as indicators of regional expression changes in diseased cells with only transcriptomic data.

    Structural evolution drives diversification of the large LRR-RLK gene family

    Cells are continuously exposed to chemical signals that they must discriminate between and respond to appropriately. In embryophytes, the leucine‐rich repeat receptor‐like kinases (LRR‐RLKs) are signal receptors critical in development and defense. LRR‐RLKs have diversified to hundreds of genes in many plant genomes. Although intensively studied, a well‐resolved LRR‐RLK gene tree has remained elusive. To resolve the LRR‐RLK gene tree, we developed an improved gene discovery method based on iterative hidden Markov model searching and phylogenetic inference. We used this method to infer complete gene trees for each of the LRR‐RLK subclades and reconstructed the deepest nodes of the full gene family. We discovered that the LRR‐RLK gene family is even larger than previously thought, and that protein domain gains and losses are prevalent. These structural modifications, some of which likely predate embryophyte diversification, led to misclassification of some LRR‐RLK variants as members of other gene families. Our work corrects this misclassification. Our results reveal ongoing structural evolution generating novel LRR‐RLK genes. These new genes are raw material for the diversification of signaling in development and defense. Our methods also enable phylogenetic reconstruction in any large gene family.

    Computational strategies for dissecting the high-dimensional complexity of adaptive immune repertoires

    The adaptive immune system recognizes antigens via an immense array of antigen-binding antibodies and T-cell receptors, the immune repertoire. The interrogation of immune repertoires is of high relevance for understanding the adaptive immune response in disease and infection (e.g., autoimmunity, cancer, HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the quantitative and molecular-level profiling of immune repertoires, thereby revealing the high-dimensional complexity of the immune receptor sequence landscape. Several methods for the computational and statistical analysis of large-scale AIRR-seq data have been developed to resolve immune repertoire complexity in order to understand the dynamics of adaptive immunity. Here, we review the current research on (i) diversity, (ii) clustering and network, (iii) phylogenetic and (iv) machine learning methods applied to dissect, quantify and compare the architecture, evolution, and specificity of immune repertoires. We summarize outstanding questions in computational immunology and propose future directions for systems immunology towards coupling AIRR-seq with the computational discovery of immunotherapeutics, vaccines, and immunodiagnostics.

    Landscape of standing variation for tandem duplications in Drosophila yakuba and Drosophila simulans

    We have used whole genome paired-end Illumina sequence data to identify tandem duplications in 20 isofemale lines of D. yakuba and 20 isofemale lines of D. simulans, and performed genome-wide validation with PacBio long molecule sequencing. We identify 1,415 tandem duplications that are segregating in D. yakuba as well as 975 duplications in D. simulans, indicating greater variation in D. yakuba. Additionally, we observe high rates of secondary deletions at duplicated sites, with 8% of duplicated sites in D. simulans and 17% of sites in D. yakuba modified with deletions. These secondary deletions are consistent with the action of the large loop mismatch repair system acting to remove polymorphic tandem duplication, resulting in rapid dynamics of gain and loss in duplicated alleles and a richer substrate of genetic novelty than has been previously reported. Most duplications are present in only single strains, suggesting deleterious impacts are common. D. simulans shows larger numbers of whole gene duplications in comparison to larger proportions of gene fragments in D. yakuba. D. simulans displays an excess of high frequency variants on the X chromosome, consistent with adaptive evolution through duplications on the D. simulans X or demographic forces driving duplicates to high frequency. We identify 78 chimeric genes in D. yakuba and 38 chimeric genes in D. simulans, as well as 143 cases of recruited non-coding sequence in D. yakuba and 96 in D. simulans, in agreement with rates of chimeric gene origination in D. melanogaster. Together, these results suggest that tandem duplications often result in complex variation beyond whole gene duplications that offers a rich substrate of standing variation that is likely to contribute both to detrimental phenotypes and disease, as well as to adaptive evolutionary change. Comment: Revised version, accepted at Molecular Biology and Evolution.

    Genome-wide analyses of Liberibacter species provides insights into evolution, phylogenetic relationships, and virulence factors.

    'Candidatus Liberibacter' species are insect-transmitted, phloem-limited α-Proteobacteria in the order Rhizobiales. The citrus industry is facing significant challenges due to huanglongbing, associated with infection from 'Candidatus Liberibacter asiaticus' (Las). In order to gain greater insight into 'Ca. Liberibacter' biology and genetic diversity, we have performed genome sequencing and comparative analyses of diverse 'Ca. Liberibacter' species, including those that can infect citrus. Our phylogenetic analysis differentiates 'Ca. Liberibacter' species and Rhizobiales in separate clades and suggests stepwise evolution from a common ancestor splitting first into nonpathogenic Liberibacter crescens followed by diversification of pathogenic 'Ca. Liberibacter' species. Further analysis of Las genomes from different geographical locations revealed diversity among isolates from the United States. Our phylogenetic study also indicates multiple Las introduction events in California and spread of the pathogen from Florida to Texas. Texan Las isolates were closely related, while Florida and Asian isolates exhibited the most genetic variation. We have identified conserved Sec translocon (SEC)-dependent effectors likely involved in bacterial survival and virulence of Las and analysed their expression in their plant host (citrus) and insect vector (Diaphorina citri). Individual SEC-dependent effectors exhibited differential expression patterns between host and vector, indicating that Las uses its effector repertoire to differentially modulate diverse organisms. Collectively, this work provides insights into the evolution of 'Ca. Liberibacter' species and the introduction of Las in the United States, and identifies promising Las targets for disease management.

    The genome sequence of Trypanosoma brucei gambiense, causative agent of chronic Human African Trypanosomiasis

    Background: Trypanosoma brucei gambiense is the causative agent of chronic Human African Trypanosomiasis or sleeping sickness, a disease endemic across often poor and rural areas of Western and Central Africa. We have previously published the genome sequence of a T. b. brucei isolate, and have now employed a comparative genomics approach to understand the scale of genomic variation between T. b. gambiense and the reference genome. We sought to identify features that were uniquely associated with T. b. gambiense and its ability to infect humans. Methods and findings: An improved high-quality draft genome sequence for the group 1 T. b. gambiense DAL 972 isolate was produced using a whole-genome shotgun strategy. Comparison with T. b. brucei showed that sequence identity averages 99.2% in coding regions, and gene order is largely collinear. However, variation associated with segmental duplications and tandem gene arrays suggests some reduction of functional repertoire in T. b. gambiense DAL 972. A comparison of the variant surface glycoproteins (VSG) in T. b. brucei with all T. b. gambiense sequence reads showed that the essential structural repertoire of VSG domains is conserved across T. brucei. Conclusions: This study provides the first estimate of intraspecific genomic variation within T. brucei, and so has important consequences for future population genomics studies. We have shown that the T. b. gambiense genome corresponds closely with the reference, which should therefore be an effective scaffold for any T. brucei genome sequence data. As VSG repertoire is also well conserved, it may be feasible to describe the total diversity of variant antigens. While we describe several as yet uncharacterized gene families with predicted cell surface roles that were expanded in number in T. b. brucei, no T. b. gambiense-specific gene was identified outside of the subtelomeres that could explain the ability to infect humans.

    Chlamydia pan-genomic analysis reveals balance between host adaptation and selective pressure to genome reduction

    Get PDF
    Background: Chlamydia are ancient intracellular pathogens with a reduced, though strikingly conserved, genome. Despite their parasitic lifestyle and isolated intracellular environment, these bacteria managed to avoid accumulation of deleterious mutations leading to the subsequent genome degradation characteristic of many parasitic bacteria. Results: We report a pan-genomic analysis of sixteen species from the genus Chlamydia, including identification and functional annotation of orthologous genes, and characterization of gene gains, losses, and rearrangements. We demonstrate the overall genome stability of these bacteria as indicated by a large fraction of common genes with conserved genomic locations. On the other hand, extreme evolvability is confined to several paralogous gene families, such as polymorphic membrane proteins and phospholipase D, and is likely caused by pressure from the host immune system. Conclusions: This combination of a large, conserved core genome and a small, evolvable periphery likely reflects the balance between the selective pressure towards genome reduction and the need to adapt to escape from host immunity.

    Evolutionary and Molecular Analysis of Conserved Vertebrate Immunity to Fungi

    The innate immune system is highly conserved amongst all multicellular organisms. Yet a constant battle exists between host cells and pathogens due to the rapid evolution of immune system components. Functional genomics and in silico methods can be employed to elucidate the evolutionary patterns of vertebrate immunity to pathogenic fungi such as Candida albicans, an opportunistic fungal pathogen that can cause lethal candidiasis in the immunocompromised. Mammals such as humans and mice possess conserved C-type lectin receptors that recognize the C. albicans cell wall. However, these receptors have not been identified in fish. Here, I describe how we identified potential zebrafish fungal recognition receptors in silico to elucidate the evolution of vertebrate immunity to fungi and to integrate cost-effective zebrafish into candidiasis research. Phylogenetic and synteny analyses identified three potential receptors with conserved motifs for fungal recognition. Cell lines secreting soluble versions of these potential receptors were generated, and the proteins were purified. These receptors are currently being analyzed for their microbial recognition characteristics through immunofluorescence microscopy. Determining the specificity of these proteins may enhance our understanding of how the innate immune system evolved in lower and higher vertebrates. Furthermore, understanding such dynamics is an initial step toward developing novel anti-fungal therapeutics for commercially valuable fish and uncovering the fundamental mechanisms of immunity to fungi.