65 research outputs found

    Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments

    EVidenceModeler (EVM) is an automated annotation tool that predicts protein-coding regions, alternatively spliced transcripts and untranslated regions of eukaryotic genes

    Regulation and Autoregulation of the Promoter for the Latency-associated Nuclear Antigen of Kaposi's Sarcoma-associated Herpesvirus

    Kaposi's sarcoma-associated herpesvirus (KSHV) or human herpesvirus 8 has been established as the etiological agent of Kaposi's sarcoma and certain AIDS-associated lymphomas. KSHV establishes latent infection in these tumors, invariably expressing high levels of the viral latency-associated nuclear antigen (LANA) protein. LANA is necessary and sufficient to maintain the KSHV episome. It also modulates viral and cellular transcription and has been implicated directly in oncogenesis because of its ability to bind to the p53 and pRb tumor suppressor proteins. Previously, we identified the LANA promoter (LANAp) and showed that it was positively regulated by LANA itself. Here, we present a detailed mutational analysis and define cis-acting elements and trans-acting factors for the core LANAp. We found that a downstream promoter element, TATA box, and GC box/Sp1 site at -29 are all individually required for activity. This architecture places LANAp into the small and unusual group of eukaryotic promoters that contain both the downstream promoter element and TATA element but lack a defined initiation site. Furthermore, we demonstrate that LANA regulates its own promoter via its C-terminal domain and does bind to a defined site within the core promoter

    The TIGR Rice Genome Annotation Resource: improvements and new features

    In The Institute for Genomic Research Rice Genome Annotation project, we have continued to update the rice genome sequence with new data and improve the quality of the annotation. In our current release of annotation (Release 4.0; January 12, 2006), we have identified 42 653 non-transposable element-related genes encoding 49 472 gene models as a result of the detection of alternative splicing. We have refined our identification methods for transposable element-related genes resulting in 13 237 genes that are related to transposable elements. Through incorporation of multiple transcript and proteomic expression data sets, we have been able to annotate 24 799 genes (31 739 gene models), representing ∼50% of the total gene models, as expressed in the rice genome. All structural and functional annotation is viewable through our Rice Genome Browser which currently supports 59 tracks. Enhanced data access is available through web interfaces, FTP downloads and a Data Extractor tool developed in order to support discrete dataset downloads

    Re-annotation of the Theileria parva genome refines 53% of the proteome and uncovers essential components of N-glycosylation, a conserved pathway in many organisms

    The apicomplexan parasite Theileria parva causes a livestock disease called East coast fever (ECF), with millions of animals at risk in sub-Saharan East and Southern Africa, the geographic distribution of T. parva. Over a million bovines die each year of ECF, with a tremendous economic burden to pastoralists in endemic countries. Comprehensive, accurate parasite genome annotation can facilitate the discovery of novel chemotherapeutic targets for disease treatment, as well as elucidate the biology of the parasite. However, genome annotation remains a significant challenge because of limitations in the quality and quantity of the data being used to inform the location and function of protein-coding genes and, when RNA data are used, the underlying biological complexity of the processes involved in gene expression. Here, we apply our recently published RNAseq dataset derived from the schizont life-cycle stage of T. parva to update structural and functional gene annotations across the entire nuclear genome. The re-annotation effort led to evidence-supported updates in over half of all protein-coding sequence (CDS) predictions, including exon changes, gene merges and gene splitting, an increase in average CDS length of approximately 50 base pairs, and the identification of 128 new genes. Among the new genes identified were those involved in N-glycosylation, a process previously thought not to exist in this organism and a potentially new chemotherapeutic target pathway for treating ECF. Alternatively-spliced genes were identified, and antisense and multi-gene family transcription were extensively characterized. The process of re-annotation led to novel insights into the organization and expression profiles of protein-coding sequences in this parasite, and uncovered a minimal N-glycosylation pathway that changes our current understanding of the evolution of this post-translational modification in apicomplexan parasites

    The evolution of synaptic and cognitive capacity: insights from the nervous system transcriptome of Aplysia

    © The Author(s), 2022. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Orvis, J., Albertin, C., Shrestha, P., Chen, S., Zheng, M., Rodriguez, C., Tallon, L., Mahurkar, A., Zimin, A., Kim, M., Liu, K., Kandel, E., Fraser, C., Sossin, W., & Abrams, T. The evolution of synaptic and cognitive capacity: insights from the nervous system transcriptome of Aplysia. Proceedings of the National Academy of Sciences of the United States of America, 119(28), (2022): e2122301119, https://doi.org/10.1073/pnas.2122301119. The gastropod mollusk Aplysia is an important model for cellular and molecular neurobiological studies, particularly for investigations of molecular mechanisms of learning and memory. We developed an optimized assembly pipeline to generate an improved Aplysia nervous system transcriptome. This improved transcriptome enabled us to explore the evolution of cognitive capacity at the molecular level. Were there evolutionary expansions of neuronal genes between this relatively simple gastropod Aplysia (20,000 neurons) and Octopus (500 million neurons), the invertebrate with the most elaborate neuronal circuitry and greatest behavioral complexity? Are the tremendous advances in cognitive power in vertebrates explained by expansion of the synaptic proteome that resulted from multiple rounds of whole genome duplication in this clade? Overall, the complement of genes linked to neuronal function is similar between Octopus and Aplysia. As expected, a number of synaptic scaffold proteins have more isoforms in humans than in Aplysia or Octopus. However, several scaffold families present in mollusks and other protostomes are absent in vertebrates, including the Fifes, Lev10s, SOLs, and a NETO family. Thus, whereas vertebrates have more scaffold isoforms from select families, invertebrates have additional scaffold protein families not found in vertebrates. This analysis provides insights into the evolution of the synaptic proteome. Both synaptic proteins and synaptic plasticity evolved gradually, yet the last deuterostome-protostome common ancestor already possessed an elaborate suite of genes associated with synaptic function, and critical for synaptic plasticity. This work was supported by NSF EAGER Award IOS-1255695 and NIH grant R01 MH 55880 to T.W.A.; by a Natural Sciences and Engineering Research Council of Canada Discovery grant and Canadian Institutes of Health Research project grant 340328 to W.S.; by funding from the HHMI to E.R.K.; and by a Hibbitt Early Career Fellowship to C.A. W.S. is James McGill Professor at McGill University

    Genome-wide diversity and gene expression profiling of Babesia microti isolates identify polymorphic genes that mediate host-pathogen interactions

    Babesia microti, a tick-transmitted, intraerythrocytic protozoan parasite circulating mainly among small mammals, is the primary cause of human babesiosis. While most cases are transmitted by Ixodes ticks, the disease may also be transmitted through blood transfusion and perinatally. A comprehensive analysis of genome composition, genetic diversity, and gene expression profiling of seven B. microti isolates revealed that genetic variation in isolates from the Northeast United States is almost exclusively associated with genes encoding the surface proteome and secretome of the parasite. Furthermore, we found that polymorphism is restricted to a small number of genes, which are highly expressed during infection. In order to identify pathogen-encoded factors involved in host-parasite interactions, we screened a proteome array comprised of 174 B. microti proteins, including several predicted members of the parasite secretome. Using this immuno-proteomic approach we identified several novel antigens that trigger strong host immune responses during the onset of infection. The genomic and immunological data presented herein provide the first insights into the determinants of B. microti interaction with its mammalian hosts and their relevance for understanding the selective pressures acting on parasite evolution

    Capture-based enrichment of Theileria parva DNA enables full genome assembly of first buffalo-derived strain and reveals exceptional intra-specific genetic diversity

    Theileria parva is an economically important, intracellular, tick-transmitted parasite of cattle. A live vaccine against the parasite is effective against challenge from cattle-transmissible T. parva but not against genotypes originating from the African Cape buffalo, a major wildlife reservoir, prompting the need to characterize genome-wide variation within and between cattle- and buffalo-associated T. parva populations. Here, we describe a capture-based target enrichment approach that enables, for the first time, de novo assembly of nearly complete T. parva genomes derived from infected host cell lines. This approach has exceptionally high specificity and sensitivity and is successful for both cattle- and buffalo-derived T. parva parasites. De novo genome assemblies generated for cattle genotypes differ from the reference by ~54K single nucleotide polymorphisms (SNPs) throughout the 8.31 Mb genome, an average of 6.5 SNPs/kb. We report the first buffalo-derived T. parva genome, which is ~20 kb larger than the genome from the reference, cattle-derived, Muguga strain, and contains 25 new potential genes. The average non-synonymous nucleotide diversity (πN) per gene, between buffalo-derived T. parva and the Muguga strain, was 1.3%. This remarkably high level of genetic divergence is supported by an average Wright’s fixation index (FST), genome-wide, of 0.44, reflecting a degree of genetic differentiation between cattle- and buffalo-derived T. parva parasites more commonly seen between, rather than within, species. These findings present clear implications for vaccine development, further demonstrated by the ability to assemble nearly all known antigens in the buffalo-derived strain, which will be critical in design of next generation vaccines. The DNA capture approach used provides a clear advantage in specificity over alternative T. parva DNA enrichment methods used previously, such as those that utilize schizont purification, is less labor intensive, and enables in-depth comparative genomics in this apicomplexan parasite