2,767 research outputs found

    An integrative approach for codon repeats evolutionary analyses

    The relationship between genome characteristics and several human diseases has been a central research goal in genomics. Many studies have shown that specific gene patterns, such as amino acid repetitions, are associated with human diseases. However, several open questions remain, such as how these tandem repeats appeared along the evolutionary path or how they have evolved in orthologous genes of related organisms. In this paper, we present a computational solution that facilitates comparative studies of orthologous genes from various organisms. The application uses various web services to gather gene sequence information, local algorithms for tandem repeat identification, and similarity measures for gene clustering.
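
    As a rough illustration of what "tandem repeat identification" can mean for amino acid repetitions, the sketch below finds runs of a single repeated residue in a protein sequence. It is not the method used in the paper: the function name `find_amino_acid_repeats`, the `min_len` threshold, and the example sequence are all hypothetical choices made for this example.

```python
import re


def find_amino_acid_repeats(protein, min_len=4):
    """Return (residue, start, length) for every run of one residue
    repeated at least `min_len` times in a row.

    Hypothetical helper for illustration only; `min_len` is an assumed
    threshold, not a value taken from the abstract above.
    """
    # (.) captures one residue; \1{n,} requires n or more further copies of it.
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(protein)]


if __name__ == "__main__":
    # Made-up sequence containing a poly-Q run and a poly-A run.
    print(find_amino_acid_repeats("MAQQQQQQLSAAAAGK"))
```

    Positions are 0-based. A real pipeline would also need to handle imperfect repeats and multi-residue repeat units, which this sketch ignores.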

    The genome sequence of Trypanosoma brucei gambiense, causative agent of chronic Human African Trypanosomiasis

    Background: Trypanosoma brucei gambiense is the causative agent of chronic Human African Trypanosomiasis or sleeping sickness, a disease endemic across often poor and rural areas of Western and Central Africa. We have previously published the genome sequence of a T. b. brucei isolate, and have now employed a comparative genomics approach to understand the scale of genomic variation between T. b. gambiense and the reference genome. We sought to identify features that were uniquely associated with T. b. gambiense and its ability to infect humans. Methods and findings: An improved high-quality draft genome sequence for the group 1 T. b. gambiense DAL 972 isolate was produced using a whole-genome shotgun strategy. Comparison with T. b. brucei showed that sequence identity averages 99.2% in coding regions, and gene order is largely collinear. However, variation associated with segmental duplications and tandem gene arrays suggests some reduction of functional repertoire in T. b. gambiense DAL 972. A comparison of the variant surface glycoproteins (VSG) in T. b. brucei with all T. b. gambiense sequence reads showed that the essential structural repertoire of VSG domains is conserved across T. brucei. Conclusions: This study provides the first estimate of intraspecific genomic variation within T. brucei, and so has important consequences for future population genomics studies. We have shown that the T. b. gambiense genome corresponds closely with the reference, which should therefore be an effective scaffold for any T. brucei genome sequence data. As VSG repertoire is also well conserved, it may be feasible to describe the total diversity of variant antigens. While we describe several as yet uncharacterized gene families with predicted cell surface roles that were expanded in number in T. b. brucei, no T. b. gambiense-specific gene was identified outside of the subtelomeres that could explain the ability to infect humans.

    Improving accuracy of gene prediction programs of the GeneMark family by means of genome segmentation

    Issued as final report. National Institutes of Health (U.S.)

    The fate of Arabidopsis thaliana homeologous CNSs and their motifs in the Paleohexaploid Brassica rapa.

    Following polyploidy, duplicate genes are often deleted, and if they are not, then duplicate regulatory regions are sometimes lost. By what mechanism is this loss and what is the chance that such a loss removes function? To explore these questions, we followed individual Arabidopsis thaliana-A. thaliana conserved noncoding sequences (CNSs) into the Brassica ancestor, through a paleohexaploidy and into Brassica rapa. Thus, a single Brassicaceae CNS has six potential orthologous positions in B. rapa; a single Arabidopsis CNS has three potential homeologous positions. We reasoned that a CNS, if present on a singlet Brassica gene, would be unlikely to lose function compared with a more redundant CNS, and this is the case. Redundant CNSs go nondetectable often. Using this logic, each mechanism of CNS loss was assigned a metric of functionality. By definition, proved deletions do not function as sequence. Our results indicated that CNSs that go nondetectable by base substitution or large insertion are almost certainly still functional (redundancy does not matter much to their detectability frequency), whereas those lost by inferred deletion or indels are approximately 75% likely to be nonfunctional. Overall, an average nondetectable, once-redundant CNS more than 30 bp in length has a 72% chance of being nonfunctional, and that makes sense because 97% of them sort to a molecular mechanism with deletion in its description, but base substitutions do cause loss. Similarly, proved-functional G-boxes go undetectable by deletion 82% of the time. Fractionation mutagenesis is a procedure that uses polyploidy as a mutagenic agent to genetically alter RNA expression profiles, and then to construct testable hypotheses as to the function of the lost regulatory site. We show fractionation mutagenesis to be a deletion machine in the Brassica lineage.

    The repertoire of G protein-coupled receptors in the sea squirt Ciona intestinalis

    Background: G protein-coupled receptors (GPCRs) constitute a large family of integral transmembrane receptor proteins that play a central role in signal transduction in eukaryotes. The genome of the protochordate Ciona intestinalis has a compact size with an ancestral complement of many diversified gene families of vertebrates and is a good model system for studying protochordate to vertebrate diversification. An analysis of the Ciona repertoire of GPCRs from a comparative genomic perspective provides insight into the evolutionary origins of the GPCR signalling system in vertebrates. Results: We have identified 169 gene products in the Ciona genome that code for putative GPCRs. Phylogenetic analyses reveal that Ciona GPCRs have homologous representatives from the five major GRAFS (Glutamate, Rhodopsin, Adhesion, Frizzled and Secretin) families concomitant with other vertebrate GPCR repertoires. Nearly 39% of Ciona GPCRs have unambiguous orthologs of vertebrate GPCR families, as defined for the human, mouse, puffer fish and chicken genomes. The Rhodopsin family accounts for ~68% of the Ciona GPCR repertoire wherein the LGR-like subfamily exhibits a lineage specific gene expansion of a group of receptors that possess a novel domain organisation hitherto unobserved in metazoan genomes. Conclusion: Comparison of GPCRs in Ciona to that in human reveals a high level of orthology of a protochordate repertoire with that of vertebrate GPCRs. Our studies suggest that the ascidians contain the basic ancestral complement of vertebrate GPCR genes. This is evident at the subfamily level comparisons since Ciona GPCR sequences are significantly analogous to vertebrate GPCR subfamilies even while exhibiting Ciona specific genes. Our analysis provides a framework to perform future experimental and comparative studies to understand the roles of the ancestral chordate versions of GPCRs that predated the divergence of the urochordates and the vertebrates.

    Sex-Linked Pheromone Receptor Genes of the European Corn Borer, Ostrinia nubilalis, Are in Tandem Arrays

    BACKGROUND: Tuning of the olfactory system of male moths to conspecific female sex pheromones is crucial for correct species recognition; however, little is known about the genetic changes that drive speciation in this system. Moths of the genus Ostrinia are good models to elucidate this question, since significant differences in pheromone blends are observed within and among species. Odorant receptors (ORs) play a critical role in recognition of female sex pheromones; eight types of OR genes expressed in male antennae were previously reported in Ostrinia moths. METHODOLOGY/PRINCIPAL FINDINGS: We screened an O. nubilalis bacterial artificial chromosome (BAC) library by PCR, and constructed three contigs from isolated clones containing the reported OR genes. Fluorescence in situ hybridization (FISH) analysis using these clones as probes demonstrated that the largest contig, which contained eight OR genes, was located on the Z chromosome; two others harboring two and one OR genes were found on two autosomes. Sequence determination of BAC clones revealed the Z-linked OR genes were closely related and tandemly arrayed; moreover, four of them shared 181-bp direct repeats spanning exon 7 and intron 7. CONCLUSIONS/SIGNIFICANCE: This is the first report of tandemly arrayed sex pheromone receptor genes in Lepidoptera. The localization of an OR gene cluster on the Z chromosome agrees with previous findings for a Z-linked locus responsible for O. nubilalis male behavioral response to sex pheromone. The 181-bp direct repeats might enhance gene duplications by unequal crossovers. An autosomal locus responsible for male response to sex pheromone in Heliothis virescens and H. subflexa was recently reported to contain at least four OR genes. Taken together, these findings support the hypothesis that generation of additional copies of OR genes can increase the potential for male moths to acquire altered specificity for pheromone components, and accordingly, facilitate differentiation of sex pheromones.

    SynFind: Compiling Syntenic Regions across Any Set of Genomes on Demand

    The identification of conserved syntenic regions enables discovery of predicted locations for orthologous and homeologous genes, even when no such gene is present. This capability means that synteny-based methods are far more effective than sequence similarity-based methods in identifying true-negatives, a necessity for studying gene loss and gene transposition. However, the identification of syntenic regions requires complex analyses which must be repeated for pairwise comparisons between any two species. Therefore, as the number of published genomes increases, there is a growing demand for scalable, simple-to-use applications to perform comparative genomic analyses that cater to both gene family studies and genome-scale studies. We implemented SynFind, a web-based tool that addresses this need. Given one query genome, SynFind is capable of identifying conserved syntenic regions in any set of target genomes. SynFind is capable of reporting per-gene information, useful for researchers studying specific gene families, as well as genome-wide data sets of syntenic gene and predicted gene locations, critical for researchers focused on large-scale genomic analyses. Inference of syntenic homologs provides the basis for correlation of functional changes around genes of interest between related organisms. Deployed on the CoGe online platform, SynFind is connected to the genomic data from over 15,000 organisms from all domains of life as well as supporting multiple releases of the same organism. SynFind makes use of a powerful job execution framework that promises scalability and reproducibility. SynFind can be accessed at http://genomevolution.org/CoGe/SynFind.pl. A video tutorial of SynFind using Phytophthora as an example is available at http://www.youtube.com/watch?v=2Agczny9Nyc

    LINE drive: L1 element orthologous loci

    The L1Hs preTa subfamily is one of the youngest L1 families. It originated after the divergence of human and chimpanzee about 2.34 mya, and therefore is only found in the human genome. Some elements were inserted so recently that they are not fixed in the population. Thirty-three of the 254 L1Hs preTa elements are polymorphic for the absence/presence of the insertion, making them useful markers for studying phylogenetics and human population genetics. However, the problem of homoplasy can diminish the value of using L1 elements as phylogenetic and population genetic markers. Examination of the L1Hs preTa orthologous insertion sites in a range of non-human primates revealed an assortment of events that altered the size of the pre-integration or “empty” sites. Only two cases of parallel mobile element insertions into the same pre-integration sites were discovered: one involves an AluY in green monkey and the other an L1PA8 element in owl monkey. However, both elements were clearly distinguishable from their human counterparts. No preTa L1 element gene conversion events were observed in any of the loci analyzed. Therefore, we conclude that L1 elements are homoplasy-free genetic characters.

    Nonsense-Mediated Decay Enables Intron Gain in Drosophila

    Intron number varies considerably among genomes, but despite their fundamental importance, the mutational mechanisms and evolutionary processes underlying the expansion of intron number remain unknown. Here we show that Drosophila, in contrast to most eukaryotic lineages, is still undergoing a dramatic rate of intron gain. These novel introns carry significantly weaker splice sites that may impede their identification by the spliceosome. Novel introns are more likely to encode a premature termination codon (PTC), indicating that nonsense-mediated decay (NMD) functions as a backup for weak splicing of new introns. Our data suggest that new introns originate when genomic insertions with weak splice sites are hidden from selection by NMD. This mechanism reduces the sequence requirement imposed on novel introns and implies that the capacity of the spliceosome to recognize weak splice sites was a prerequisite for intron gain during eukaryotic evolution.