57 research outputs found

    Ab initio gene identification: prokaryote genome annotation with GeneScan and GLIMMER

    We compare the annotation of three complete genomes using the ab initio gene-identification methods GeneScan and GLIMMER. The annotation given in GenBank, the standard against which these are compared, was made using GeneMark. We find a number of novel genes that are predicted by both methods used here, as well as a number of genes that are predicted by GeneMark but are not identified by either of the non-consensus methods that we have used. The three organisms studied here are all prokaryotic species with fairly compact genomes. The Fourier measure forms the basis for an efficient non-consensus method for gene prediction, and the algorithm GeneScan exploits this measure. We have benchmarked this program as well as GLIMMER using three complete prokaryotic genomes. An effort has also been made to study the limitations of these techniques for complete genome analysis. GeneScan and GLIMMER are of comparable accuracy insofar as gene identification is concerned, with sensitivities and specificities typically greater than 0.9. The number of false predictions (both positive and negative) is higher for GeneScan than for GLIMMER, but in a significant number of cases the two techniques give similar results. This suggests that there could be some as-yet unidentified additional genes in these three genomes, and also that some of the putative identifications made hitherto might require re-evaluation. All these cases are discussed in detail.
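
    The sensitivity and specificity figures quoted in this abstract can be illustrated with a minimal sketch. The abstract does not state its exact definitions, so the code assumes the convention common in gene-finding benchmarks: sensitivity is the fraction of reference genes that are predicted, and specificity is the fraction of predictions that match a reference gene. The function name and the (start, end, strand) gene representation are hypothetical.

```python
def gene_level_accuracy(reference, predicted):
    """Return (sensitivity, specificity) for exact-match gene predictions.

    Assumed definitions (not taken from the abstract):
      sensitivity = TP / (TP + FN)
      specificity = TP / (TP + FP)   # gene-finding convention
    Genes are hashable records, here hypothetical (start, end, strand) tuples.
    """
    reference, predicted = set(reference), set(predicted)
    tp = len(reference & predicted)   # genes found by the predictor
    fn = len(reference - predicted)   # reference genes that were missed
    fp = len(predicted - reference)   # predictions absent from the reference
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, specificity


# Toy data, invented purely for illustration.
reference_genes = {(1, 100, "+"), (200, 500, "-"), (600, 900, "+"), (1000, 1300, "+")}
predicted_genes = {(1, 100, "+"), (200, 500, "-"), (600, 900, "+"), (1500, 1800, "-")}
sn, sp = gene_level_accuracy(reference_genes, predicted_genes)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

    Note that a prediction missing from the reference annotation counts as a false positive here, although, as the abstract points out, some such predictions may be genuine genes that the reference has not yet identified.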

    Genetic mapping of paternal sorting of mitochondria in cucumber

    Mitochondria are organelles that have their own DNA; serve as the powerhouses of eukaryotic cells; play important roles in stress responses, programmed cell death, and ageing; and in the vast majority of eukaryotes, are maternally transmitted. Strict maternal transmission of mitochondria makes it difficult to select for better-performing mitochondria, or against deleterious mutations in the mitochondrial DNA. Cucumber is a useful plant for organellar genetics because its mitochondria are paternally transmitted and it possesses one of the largest mitochondrial genomes among all eukaryotes. Recombination among repetitive motifs in the cucumber mitochondrial DNA produces rearrangements associated with strongly mosaic (MSC) phenotypes. We previously reported nuclear control of sorting among paternally transmitted mitochondrial DNAs. The goal of this project was to map paternal sorting of mitochondria as a step towards its eventual cloning. We crossed single plants from plant introduction (PI) 401734 and Cucumis sativus var. hardwickii and produced an F2 family. A total of 425 F2 plants were genotyped for molecular markers and testcrossed as the female with MSC16. Testcross families were scored for frequencies of wild-type versus MSC progenies. Discrete segregations for percent wild-type progenies were not observed and paternal sorting of mitochondria was therefore analyzed as a quantitative trait. A major quantitative trait locus (QTL; LOD >23) was mapped between two simple sequence repeats encompassing a 459-kb region on chromosome 3. Nuclear genes previously shown to affect the prevalence of mitochondrial DNAs (MSH1, OSB1, and RECA homologs) were not located near this major QTL on chromosome 3. Sequencing of this region from PI 401734, together with improved annotation of the cucumber genome, should result in the eventual cloning of paternal sorting of mitochondria and provide insights about nuclear control of organellar-DNA sorting.

    Using DNA microarrays to study host-microbe interactions.

    Complete genomic sequences of microbial pathogens and hosts offer sophisticated new strategies for studying host-pathogen interactions. DNA microarrays exploit primary sequence data to measure transcript levels and detect sequence polymorphisms for every gene simultaneously. The design and construction of a DNA microarray for any given microbial genome are straightforward. By monitoring microbial gene expression, one can predict the functions of uncharacterized genes, probe the physiologic adaptations made under various environmental conditions, identify virulence-associated genes, and test the effects of drugs. Similarly, by using host gene microarrays, one can explore host response at the level of gene expression and provide a molecular description of the events that follow infection. Host profiling might also identify gene expression signatures unique to each pathogen, thus providing a novel tool for diagnosis, prognosis, and clinical management of infectious disease.

    Sequence analysis of two alleles reveals that intra- and intergenic recombination played a role in the evolution of the radish fertility restorer (Rfo)

    Background: Land plant genomes contain multiple members of a eukaryote-specific gene family encoding proteins with pentatricopeptide repeat (PPR) motifs. Some PPR proteins were shown to participate in post-transcriptional events involved in organellar gene expression, and this type of function is now thought to be their main biological role. Among PPR genes, restorers of fertility (Rf) of cytoplasmic male sterility systems constitute a peculiar subgroup that is thought to evolve in response to the presence of mitochondrial sterility-inducing genes. Rf genes encoding PPR proteins are associated with very close relatives on complex loci. Results: We sequenced a non-restoring allele (L7rfo) of the Rfo radish locus whose restoring allele (D81Rfo) was previously described, and compared the two alleles and their PPR genes. We identified a ca. 13 kb long fragment, likely originating from another part of the radish genome, inserted into the L7rfo sequence. The L7rfo allele carries two genes (PPR-1 and PPR-2) closely related to the three previously described PPR genes of the restorer D81Rfo allele (PPR-A, PPR-B, and PPR-C). Our results indicate that alleles of the Rfo locus have experienced complex evolutionary events, including recombination and insertion of extra-locus sequences, since they diverged. Our analyses strongly suggest that present coding sequences of Rfo PPR genes result from intragenic recombination. We found that the 10 C-terminal PPR repeats in Rfo PPR gene encoded proteins result from the tandem duplication of a 5 PPR repeat block. Conclusions: The Rfo locus appears to experience more complex evolution than its flanking sequences. The Rfo locus and PPR genes therein are likely to evolve as a result of intergenic and intragenic recombination. It is therefore not possible to determine which genes on the two alleles are direct orthologs. Our observations recall some previously reported data on pathogen resistance complex loci.

    Hierarchical structure of cascade of primary and secondary periodicities in Fourier power spectrum of alphoid higher order repeats

    Background: Identification of approximate tandem repeats is an important task of broad significance and remains a challenging problem in computational genomics. Often there is no single best approach to periodicity detection, and a combination of different methods may improve the prediction accuracy. The discrete Fourier transform (DFT) has been used extensively to study primary periodicities in DNA sequences. Here we investigate the application of the DFT method to identify and study alphoid higher order repeats. Results: We used a method based on the DFT, with mapping of the symbolic sequence into a numerical sequence, to identify and study alphoid higher order repeats (HORs). For HORs the power spectrum shows an equidistant frequency pattern, with a characteristic two-level hierarchical organization as the signature of a HOR. Our case study was the 16-mer HOR tandem in AC017075.8 from human chromosome 7. A very long array of equidistant peaks at multiple frequencies (more than a thousand higher harmonics) is based on the fundamental frequency of the 16-mer HOR. A pronounced subset of equidistant peaks is based on multiples of the fundamental HOR frequency (multiplication factor n for an n-mer) and higher harmonics. In general, an n-mer HOR pattern contains equidistant secondary periodicity peaks, with a pronounced subset of equidistant primary periodicity peaks. This hierarchical pattern, as a signature for HOR detection, is robust with respect to monomer insertions and deletions, random sequence insertions, etc. For a monomeric alphoid sequence only primary periodicity peaks are present. The 1/f^β noise and the periodicity-three pattern are missing from power spectra in alphoid regions, in accordance with expectations. Conclusion: The DFT provides a robust detection method for higher order periodicity. An easily recognizable HOR power spectrum is characterized by a hierarchical two-level equidistant pattern: higher harmonics of the fundamental HOR frequency (secondary periodicity) and a subset of pronounced peaks corresponding to constituent monomers (primary periodicity). The number of lower-frequency peaks (secondary periodicity) below the frequency of the first primary periodicity peak reveals the size of the n-mer HOR, i.e., the number n of monomers contained in the consensus HOR.
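
    The idea of mapping a symbolic DNA sequence to numbers and reading periodicities off the DFT power spectrum can be sketched as follows. The abstract does not say which symbolic-to-numerical mapping was used; this sketch assumes the common binary-indicator mapping (one 0/1 sequence per nucleotide, with the four power spectra summed). The function names are hypothetical, and the direct O(N²) transform is only suitable for short toy sequences.

```python
import cmath


def power_spectrum(seq):
    """Summed power spectrum of the four binary indicator sequences.

    Assumption: indicator mapping (1 where the base occurs, else 0).
    Returns power at frequency indices k = 0 .. N // 2.
    """
    n = len(seq)
    spectrum = []
    for k in range(n // 2 + 1):
        total = 0.0
        for base in "ACGT":
            # DFT coefficient of the indicator sequence for this base
            coeff = sum(
                cmath.exp(-2j * cmath.pi * k * i / n)
                for i, symbol in enumerate(seq)
                if symbol == base
            )
            total += abs(coeff) ** 2
        spectrum.append(total)
    return spectrum


def dominant_period(seq):
    """Period (in bases) of the strongest non-zero-frequency peak."""
    spectrum = power_spectrum(seq)
    k_max = max(range(1, len(spectrum)), key=spectrum.__getitem__)
    return len(seq) / k_max


# Toy tandem repeat: a 5-base motif repeated 12 times (invented example).
toy = "ACGTT" * 12
print(dominant_period(toy))
```

    For a perfect tandem repeat such as the toy sequence, the power is concentrated at multiples of the fundamental frequency N/period. This only illustrates primary periodicity detection; the two-level HOR signature described in the abstract would require real alphoid sequence data.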

    PhyloToL: A Taxon/Gene-Rich Phylogenomic Pipeline to Explore Genome Evolution of Diverse Eukaryotes

    Estimating multiple sequence alignments (MSAs) and inferring phylogenies are essential for many aspects of comparative biology. Yet, many bioinformatics tools for such analyses have focused on specific clades, with greatest attention paid to plants, animals, and fungi. The rapid increase in high-throughput sequencing (HTS) data from diverse lineages now provides opportunities to estimate evolutionary relationships and gene family evolution across the eukaryotic tree of life. At the same time, these types of data are known to be error-prone (e.g., substitutions, contamination). To address these opportunities and challenges, we have refined a phylogenomic pipeline, now named PhyloToL, to allow easy incorporation of data from HTS studies, to automate production of both MSAs and gene trees, and to identify and remove contaminants. PhyloToL is designed for phylogenomic analyses of diverse lineages across the tree of life (i.e., at scales of >100 My). We demonstrate the power of PhyloToL by assessing stop codon usage in Ciliophora, identifying contamination in a taxon- and gene-rich database and exploring the evolutionary history of chromosomes in the kinetoplastid parasite Trypanosoma brucei, the causative agent of African sleeping sickness. Benchmarking PhyloToL's homology assessment against that of OrthoMCL and a published paper on superfamilies of bacterial and eukaryotic organellar outer membrane pore-forming proteins demonstrates the power of our approach for determining gene family membership and inferring gene trees. PhyloToL is highly flexible and allows users to easily explore HTS data, test hypotheses about phylogeny and gene family evolution and combine outputs with third-party tools (e.g., PhyloChromoMap, iGTP).

    Mining Unknown Porcine Protein Isoforms by Tissue-Based Map of Proteome Enhances the Pig Genome Annotation

    The lack of a complete pig proteome has left a gap in our knowledge of the pig genome and has restricted the feasibility of using pigs as a biomedical model. In this study, we developed a tissue-based proteome map using 34 major normal pig tissues. A total of 5841 unknown protein isoforms were identified and systematically characterized, including 2225 novel protein isoforms, 669 protein isoforms from 460 genes symbolized beginning with LOC, and 2947 protein isoforms without clear NCBI annotation in the current pig reference genome. These newly identified protein isoforms were functionally annotated through profiling the pig transcriptome with high-throughput RNA sequencing of the same pig tissues, further improving the genome annotation of the corresponding protein-coding genes. Combining the well-annotated genes that have parallel expression pattern and subcellular witness, we predicted the tissue-related subcellular locations and potential functions for these unknown proteins. Finally, we mined 3081 orthologous genes for 52.7% of unknown protein isoforms across multiple species, referring to 68 KEGG pathways as well as 23 disease signaling pathways. These findings provide valuable insights and a rich resource for enhancing studies of pig genomics and biology, as well as biomedical model application to human medicine.

    Investigation of interspecific genome-plastome incompatibility in Oenothera and Passiflora

    Interspecific genome-plastome incompatibility is a widely observed phenomenon but its primary causes are still unknown. It reflects genome-plastome interactions that play a direct role in speciation processes: interspecific combinations of nuclear genomes and plastomes that fail to develop into fully autotrophic plants are usually eliminated by natural selection. We have investigated two plant models displaying genome-plastome incompatibility, Oenothera and Passiflora, using strategies of molecular biology in order to contribute to an analysis of primary causes of interspecific genome-plastome incompatibility. 1. Expressed sequence tags in Oenothera: In this study we present the first analyzed EST data set for Oenothera. 3,532 cDNA sequences derived from 9-week-old Oenothera plantlets were then analysed and assembled into 1,621 nonredundant clusters, including 1,133 singletons and 488 multi-member unigenes which contain a total of 875,940 nonredundant nucleotides. EST sequences were analysed by the Sputnik algorithm. They were also used in the development of gene-specific PCR-based codominant markers (SNPs, CAPS, micro-satellites). The cDNA library could be directly used for macroarray applications including gene expression studies and for physical mapping. 2. Genotyping analyses in Oenothera using AFLP technology: The comparison of AFLPs from Oenothera with AFLPs from Arabidopsis was used to obtain an approximation of the genome size. The genotyping data provide evidence that the genome of Oenothera is only six times larger than that of Arabidopsis, corresponding to a size of about 750 Mb. The AFLP markers were also successfully applied to construct the first genetic maps using an F2 mapping population of interspecific hybrids between Oenothera elata ssp. hookeri, line johansen, AA-III, x Oenothera grandiflora ssp. tuscaloosa, BB-III. The linkage maps contain 88 AFLP markers covering a total map length of 154.4 cM for dominant markers in johansen, AA-III and 104 AFLP markers and a total size of 155.3 cM for dominant markers in grandiflora, BB-III. In addition, it was possible to assign a genome-plastome incompatibility locus to the margin of coupling group 2B with 13 cM distance to the next AFLP marker. The EST project followed by genotyping analysis increases knowledge and requirements in discovering primary causes of genome-plastome incompatibility. Oenothera with genome-plastome incompatibility, chromosomal translocations and many chromosomal arrangements provides an elegant tool in the study of genome-plastome interactions, speciation processes and species evolution. 3. Investigation of genome-plastome incompatibility in Passiflora: We present the first evidence of hybrid bleaching in this genus. The hybrid between Passiflora menispermifolia x Passiflora oerstedii showed bleaching regions during plant development. Reciprocal crosses also showed hybrid bleaching, as well as significant differences in leaf shape. Molecular analyses of cpDNA showed that Passiflora plastids are inherited bi-parentally and that the P. menispermifolia plastome is incompatible in F1 hybrids with P. oerstedii. This is the first evidence of genome-plastome incompatibility in Passiflora, which differs from Oenothera incompatibilities. The analysis of plastid ultrastructure showed that green tissues in the F1 generation have fully developed chloroplasts with thylakoids and grana; the incompatible material in F1 hybrids lacks differentiated plastids and contains plastids with only rudimentary membranes. An unexpected plastid ultrastructure was found in P. menispermifolia. Leaves from plants grown under greenhouse conditions contain plastids in different developmental stages, including etioplasts, which normally develop from proplastids in darkness. Electron micrographs also indicated retardation of grana formation in P. menispermifolia, which shows that vesicles could deliver parts of thylakoid components and that they may directly participate in the formation of grana stacks. Northern and Western analyses demonstrated that genome-plastome incompatibility affects both transcription and translation, but with differences for nuclear and plastome encoded genes.

    CONSIDERING CYTONUCLEAR INTERACTIONS IN THE FACE OF HETEROPLASMY: EVIDENCE FROM DAUCUS CAROTA (APIACEAE), A GYNODIOECIOUS PLANT SPECIES


    Mapping of genomes and plastomes of subsection Oenothera with molecular marker technologies
