
    The Genome of Streptococcus mitis B6 - What Is a Commensal?

    Streptococcus mitis is the closest relative of the major human pathogen S. pneumoniae. The 2.15 Mb sequence of the Streptococcus mitis B6 chromosome, an unusually high-level beta-lactam resistant and multiple antibiotic resistant strain, has now been determined to encode 2100 genes. The accessory genome is estimated to represent over 40%, including 75 mostly novel transposases and IS, the prophage φB6 and another seven phage related regions. Tetracycline resistance mediated by Tn5801, and an unusual and large gene cluster containing three aminoglycoside resistance determinants have not been described in other Streptococcus spp. Comparative genomic analyses including hybridization experiments on an S. mitis B6 specific microarray reveal that individual S. mitis strains are almost as distantly related to the B6 strain as S. pneumoniae. Both species share a core of over 900 genes. Most proteins described as pneumococcal virulence factors are present in S. mitis B6, but the three choline binding proteins PcpA, PspA and PspC, and three gene clusters containing the hyaluronidase gene, ply and lytA, and the capsular genes are absent in S. mitis B6 and other S. mitis as well, which confirms their importance for the pathogenetic potential of S. pneumoniae. Despite the close relatedness between the two species, the S. mitis B6 genome reveals a striking X-alignment when compared with S. pneumoniae.

    Conservation of Microstructure between a Sequenced Region of the Genome of Rice and Multiple Segments of the Genome of Arabidopsis thaliana

    The nucleotide sequence was determined for a 340-kb segment of rice chromosome 2, revealing 56 putative protein-coding genes. This represents a density of one gene per 6.1 kb, which is higher than was reported for a previously sequenced segment of the rice genome. Sixteen of the putative genes were supported by matches to ESTs. The predicted products of 29 of the putative genes showed similarity to known proteins, and a further 17 genes showed similarity only to predicted or hypothetical proteins identified in genome sequence data. The region contains a few transposable elements: one retrotransposon and one transposon. The segment of the rice genome studied had previously been identified as representing a part of rice chromosome 2 that may be homologous to a segment of Arabidopsis chromosome 4. We confirmed the conservation of gene content and order between the two genome segments. In addition, we identified a further four segments of the Arabidopsis genome that contain conserved gene content and order. In total, 22 of the 56 genes identified in the rice genome segment were represented in this set of Arabidopsis genome segments, with at least five genes present, in conserved order, in each segment. These data are consistent with the hypothesis that the Arabidopsis genome has undergone multiple duplication events. Our results demonstrate that conservation of the genome microstructure can be identified even between monocot and dicot species. However, the frequent occurrence of duplication, and subsequent microstructure divergence, within plant genomes may necessitate the integration of subsets of genes present in multiple redundant segments to deduce evolutionary relationships and identify orthologous genes.

    Toward a Catalog of Human Genes and Proteins: Sequencing and Analysis of 500 Novel Complete Protein Coding Human cDNAs

    With the complete human genomic sequence being unraveled, the focus will shift to gene identification and to the functional analysis of gene products. The generation of a set of cDNAs, both sequences and physical clones, which contains the complete and noninterrupted protein coding regions of all human genes will provide the indispensable tools for the systematic and comprehensive analysis of protein function to eventually understand the molecular basis of man. Here we report the sequencing and analysis of 500 novel human cDNAs containing the complete protein coding frame. Assignment to functional categories was possible for 52% (259) of the encoded proteins, the remaining fraction having no similarities with known proteins. By aligning the cDNA sequences with the sequences of the finished chromosomes 21 and 22 we identified a number of genes that either had been completely missed in the analysis of the genomic sequences or had been wrongly predicted. Three of these genes appear to be present in several copies. We conclude that full-length cDNA sequencing continues to be crucial also for the accurate identification of genes. The set of 500 novel cDNAs, and another 1000 full-coding cDNAs of known transcripts we have identified, adds up to cDNA representations covering 2%–5% of all human genes. We thus substantially contribute to the generation of a gene catalog, consisting of both full-coding cDNA sequences and clones, which should be made freely available and will become an invaluable tool for detailed functional studies. [The sequence data described in this paper have been submitted to the EMBL database under the accession nos. given in Table 2.]

    The genome of the kinetoplastid parasite, Leishmania major

    Leishmania species cause a spectrum of human diseases in tropical and subtropical regions of the world. We have sequenced the 36 chromosomes of the 32.8-megabase haploid genome of Leishmania major (Friedlin strain) and predict 911 RNA genes, 39 pseudogenes, and 8272 protein-coding genes, of which 36% can be ascribed a putative function. These include genes involved in host-pathogen interactions, such as proteolytic enzymes, and extensive machinery for synthesis of complex surface glycoconjugates. The organization of protein-coding genes into long, strand-specific, polycistronic clusters and lack of general transcription factors in the L. major, Trypanosoma brucei, and Trypanosoma cruzi (Tritryp) genomes suggest that the mechanisms regulating RNA polymerase II–directed transcription are distinct from those operating in other eukaryotes, although the trypanosomatids appear capable of chromatin remodeling. Abundant RNA-binding proteins are encoded in the Tritryp genomes, consistent with active posttranscriptional regulation of gen