19 research outputs found

    Drawing the tree of eukaryotic life based on the analysis of 2,269 manually annotated myosins from 328 species

    The tree of eukaryotic life was reconstructed based on the analysis of 2,269 myosin motor domains from 328 organisms, confirming some accepted relationships of major taxa and resolving disputed and preliminary classifications.

    Reconstructing the phylogeny of 21 completely sequenced arthropod species based on their motor proteins

    Background: Motor proteins have extensively been studied in the past and consist of large superfamilies. They are involved in diverse processes like cell division, cellular transport, neuronal transport processes, or muscle contraction, to name a few. Vertebrates contain up to 60 myosins and about the same number of kinesins that are spread over more than a dozen distinct classes. Results: Here, we present the comparative genomic analysis of the motor protein repertoire of 21 completely sequenced arthropod species using the owl limpet Lottia gigantea as outgroup. Arthropods contain up to 17 myosins grouped into 13 classes. The myosins are in almost all cases clear paralogs, and thus the evolution of the arthropod myosin inventory is mainly determined by gene losses. Arthropod species contain up to 29 kinesins spread over 13 classes. In contrast to the myosins, the evolution of the arthropod kinesin inventory is not only determined by gene losses but also by many subtaxon-specific and species-specific gene duplications. All arthropods contain each of the subunits of the cytoplasmic dynein/dynactin complex. Except for the dynein light chains and the p150 dynactin subunit they contain single gene copies of the other subunits. Especially the roadblock light chain repertoire is very species-specific. Conclusion: All 21 completely sequenced arthropods, including the twelve sequenced Drosophila species, contain a species-specific set of motor proteins. The phylogenetic analysis of all genes as well as the protein repertoire placed Daphnia pulex closest to the root of the Arthropoda. The louse Pediculus humanus corporis is the closest relative to Daphnia followed by the group of the honeybee Apis mellifera and the jewel wasp Nasonia vitripennis. After this group the rust-red flour beetle Tribolium castaneum and the silkworm Bombyx mori diverged very closely from the lineage leading to the Drosophila species.

    diArk – a resource for eukaryotic genome research

    Background: The number of completed eukaryotic genome sequences and cDNA projects has increased exponentially in the past few years although most of them have not been published yet. In addition, many microarray analyses yielded thousands of sequenced EST and cDNA clones. For the researcher interested in single gene analyses (from a phylogenetic, a structural biology or other perspective) it is therefore important to have up-to-date knowledge about the various resources providing primary data. Description: The database is built around 3 central tables: species, sequencing projects and publications. The species table contains commonly and alternatively used scientific names, common names and the complete taxonomic information. For projects the sequence type and links to species project web-sites and species homepages are stored. All publications are linked to projects. The web-interface provides comprehensive search modules with detailed options and three different views of the selected data. We have especially focused on developing an elaborate taxonomic tree search tool that allows the user to instantaneously identify e.g. the closest relative to the organism of interest. Conclusion: We have developed a database, called diArk, to store, organize, and present the most relevant information about completed genome projects and EST/cDNA data from eukaryotes. Currently, diArk provides information about 415 eukaryotes, 823 sequencing projects, and 248 publications.
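The three-table layout described in this abstract (species, sequencing projects, publications, with publications linked to projects) can be sketched as a small relational schema. This is only an illustration: every table name, column name, and demo row below is a hypothetical placeholder, not diArk's actual schema or data, and the link from projects to species is an assumption suggested by the description.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not diArk's real ones.
SCHEMA = """
CREATE TABLE species (
    id INTEGER PRIMARY KEY,
    scientific_name TEXT NOT NULL,
    alternative_names TEXT,   -- alternatively used scientific names
    common_names TEXT,
    taxonomy TEXT             -- complete taxonomic lineage
);
CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    species_id INTEGER NOT NULL REFERENCES species(id),  -- assumed link
    sequence_type TEXT NOT NULL,                         -- e.g. genome, EST, cDNA
    project_url TEXT
);
CREATE TABLE publications (
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES projects(id), -- publications link to projects
    title TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database filled with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO species VALUES (1, 'Drosophila melanogaster', NULL, "
        "'fruit fly', 'Eukaryota;Metazoa;Arthropoda')"
    )
    conn.execute("INSERT INTO projects VALUES (1, 1, 'genome', NULL)")
    conn.execute("INSERT INTO projects VALUES (2, 1, 'EST', NULL)")
    conn.execute("INSERT INTO publications VALUES (1, 1, 'Placeholder title')")
    return conn


def projects_for_species(conn, scientific_name):
    """Return the sequence types of all projects stored for one species."""
    rows = conn.execute(
        "SELECT p.sequence_type FROM projects p "
        "JOIN species s ON s.id = p.species_id "
        "WHERE s.scientific_name = ? ORDER BY p.id",
        (scientific_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `projects_for_species(build_demo_db(), "Drosophila melanogaster")` lists the sequence types of the placeholder projects stored for that species.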

    Predicting mutually exclusive spliced exons based on exon length, splice site and reading frame conservation, and exon sequence homology

    Background: Alternative splicing of pre-mRNA is an important process eukaryotes utilize to increase their repertoire of different protein products. Several types of alternative splice forms exist, including exon skipping, differential splicing of exons at their 3'- or 5'-end, intron retention, and mutually exclusive splicing. The latter term is used for clusters of internal exons that are spliced in a mutually exclusive manner. Results: We have implemented an extension to the WebScipio software to search for mutually exclusive exons. Here, the search is based on the precondition that mutually exclusive exons encode regions of the same structural part of the protein product. This precondition provides restrictions to the search for candidate exons concerning their length, splice site conservation and reading frame preservation, and overall homology. Mutually exclusive exons that are not homologous and not of about the same length will not be found. Using the new algorithm, mutually exclusive exons in several example genes, a dynein heavy chain, a muscle myosin heavy chain, and Dscam, were correctly identified. In addition, the algorithm was applied to the whole Drosophila melanogaster X chromosome and the results were compared to the Flybase annotation and an ab initio prediction. Clusters of mutually exclusive exons might be subsequent to each other and might encode dozens of exons. Conclusions: This is the first implementation of an automatic search for mutually exclusive exons in eukaryotes. Exons are predicted and reconstructed in the same run providing the complete gene structure for the protein query of interest. WebScipio offers high quality gene structure figures with the clusters of mutually exclusive exons colour-coded, and several analysis tools for further manual inspection. The genome scale analysis of all genes of the Drosophila melanogaster X chromosome showed that WebScipio is able to find all but two of the 28 annotated mutually exclusive spliced exons and predicts 39 new candidate exons. Thus, WebScipio should be able to identify mutually exclusive spliced exons in any query sequence from any species with a very high probability. WebScipio is freely available to academics at http://www.webscipio.org.
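The search restrictions named in this abstract (similar length, conserved splice sites, preserved reading frame, overall homology) can be illustrated with a simple candidate filter. This is a minimal sketch under stated assumptions and not WebScipio's algorithm: the function name, the thresholds, the restriction to canonical AG/GT splice sites, and the use of `difflib` similarity as a stand-in for a homology score are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def is_candidate_exon(exon_len, cand_len, exon_pep, cand_pep,
                      acceptor, donor,
                      max_len_diff=20, min_similarity=0.3):
    """Check a candidate region against a known exon.

    exon_len / cand_len: nucleotide lengths of the known and candidate exon.
    exon_pep / cand_pep: their translated peptide sequences.
    acceptor / donor: dinucleotides flanking the candidate (intron side).
    The thresholds are illustrative assumptions, not published values.
    """
    # Length: both exons should be of about the same size.
    if abs(exon_len - cand_len) > max_len_diff:
        return False
    # Reading frame: swapping one exon for the other must not shift the
    # downstream frame, so the lengths must agree modulo 3.
    if (exon_len - cand_len) % 3 != 0:
        return False
    # Splice sites: this sketch accepts only canonical AG...GT boundaries.
    if acceptor.upper() != "AG" or donor.upper() != "GT":
        return False
    # Homology: crude string similarity as a placeholder for a real score.
    similarity = SequenceMatcher(None, exon_pep, cand_pep).ratio()
    return similarity >= min_similarity
```

A real implementation would score homology with a substitution matrix and would likely tolerate non-canonical splice sites; the filter above only shows how the four restrictions combine.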

    WebScipio: An online tool for the determination of gene structures using protein sequences

    Background: Obtaining the gene structure for a given protein encoding gene is an important step in many analyses. Software suited for this task should be readily accessible, accurate, easy to handle and should provide the user with a coherent representation of the most probable gene structure. It should be rigorous enough to optimise features on the level of single bases and at the same time flexible enough to allow for cross-species searches. Results: WebScipio, a web interface to the Scipio software, allows a user to obtain the corresponding coding sequence structure for a given query protein sequence that belongs to an already assembled eukaryotic genome. The resulting gene structure is presented in various human readable formats like a schematic representation, and a detailed alignment of the query and the target sequence highlighting any discrepancies. WebScipio can also be used to identify and characterise the gene structures of homologs in related organisms. In addition, it offers a web service for integration with other programs. Conclusion: WebScipio is a tool that allows users to get a high-quality gene structure prediction from a protein query. It offers more than 250 eukaryotic genomes that can be searched and produces predictions that are close to what can be achieved by manual annotation, for in-species and cross-species searches alike. WebScipio is freely accessible at http://www.webscipio.org.

    GenePainter: a fast tool for aligning gene structures of eukaryotic protein families, visualizing the alignments and mapping gene structures onto protein structures

    Background: All sequenced eukaryotic genomes have been shown to possess at least a few introns. This includes those unicellular organisms, which were previously suspected to be intron-less. Therefore, gene splicing must have been present at least in the last common ancestor of the eukaryotes. To explain the evolution of introns, basically two mutually exclusive concepts have been developed. The introns-early hypothesis says that already the very first protein-coding genes contained introns while the introns-late concept asserts that eukaryotic genes gained introns only after the emergence of the eukaryotic lineage. A very important aspect in this respect is the conservation of intron positions within homologous genes of different taxa. Results: GenePainter is a standalone application for mapping gene structure information onto protein multiple sequence alignments. Based on the multiple sequence alignments the gene structures are aligned down to single nucleotides. GenePainter accounts for variable lengths in exons and introns, respects split codons at intron junctions and is able to handle sequencing and assembly errors, which are possible reasons for frame-shifts in exons and gaps in genome assemblies. Thus, even gene structures of considerably divergent proteins can properly be compared, as it is needed in phylogenetic analyses. Conserved intron positions can also be mapped to user-provided protein structures. For their visualization GenePainter provides scripts for the molecular graphics system PyMol.
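The idea of mapping gene structures onto a protein multiple sequence alignment can be sketched as follows. This is not GenePainter's implementation: the function names and the data layout (an intron given as a pair of ungapped residue index and phase) are assumptions made for illustration, and the sketch works at residue-plus-phase resolution only.

```python
def residue_to_column(aligned_seq):
    """Map each ungapped residue index to its alignment column."""
    return [col for col, ch in enumerate(aligned_seq) if ch != "-"]


def map_introns(aligned_seq, introns):
    """Translate (residue_index, phase) introns into (column, phase) pairs."""
    columns = residue_to_column(aligned_seq)
    return {(columns[residue], phase) for residue, phase in introns}


def conserved_introns(alignment, introns_by_gene):
    """Return intron positions shared by every gene in the alignment.

    alignment: dict of gene name -> gapped protein sequence (equal lengths).
    introns_by_gene: dict of gene name -> list of (residue_index, phase).
    An intron counts as conserved here only if column and phase both match.
    """
    mapped = [map_introns(alignment[gene], introns_by_gene[gene])
              for gene in alignment]
    return set.intersection(*mapped)
```

Because positions are compared in alignment coordinates rather than raw residue numbers, insertions and deletions between homologs do not hide a shared intron position.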

    diArk 2.0 provides detailed analyses of the ever increasing eukaryotic genome sequencing data

    Background: Nowadays, the sequencing of even the largest mammalian genomes has become a question of days with current next-generation sequencing methods. It comes as no surprise that dozens of genome assemblies are now released per month. Since the number of next-generation sequencing machines increases worldwide and new major sequencing plans are announced, a further increase in the speed of releasing genome assemblies is expected. Thus it becomes increasingly important to get an overview as well as detailed information about available sequenced genomes. The different sequencing and assembly methods have specific characteristics that need to be known to evaluate the various genome assemblies before performing subsequent analyses. Results: diArk has been developed to provide fast and easy access to all sequenced eukaryotic genomes worldwide. Currently, diArk 2.0 contains information about more than 880 species and more than 2350 genome assembly files. Many meta-data like sequencing and read-assembly methods, sequencing coverage, GC-content, extended lists of alternatively used scientific names and common species names, and various kinds of statistics are provided. To make the data intuitively approachable, the web interface makes extensive use of modern web techniques. A number of search modules and result views facilitate finding and judging the data of interest. Subscribing to the RSS feed is the easiest way to stay up-to-date with the latest genome data. Conclusions: diArk 2.0 is the most up-to-date database of sequenced eukaryotic genomes compared to databases like GOLD, NCBI Genome, NHGRI, and ISC. It is different in that only those projects are stored for which genome assembly data or considerable amounts of cDNA data are available. Projects in planning stage or in the process of being sequenced are not included. The user can easily search through the provided data and directly access the genome assembly files of the sequenced genome of interest. diArk 2.0 is available at http://www.diark.org.

    Comparative genomic analysis of the arthropod muscle myosin heavy chain genes allows ancestral gene reconstruction and reveals a new type of 'partially' processed pseudogene

    Background: Alternative splicing of mutually exclusive exons is an important mechanism for increasing protein diversity in eukaryotes. The insect Mhc (myosin heavy chain) gene produces all different muscle myosins as a result of alternative splicing, in contrast to most other organisms of the Metazoa lineage, which have a family of muscle genes with each gene coding for a protein specialized for a functional niche. Results: The muscle myosin heavy chain genes of 22 species of the Arthropoda ranging from the waterflea to wasp and Drosophila have been annotated. The analysis of the gene structures allowed the reconstruction of an ancient muscle myosin heavy chain gene and showed that during evolution of the arthropods introns have mainly been lost in these genes although intron gain might have happened in a few cases. Surprisingly, the genome of Aedes aegypti contains another and that of Culex pipiens quinquefasciatus two further muscle myosin heavy chain genes, called Mhc3 and Mhc4, that contain only one variant of the corresponding alternative exons of the Mhc1 gene. Mhc3 transcription in Aedes aegypti is documented by EST data. Mhc3 and Mhc4 inserted in the Aedes and Culex genomes either by gene duplication followed by the loss of all but one variant of the alternative exons, or by incorporation of a transcript of which all other variants have been spliced out retaining the exon-intron structure. The second and more likely possibility represents a new type of a 'partially' processed pseudogene. Conclusion: Based on the comparative genomic analysis of the alternatively spliced arthropod muscle myosin heavy chain genes we propose that the splicing process operates sequentially on the transcript. The process consists of the splicing of the mutually exclusive exons until one exon out of the cluster remains while retaining surrounding intronic sequence. In a second step splicing of introns takes place. A related mechanism could be responsible for the splicing of other genes containing mutually exclusive exons.

    Establishment and Characterisation of an in vitro Replication System with Human Cell Extracts

    In the work presented, I characterised several aspects of an in vitro DNA replication system with human cell extracts. I could confirm that plasmids without special sequence characteristics are replicated by the system and that the replication of each template takes place only once, resembling the way genomic DNA is replicated in vivo. The occurrence of different kinds of replication intermediates was shown by electron microscopy, and the fate of the template DNA during the reaction was clarified. I also demonstrated that only a specific form of DNA can serve as substrate and that the products of the reaction can be separated by differential digest in a reasonable manner. Furthermore, I was able to prove that the factors responsible for replication in vitro are the same ones driving the reaction in the cell. It could also be demonstrated that the efficiency of the reaction depends on the cell cycle stage of the cells from which the protein extracts were prepared. Finally, the reaction could be inhibited by depletion of ORC proteins, although the inhibition could not be reversed by the addition of recombinantly expressed ORC complexes. In conclusion, this work is a contribution towards the complete characterisation of an in vitro replication assay as a model for the replication of the genome in human cells.