7 research outputs found

    Interrogation of alternative splicing events in duplicated genes during evolution

    Abstract. Background: Gene duplication provides raw material for developing novel genes and new functions while retaining the original functions. Alternative splicing, in turn, can increase the complexity of expression at the transcriptome and proteome levels without increasing the number of gene copies in the genome. Duplication and alternative splicing are thought to work together to provide diverse functions and expression patterns in eukaryotes. Previously, duplication and alternative splicing were believed to be negatively correlated and probably interchangeable. Results: We examined the relationship between the occurrence of alternative splicing and duplication at different times after duplication events. Duplication and alternative splicing were indeed inversely correlated when only recently duplicated genes were considered, but they became positively correlated when ancient duplications were taken into account. Specifically, for slightly or moderately duplicated genes, in gene families containing 2 - 7 paralogs, genes were more likely to evolve alternative splicing and had on average a greater number of alternative splicing isoforms after long-term evolution than singleton genes. In contrast, large gene families (containing at least 8 paralogs) had a lower proportion of alternatively spliced genes and fewer alternative splicing isoforms on average, even when ancient duplicated genes were taken into consideration. We also found that duplicated genes with alternative splicing were under tighter evolutionary constraints than those without alternative splicing, and were enriched for genes that participate in molecular transducer activities. Conclusions: We studied the association between the occurrence of alternative splicing and gene duplication. Our results imply that there are key differences in functions and evolutionary constraints among singleton genes and duplicated genes with or without alternative splicing. They also imply that gene duplication and alternative splicing may have different functional significance in the evolution of speciation diversity.
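
The comparison described in the Results can be pictured with a small sketch. It bins genes into the three classes named in the abstract (singletons, families of 2 - 7 paralogs, families of at least 8 paralogs) and reports the fraction of genes with alternative splicing and the mean isoform count per class. The function names, the input layout and the toy numbers are hypothetical and are not taken from the paper; treating a gene as alternatively spliced when it has more than one isoform is also an assumption.

```python
from collections import defaultdict

def family_class(n_paralogs):
    """Map a gene family size to the three classes used in the abstract."""
    if n_paralogs <= 1:
        return "singleton"
    if n_paralogs <= 7:
        return "small_family"   # 2 - 7 paralogs
    return "large_family"       # at least 8 paralogs

def summarize(genes):
    """genes: iterable of (family_size, isoform_count) pairs.

    Assumption: a gene counts as alternatively spliced when it has
    more than one isoform.
    Returns {class: (fraction with alternative splicing, mean isoform count)}.
    """
    buckets = defaultdict(list)
    for family_size, isoforms in genes:
        buckets[family_class(family_size)].append(isoforms)
    return {
        cls: (sum(1 for i in counts if i > 1) / len(counts),
              sum(counts) / len(counts))
        for cls, counts in buckets.items()
    }

# Made-up toy data, for illustration only.
toy = [(1, 1), (1, 2), (3, 3), (5, 1), (9, 1), (12, 1)]
result = summarize(toy)
for cls, (fraction_as, mean_isoforms) in sorted(result.items()):
    print(cls, fraction_as, mean_isoforms)
```

The study additionally separates recent from ancient duplications; that would need a per-gene duplication age, which this sketch leaves out.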

    DODO: an efficient orthologous genes assignment tool based on domain architectures. Domain based ortholog detection

    Abstract. Background: Orthologs are genes derived from the same ancestral gene locus through speciation events. Orthologous proteins usually have similar sequences and perform comparable biological functions, so ortholog identification is useful for annotating newly sequenced genomes. With the rapidly increasing number of sequenced genomes, constructing or updating ortholog relationships between all genomes requires considerable effort and computation time. In addition, elucidating ortholog relationships between distantly related genomes is challenging because of their lower sequence similarity. An efficient ortholog detection method that can handle a large number of distantly related genomes is therefore desirable. Results: In this study we created DODO (DOmain based Detection of Orthologs), an efficient ortholog detection pipeline based on domain architectures. Because domain composition is usually directly related to protein function, DODO can facilitate ortholog detection across distantly related genomes. DODO works in two main steps. Starting from domain information, it first assigns proteins to groups according to their domain architectures, and then identifies orthologs within those groups at much reduced complexity. DODO is shown to detect orthologs between two genomes in considerably less time than the traditional reciprocal best hits method, and the advantage becomes more pronounced when a large number of genomes is analyzed. The output of DODO is highly comparable with other known ortholog databases. Conclusions: DODO provides a new, efficient pipeline for detecting orthologs in a large number of genomes. In addition, a database established with DODO is easier to maintain and can be updated with relatively little effort. The DODO pipeline can be downloaded from http://140.109.42.19:16080/dodo_web/home.htm
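
The two-step idea in the Results can be sketched as follows. This is not DODO's code: the function names, the data layout and the toy proteins are hypothetical, and a domain architecture is assumed here to be the ordered tuple of a protein's domains. The sketch only shows how grouping by architecture shrinks the set of cross-genome pairs that a second, within-group step would have to compare; how DODO resolves orthologs inside a group is not described in the abstract and is not modelled.

```python
from collections import defaultdict

def group_by_architecture(proteins):
    """proteins: {protein_id: (genome, [domain, ...])}.

    Returns {architecture tuple: {genome: [protein_id, ...]}}.
    Assumption: an architecture is the ordered tuple of domain names.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for pid, (genome, domains) in proteins.items():
        groups[tuple(domains)][genome].append(pid)
    return groups

def candidate_pairs(groups, genome_a, genome_b):
    """Yield cross-genome protein pairs that share a domain architecture."""
    for by_genome in groups.values():
        for a in by_genome.get(genome_a, []):
            for b in by_genome.get(genome_b, []):
                yield (a, b)

# Hypothetical toy input: two genomes, five proteins.
proteins = {
    "a1": ("A", ["kinase", "SH2"]),
    "a2": ("A", ["PDZ"]),
    "b1": ("B", ["kinase", "SH2"]),
    "b2": ("B", ["PDZ"]),
    "b3": ("B", ["WD40"]),
}
pairs = sorted(candidate_pairs(group_by_architecture(proteins), "A", "B"))
print(pairs)  # 2 candidate pairs instead of the 6 all-against-all pairs
```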

    Meta-analytical biomarker search of EST expression data reveals three differentially expressed candidates

    Abstract. Background: For more than a decade, research has sought to identify differentially expressed genes (DEGs) by generating and mining cDNA expressed sequence tags (ESTs). Although the availability of public databases makes comprehensive mining of DEGs among the ESTs from multiple tissue types possible, existing studies have usually employed statistics suitable only for two categories. A multi-class test has been developed to enable the finding of tissue-specific genes, but the subsequent search for cancer genes involves a separate two-category test only on the ESTs of the tissue of interest. This restricts the amount of data used. On the other hand, simple pooling of cancer and normal genes from multiple tissue types runs the risk of Simpson's paradox. Here we present a different approach that searches for multi-cancer DEG candidates by analyzing all pertinent ESTs in all categories, and then narrows down the cancer biomarker candidates through integrative analysis with microarray data, selection of secretory and membrane protein genes, and network analysis. Finally, the differential expression patterns of three selected cancer biomarker candidates were confirmed by real-time qPCR analysis. Results: Seven hundred and twenty-three primary DEG candidates (p-value [...] in silico predictions. Conclusions: Searching the digitized transcriptome using CMH enabled us to identify multi-cancer differentially expressed gene candidates. Our methodology demonstrates simultaneous analysis of cancer biomarkers across multiple tissue types with EST data. With the revived interest in digitizing transcriptomes by NGS, cancer biomarkers could be detected more precisely from ESTs. The three candidates identified in this study, COL3A1, DLG3, and RNF43, are valuable targets for further evaluation with a larger sample size of normal and cancer tissue or serum samples.
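
To make the stratified analysis concrete, here is a minimal sketch that assumes "CMH" refers to the Cochran-Mantel-Haenszel test, which combines one 2x2 table per tissue instead of pooling counts across tissues (the pooling that risks Simpson's paradox). The function name, the table layout and the counts are hypothetical; the statistic is computed without continuity correction, and nothing here reproduces the thresholds or data of the study.

```python
import math

def cmh_test(tables):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables.

    Each table is ((a, b), (c, d)), one per stratum (here: tissue type), e.g.
    rows = cancer / normal library, columns = ESTs of the gene / other ESTs.
    Returns (chi-square statistic, p-value with 1 df, Mantel-Haenszel odds ratio).
    No continuity correction is applied.
    """
    sum_a = sum_expected = sum_var = 0.0
    or_num = or_den = 0.0
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        sum_a += a
        # Expected value and variance of cell a under the null hypothesis.
        sum_expected += (a + b) * (a + c) / n
        sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    stat = (sum_a - sum_expected) ** 2 / sum_var
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, or_num / or_den

# Made-up counts for two tissue types, for illustration only.
tables = [((10, 5), (5, 10)), ((10, 5), (5, 10))]
stat, p_value, odds_ratio = cmh_test(tables)
print(round(stat, 3), round(p_value, 4), odds_ratio)
```

Keeping each tissue as its own stratum means a gene is flagged only when the within-tissue comparisons agree, rather than when pooled totals happen to differ.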

    Genome sequence of Haloarcula marismortui: A halophilic archaeon from the Dead Sea

    We report the complete sequence of the 4,274,642-bp genome of Haloarcula marismortui, a halophilic archaeal isolate from the Dead Sea. The genome is organized into nine circular replicons of varying G+C compositions ranging from 54% to 62%. Comparison of the genome architectures of Halobacterium sp. NRC-1 and H. marismortui suggests a common ancestor for the two organisms and a genome of significantly reduced size in the former. Both of these halophilic archaea use the same strategy of high surface negative charge of folded proteins as a means to circumvent the salting-out phenomenon in a hypersaline cytoplasm. A multitiered annotation approach, including primary sequence similarities, protein family signatures, structure prediction, and a protein function association network, has assigned putative functions for at least 58% of the 4242 predicted proteins, a far larger number than is usually achieved in most newly sequenced microorganisms. Among these assigned functions were genes encoding six opsins, 19 MCP and/or HAMP domain signal transducers, and an unusually large number of environmental response regulators (nearly five times as many as those encoded in Halobacterium sp. NRC-1), suggesting H. marismortui is significantly more physiologically capable of exploiting diverse environments. In comparing the physiologies of the two halophilic archaea, in addition to the expected extensive similarity, we discovered several differences in their metabolic strategies and physiological responses, such as distinct pathways for arginine breakdown in each halophile. Finally, as expected from the larger genome, H. marismortui encodes many more functions and seems to have fewer nutritional requirements for survival than does Halobacterium sp. NRC-1.