59 research outputs found

    A novel method for high accuracy sumoylation site prediction from protein sequences

    Background: Protein sumoylation is an essential, dynamic, reversible post-translational modification that plays a role in dozens of cellular activities, especially the regulation of gene expression and the maintenance of genomic stability. Currently, the complexities of the sumoylation mechanism cannot be fully resolved by experimental approaches. In this regard, computational approaches might represent a promising way to direct experimental identification of sumoylation sites and shed light on the reaction mechanism. Results: Here we present a statistical method for sumoylation site prediction. A 5-fold cross-validation test over the experimentally identified sumoylation sites yielded excellent prediction performance, with correlation coefficient, specificity, sensitivity and accuracy equal to 0.6364, 97.67%, 73.96% and 96.71%, respectively. Additionally, the predictor's performance is maintained when high-level homologs are removed. Conclusion: By using a statistical method, we have developed a new SUMO site prediction method, SUMOpre, which performs well in terms of correlation coefficient, specificity, sensitivity and accuracy.

    Genome dynamics and diversity of Shigella species, the etiologic agents of bacillary dysentery

    The Shigella bacteria cause bacillary dysentery, which remains a significant threat to public health. The genus status and species classification appear no longer valid, as compelling evidence indicates that Shigella, as well as enteroinvasive Escherichia coli, are derived from multiple origins of E. coli and form a single pathovar. Nevertheless, Shigella dysenteriae serotype 1 causes deadly epidemics, whereas Shigella boydii is restricted to the Indian subcontinent, and Shigella flexneri and Shigella sonnei are prevalent in developing and developed countries, respectively. To begin to explain these distinctive epidemiological and pathological features at the genome level, we have carried out comparative genomics on four representative strains. Each of the Shigella genomes includes a virulence plasmid that encodes conserved primary virulence determinants. The Shigella chromosomes share most of their genes with that of E. coli K12 strain MG1655, but each has over 200 pseudogenes, 300 to 700 copies of insertion sequence (IS) elements, and numerous deletions, insertions, translocations and inversions. There is extensive diversity of putative virulence genes, mostly acquired via bacteriophage-mediated lateral gene transfer. Hence, via convergent evolution involving gain and loss of functions, through bacteriophage-mediated gene acquisition, IS-mediated DNA rearrangements and formation of pseudogenes, the Shigella spp. became highly specific human pathogens with variable epidemiological and pathological features.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome, including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Integrated Analysis of RNA-Binding Proteins in Glioma

    RNA-binding proteins (RBPs) play important roles in many cancer types. However, RBPs have not been thoroughly and systematically studied in gliomas. Global analysis of the functional impact of RBPs will provide a better understanding of gliomagenesis and new insights into glioma therapy. In this study, we integrated a list of human RBPs from six sources (Gerstberger, SONAR, Gene Ontology project, Poly(A) binding protein, CARIC, and XRNAX), which covered 4127 proteins with RNA-binding activity. The RNA sequencing data were downloaded from The Cancer Genome Atlas (TCGA) (n = 699) and the Chinese Glioma Genome Atlas (CGGA) (n = 325 + 693). We examined the differentially expressed genes (DEGs) using the R package DESeq2, and performed a weighted gene co-expression network analysis (WGCNA) of RBPs. Furthermore, survival analysis was performed based on univariate and multivariate Cox proportional hazards regression models. In the WGCNA, we identified a key module involved in the overall survival (OS) of glioblastomas. Survival analysis revealed that eight RBPs (PTRF, FNDC3B, SLC25A43, ZC3H12A, LRRFIP1, HSP90B1, HSPA5, and BNC2) are significantly associated with the survival of glioblastoma patients. Another 693 patients within the CGGA database were used to validate the findings. Additionally, 3564 RBPs were classified into canonical and non-canonical RBPs depending on the domains that they contain, and non-canonical RBPs account for the majority (72.95%). The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that some non-canonical RBPs may have functions in glioma. Finally, we found that knockdown of either of the non-canonical RBPs PTRF or FNDC3B alone can significantly inhibit the proliferation of LN229 and U251 cells. In addition, RNA immunoprecipitation (RIP) analysis indicated that PTRF may regulate cell growth and death-related pathways to maintain tumor cell growth.
    In conclusion, our findings present an integrated view for assessing the potential death risks of glioblastoma at a molecular level, based on the expression of RBPs. More importantly, we identified the non-canonical RNA-binding proteins PTRF and FNDC3B as potential prognostic biomarkers for glioblastoma.

    The complete chloroplast genome sequence of Rhaponticum uniflorum, the first of the genus Rhaponticum

    Rhaponticum uniflorum is commonly used as a source for traditional medicines with the main effect of clearing heat. Here, we sequenced the complete chloroplast (cp) genome of R. uniflorum to develop molecular markers for taxonomic classification and species determination of R. uniflorum. The genome is 152,760 bp in size and has a typical circular structure, including a pair of inverted repeats of 25,205 bp each, a large single-copy region of 83,687 bp, and a small single-copy region of 18,663 bp. The genome encodes 110 unique genes, including 80 protein-coding, four rRNA and 26 tRNA genes. Phylogenomic analysis shows that R. uniflorum is closely related to the genus Saussurea. The study is useful for phylogenetic and population genetic studies of Rhaponticum plants.