58 research outputs found

    EL_PSSM-RT:DNA-binding residue prediction by integrating ensemble learning with PSSM Relation Transformation

    Background: Prediction of DNA-binding residues is important for understanding the protein-DNA recognition mechanism. Many computational methods have been proposed for the prediction, but most of them do not consider the relationships of evolutionary information between residues. Results: In this paper, we first propose a novel residue encoding method, referred to as the Position Specific Score Matrix (PSSM) Relation Transformation (PSSM-RT), to encode residues by utilizing the relationships of evolutionary information between residues. PDNA-62 and PDNA-224 are used to evaluate PSSM-RT and two existing PSSM encoding methods by five-fold cross-validation. Performance evaluations indicate that PSSM-RT is more effective than previous methods. This validates the point that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction. An ensemble learning classifier (EL_PSSM-RT) is also proposed by combining an ensemble learning model with PSSM-RT to better handle the imbalance between binding and non-binding residues in datasets. EL_PSSM-RT is evaluated by five-fold cross-validation using PDNA-62 and PDNA-224 as well as two independent datasets, TS-72 and TS-61. Performance comparisons with existing predictors on the four datasets demonstrate that EL_PSSM-RT is the best-performing method among all the prediction methods, with improvements of 0.02-0.07 for MCC, 4.18-21.47% for ST and 0.013-0.131 for AUC. Furthermore, we analyze the importance of the pair-relationships extracted by PSSM-RT, and the results validate the usefulness of PSSM-RT for encoding DNA-binding residues. Conclusions: We propose a novel method for predicting DNA-binding residues that incorporates the relationship of evolutionary information between residues and ensemble learning. Performance evaluation shows that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction and that ensemble learning can be used to address the data imbalance issue between binding and non-binding residues. A web service of EL_PSSM-RT ( http://hlt.hitsz.edu.cn:8080/PSSM-RT_SVM/ ) is provided for free access to the biological research community.
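    The abstract above reports MCC and ST on datasets where non-binding residues far outnumber binding ones. As a minimal sketch of why such metrics are preferred over plain accuracy in that setting, the snippet below computes them from confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical, and ST is assumed here to mean the average of sensitivity and specificity; the abstract itself does not define it.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute metrics from confusion-matrix counts (hypothetical helper).

    tp/fn count binding residues predicted correctly/incorrectly;
    tn/fp count non-binding residues predicted correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)

    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Assumption: ST ("strength") is the mean of sensitivity and specificity.
    strength = (sensitivity + specificity) / 2

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
        "strength": strength,
    }


# Illustrative counts only (not taken from any dataset in the abstract):
# 100 binding residues and 900 non-binding residues.
example = binary_metrics(tp=50, fp=50, tn=850, fn=50)
```

    With these made-up counts, accuracy is high because the majority class dominates, while sensitivity shows that half of the binding residues are missed; MCC and ST reflect both classes.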

    Impact of DNA methylation on trophoblast function

    The influence of epigenetics is evident in many fields of medicine today. This is also true in placentology, where versatile epigenetic mechanisms that regulate gene expression have been shown to have an important influence on trophoblast implantation and placentation. Such gene regulation can be established in different ways and on different molecular levels, the most common being DNA methylation. DNA methylation has been shown to be an important predictive component in assessing the clinical prognosis of certain malignant tumors; in addition, it opens up new possibilities for non-invasive prenatal diagnosis utilizing cell-free fetal DNA methods. By using the well-known demethylating agent 5-azacytidine in a pregnant rat model, we have been able to change gene expression and, consequently, the processes of trophoblast differentiation and placental development. In this review, we describe how changes in gene methylation affect trophoblast development and placentation, and offer our perspective on the use of trophoblast epigenetic research for better understanding of not only placental development but also cancer cell growth and invasion.

    Comparative analysis of the ATRX promoter and 5' regulatory region reveals conserved regulatory elements which are linked to roles in neurodevelopment, alpha-globin regulation and testicular function

    BACKGROUND: ATRX is a tightly regulated multifunctional protein with crucial roles in mammalian development. Mutations in the ATRX gene cause ATR-X syndrome, an X-linked recessive developmental disorder resulting in severe mental retardation and mild alpha-thalassemia with facial, skeletal and genital abnormalities. Although ATRX is ubiquitously expressed, the clinical features of the syndrome indicate that it is not likely to be a global regulator of gene expression but is instead involved in regulating specific target genes. The regulation of ATRX expression is not well understood, and this is reflected by the current lack of identified upstream regulators. The availability of genomic data from a range of species and the very highly conserved 5' regulatory regions of the ATRX gene have allowed us to investigate putative transcription factor binding sites (TFBSs) in evolutionarily conserved regions of the mammalian ATRX promoter. RESULTS: We identified 12 highly conserved TFBSs of key gene regulators involved in biologically relevant processes such as neural and testis development and alpha-globin regulation. CONCLUSIONS: Our results reveal potentially important regulatory elements in the ATRX gene which may lead to the identification of upstream regulators of ATRX and aid in the understanding of the molecular mechanisms that underlie ATR-X syndrome. This work was supported by Department of Zoology research grants.

    Acquisition of Human-Type Receptor Binding Specificity by New H5N1 Influenza Virus Sublineages during Their Emergence in Birds in Egypt

    Highly pathogenic avian influenza A virus subtype H5N1 is currently widespread in Asia, Europe, and Africa, with 60% mortality in humans. In particular, since 2009 Egypt has unexpectedly had the highest number of human cases of H5N1 virus infection, with more than 50% of the cases worldwide, but the basis for this high incidence has not been elucidated. A change in receptor binding affinity of the viral hemagglutinin (HA) from α2,3- to α2,6-linked sialic acid (SA) is thought to be necessary for H5N1 virus to become pandemic. In this study, we conducted a phylogenetic analysis of H5N1 viruses isolated between 2006 and 2009 in Egypt. The phylogenetic results showed that recent human isolates clustered disproportionately into several new H5 sublineages, suggesting that their HAs have changed their receptor specificity. Using reverse genetics, we found that these H5 sublineages have acquired an enhanced binding affinity for α2,6 SA in combination with residual affinity for α2,3 SA, and identified the amino acid mutations that produced this new receptor specificity. Recombinant H5N1 viruses with a single mutation at HA residue 192 or a double mutation at HA residues 129 and 151 had increased attachment to and infectivity in the human lower respiratory tract but not in the larynx. These findings correlated with enhanced virulence of the mutant viruses in mice. Interestingly, these H5 viruses, with increased affinity to α2,6 SA, emerged during viral diversification in bird populations and subsequently spread to humans. Our findings suggested that emergence of new H5 sublineages with α2,6 SA specificity caused a subsequent increase in human H5N1 influenza virus infections in Egypt, and provided data for understanding the virus's pandemic potential.

    The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens

    Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in terms of the volume of data analyzed and the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

    An expanded evaluation of protein function prediction methods shows an improvement in accuracy

    Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. Results: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. Conclusions: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.