17 research outputs found

    Genome-wide analysis of salt-responsive and novel microRNAs in Populus euphratica by deep sequencing

    Full text link

    An Overview of the Prediction of Protein DNA-Binding Sites

    No full text
    Interactions between proteins and DNA play an important role in many essential biological processes such as DNA replication, transcription, splicing, and repair. The identification of amino acid residues involved in DNA-binding sites is critical for understanding the mechanism of these biological activities. In the last decade, numerous computational approaches have been developed to predict protein DNA-binding sites based on protein sequence and/or structural information, which play an important role in complementing experimental strategies. At this time, approaches can be divided into three categories: sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modeling and threading. In this article, we review existing research on computational methods to predict protein DNA-binding sites, which includes data sets, various residue sequence/structural features, machine learning methods for comparison and selection, evaluation methods, performance comparison of different tools, and future directions in protein DNA-binding site prediction. In particular, we detail the meta-analysis of protein DNA-binding sites. We also propose specific implications that are likely to result in novel prediction methods, increased performance, or practical applications.

    Computational Prediction of RNA-Binding Proteins and Binding Sites

    No full text
    Protein–RNA interactions play vital roles in many cellular processes such as protein synthesis, sequence encoding, RNA transfer, and gene regulation at the transcriptional and post-transcriptional levels. Approximately 6%–8% of all proteins are RNA-binding proteins (RBPs). Distinguishing these RBPs or their binding residues is a major aim of structural biology. Previously, a number of experimental methods were developed for the determination of protein–RNA interactions. However, these experimental methods are expensive, time-consuming, and labor-intensive. Alternatively, researchers have developed many computational approaches to predict RBPs and protein–RNA binding sites, by combining various machine learning methods and abundant sequence and/or structural features. There are three kinds of computational approaches: prediction from protein sequence, prediction from protein structure, and protein–RNA docking. In this paper, we review all existing studies of predictions of RNA-binding sites and RBPs and complexes, including data sets used in different approaches, sequence and structural features used in several predictors, prediction method classifications, performance comparisons, evaluation methods, and future directions.

    Association Genetics in Populus Reveal the Allelic Interactions of Pto-MIR167a and Its Targets in Wood Formation

    No full text
    MicroRNAs (miRNAs) play crucial regulatory roles in plant growth and development by interacting with RNA molecules, including messenger RNAs (mRNAs) and long non-coding RNAs (lncRNAs); however, the genetic networks of miRNAs and their targets influencing the phenotypes of perennial trees remain to be investigated. Here, we integrated expression profiling and association analysis of underlying physiology and expression traits to dissect the allelic variations and genetic interactions of Pto-MIR167a and its targets, sponge lncRNA ARFRL, and Pto-ARF8, in 435 unrelated individuals of Populus tomentosa. Tissue-specific expression analysis in eight tissues, including stem, leaf, root, and shoot apex, revealed negative correlations between Pto-MIR167a and lncRNA ARFRL and Pto-ARF8 (r = −0.60 and −0.61, respectively, P < 0.01), and a positive correlation between sponge lncRNA ARFRL and Pto-ARF8 (r = 0.90, P < 0.01), indicating their potential regulatory roles in tree growth and wood formation. Single nucleotide polymorphism (SNP)-based association studies detected 53 significant associations (P < 0.01, Q < 0.1) representing 41 unique SNPs from the three genes and six traits, suggesting their potential roles in wood formation. Epistasis uncovered 88 pairwise interactions for 10 traits, which provided substantial evidence for genetic interactions among Pto-MIR167a, lncRNA ARFRL, and Pto-ARF8. Using gene expression-based association mapping, we also examined SNPs within the three genes that influence phenotypes by regulating the expression of Pto-ARF8. Interestingly, SNPs in the precursor region of Pto-MIR167a altered its secondary structure stability and transcription, thereby affecting the expression of its targets. In summary, we elucidated the genetic interactions between Pto-MIR167a and its targets, sponge lncRNA ARFRL, and Pto-ARF8, in tree growth and wood formation, and provide a feasible method for further investigation of multi-factor genetic networks influencing phenotypic variation in the population genetics of trees.

    A multi-modal clustering method for traditional Chinese medicine clinical data via media convergence

    No full text
    Media convergence is a media change led by technological innovation. Applying media convergence technology to the study of clustering in Chinese medicine can significantly exploit the advantages of media fusion. Obtaining consistent and complementary information among multiple modalities through media convergence can provide technical support for clustering. This article presents an approach based on Media Convergence and Graph convolution Encoder Clustering (MCGEC) for traditional Chinese medicine (TCM) clinical data. It feeds modal information and graph structure from media information into a multi-modal graph convolution encoder to obtain the media feature representation learnt from multiple modalities. MCGEC captures latent information from various modalities by fusion and optimises the feature representations and network architecture with learnt clustering labels. The experiment is conducted on real-world multi-modal TCM clinical data, including information like images and text. MCGEC has improved clustering results compared to the generic single-modal clustering methods and the current more advanced multi-modal clustering methods. MCGEC applied to TCM clinical datasets can achieve better results. Integrating multimedia features into clustering algorithms offers significant benefits compared to single-modal clustering approaches that simply concatenate features from different modalities. It provides practical technical support for multi-modal clustering in the TCM field incorporating multimedia features.