25 research outputs found

    Improved Pre-miRNAs Identification Through Mutual Information of Pre-miRNA Sequences and Structures

    Playing critical roles as post-transcriptional regulators, microRNAs (miRNAs) are a family of short non-coding RNAs that are derived from longer transcripts called precursor miRNAs (pre-miRNAs). Experimental methods to identify pre-miRNAs are expensive and time-consuming, which creates a need for computational alternatives. In recent years, the accuracy of computational methods to predict pre-miRNAs has increased significantly. However, there are still several drawbacks. First, these methods usually consider only base frequencies or sequence information while ignoring the interactions between bases. Second, feature extraction methods based on secondary structures usually consider only the global characteristics while ignoring the mutual influence of the local structures. Third, methods integrating high-dimensional feature information are computationally inefficient. In this study, we have proposed a novel mutual information-based feature representation algorithm for pre-miRNA sequences and secondary structures, which is capable of capturing the interactions between sequence bases and local features of the RNA secondary structure. In addition, the feature space is smaller than that of most popular methods, which makes our method computationally more efficient than the competitors. Finally, we applied these features to train a support vector machine model to predict pre-miRNAs and compared the results with other popular predictors. As a result, our method outperforms the others on both 5-fold cross-validation and the jackknife test.
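The abstract does not give the feature formulas, so the following is only a minimal sketch of the general idea: the standard Shannon mutual information between adjacent bases in a sequence. The function name `adjacent_mutual_information` and the choice of adjacent positions are assumptions made for illustration, not the authors' algorithm.

```python
from collections import Counter
from math import log2


def adjacent_mutual_information(seq):
    """Shannon mutual information (in bits) between neighbouring bases.

    Hypothetical illustration only: the paper's actual features are not
    specified in the abstract.
    """
    # All overlapping (base, next base) pairs in the sequence.
    pairs = [(seq[i], seq[i + 1]) for i in range(len(seq) - 1)]
    if not pairs:
        return 0.0
    n = len(pairs)
    joint = Counter(pairs)                   # counts of (x, y)
    first = Counter(a for a, _ in pairs)     # marginal counts of x
    second = Counter(b for _, b in pairs)    # marginal counts of y
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((first[a] / n) * (second[b] / n)))
    return mi


if __name__ == "__main__":
    print(adjacent_mutual_information("ACACA"))
    print(adjacent_mutual_information("GGCUAGCUAGGAUCCGAUUA"))
```

A value of zero means that knowing one base says nothing about its neighbour. Larger values indicate stronger dependence, which is the kind of signal that plain base frequencies cannot capture.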

    Novel and favorable QTL allele clusters for end-use quality revealed by introgression lines derived from synthetic wheat

    Wheat quality is an important target trait. Previous studies have focused mainly on storage proteins, but these contribute only partially to quality, and most loci for quality remain undetected. Wild species of wheat are valuable resources for wheat improvement, and introgression lines (ILs) are ideal materials for detecting quantitative trait loci (QTL). In this study, a set of 82 BC5 F2-6 ILs, carrying a range of introgressed chromosome segments from a synthetic hexaploid wheat Am3 (Triticum carthlicum × Aegilops tauschii), was developed and genotyped with 170 microsatellite markers. QTL analysis was performed for 14 parameters associated with end-use quality of wheat (sodium dodecyl sulfate sedimentation volume, grain protein content (GPC), grain hardness and 11 mixograph parameters), using the materials harvested in three environments. This led to the detection of 116 QTL, with c. 95% of the positive alleles contributed by Am3. Six important and novel genomic regions for bread-making quality were found on chromosomes 2D, 3A, 4A, 4B, 5A and 6A. These loci for bread-making quality showed pleiotropy and had large positive effects on several quality parameters with no or very weak negative effect on grain yield, thus demonstrating the value of synthetic wheat as a source of useful genetic variation for the improvement of bread wheat quality.

    Prediction of Protein Subcellular Localization Based on Fusion of Multi-view Features

    The prediction of protein subcellular localization is critical for inferring protein functions, gene regulation and protein-protein interactions. With the advances of high-throughput sequencing technologies and proteomic methods, the protein sequences of numerous yeasts have become publicly available, which enables us to computationally predict yeast protein subcellular localization. However, widely used protein sequence representation techniques, such as amino acid composition and Chou’s pseudo amino acid composition (PseAAC), have difficulty extracting adequate information about the interactions between residues and the position distribution of each residue. Therefore, novel sequence representations are still urgently needed. In this study, we present two novel protein sequence representation techniques: Generalized Chaos Game Representation (GCGR), based on the frequency and distributions of the residues in the protein primary sequence, and novel statistics and information theory (NSI), reflecting local position information of the sequence. In the GCGR + NSI representation, a protein primary sequence is represented by a 5-dimensional feature vector, while other popular methods like PseAAC and dipeptide composition adopt features with hundreds of dimensions. In practice, the feature representation is highly efficient in predicting protein subcellular localization. Even without using machine learning-based classifiers, a simple model based on the feature vector can achieve prediction accuracies of 0.8825 and 0.7736 for the CL317 and ZW225 datasets, respectively. To further evaluate the effectiveness of the proposed encoding schemes, we introduce a multi-view features-based method to combine the two above-mentioned features with other well-known features including PseAAC and dipeptide composition, and use a support vector machine as the classifier to predict protein subcellular localization. This novel model achieves prediction accuracies of 0.927 and 0.871 for the CL317 and ZW225 datasets, respectively, better than other existing methods in the jackknife tests. The results suggest that the GCGR and NSI features are useful complements to popular protein sequence representations in predicting yeast protein subcellular localization. Finally, we validate a few newly predicted protein subcellular localizations with evidence from published articles in authoritative journals and books.
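For scale, the conventional baselines named in the abstract can be sketched as follows. This illustrates only the standard amino acid composition (20 values) and dipeptide composition (400 values), not the GCGR or NSI features, whose definitions the abstract does not give. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(seq):
    """Relative frequency of each standard residue (20 values)."""
    counts = Counter(r for r in seq if r in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Relative frequency of each adjacent residue pair (400 values)."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Skip pairs containing non-standard symbols such as X.
    counts = Counter(
        p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS
    )
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


if __name__ == "__main__":
    seq = "MKVLAAGIVG"  # made-up example sequence
    print(len(amino_acid_composition(seq)), len(dipeptide_composition(seq)))
```

The 400-value dipeptide vector is mostly zeros for a short protein, which is one reason a compact representation such as the 5-dimensional vector described above can be attractive.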

    MDIG-mediated H3K9me3 demethylation upregulates Myc by activating OTX2 and facilitates liver regeneration

    The mineral dust-induced gene (MDIG) comprises a conserved JmjC domain and has the ability to demethylate histone H3 lysine 9 trimethylation (H3K9me3). Previous studies have indicated the significance of MDIG in promoting cell proliferation by modulating cell-cycle transition. However, its involvement in liver regeneration has not been extensively investigated. In this study, we generated mice with liver-specific knockout of MDIG and applied partial hepatectomy or carbon tetrachloride mouse models to investigate the biological contribution of MDIG in liver regeneration. The MDIG levels showed initial upregulation followed by downregulation as the recovery progressed. Genetic MDIG deficiency resulted in dramatically impaired liver regeneration and delayed cell cycle progression. However, the MDIG-deleted liver was eventually restored after a long latency. RNA-seq analysis revealed Myc as a crucial effector downstream of MDIG. However, ATAC-seq identified reduced chromatin accessibility at the OTX2 locus in MDIG-ablated regenerating liver, with unaltered chromatin accessibility at the Myc locus. Mechanistically, MDIG altered chromatin accessibility to allow transcription by demethylating H3K9me3 at the OTX2 promoter region. As a consequence, binding of the transcription factor OTX2 at the Myc promoter region was decreased in MDIG-deficient hepatocytes, which in turn repressed Myc expression. Reciprocally, Myc enhanced MDIG expression by regulating MDIG promoter activity, forming a positive feedback loop to sustain hepatocyte proliferation. Altogether, our results prove the essential role of MDIG in facilitating liver regeneration via regulating histone methylation to alter chromatin accessibility and provide valuable insights into the epi-transcriptomic regulation during liver regeneration.

    Transcriptomic and metabolomic analysis provides insights into lignin biosynthesis and accumulation and differences in lodging resistance in hybrid wheat

    The use of hybrid wheat is one way to improve yield in the future. However, greater plant heights increase lodging risk to some extent. In this study, two hybrid combinations with differences in lodging resistance were used to analyze the stem-related traits during the filling stage, and to investigate the mechanism of the difference in lodging resistance by analyzing lignin synthesis of the basal second internode (BSI). The stem-related traits such as the breaking strength, stem pole substantial degree (SPSD), and rind penetration strength (RPS), as well as the lignin content of the lodging-resistant combination (LRC), were significantly higher than those of the lodging-sensitive combination (LSC). The phenylpropanoid biosynthesis pathway was significantly and simultaneously enriched according to the transcriptomics and metabolomics analysis at the later filling stage. A total of 35 critical regulatory genes involved in the phenylpropanoid pathway were identified. Moreover, 42% of the identified genes were significantly and differentially expressed at the later grain-filling stage between the two combinations, among which more than 80% were strongly up-regulated at that stage in the LRC compared with the LSC. In contrast, the LRC displayed lower contents of lignin intermediate metabolites than the LSC. These results suggested that the key to the lodging resistance formation of the LRC is largely the higher lignin synthesis at the later grain-filling stage. Finally, breeding strategies for synergistically improving plant height and lodging resistance of hybrid wheat were put forward by comparing the LRC with the conventional wheat applied in large areas.