125 research outputs found

    Multivariate Information Fusion With Fast Kernel Learning to Kernel Ridge Regression in Predicting LncRNA-Protein Interactions

    Long non-coding RNAs (lncRNAs) constitute a large class of transcribed RNA molecules. They are longer than 200 nucleotides and do not encode proteins. They play an important role in regulating gene expression by interacting with the homologous RNA-binding proteins. Because wet-lab experimental methods are laborious and time-consuming, computational approaches for predicting lncRNA-protein interactions (LPI) deserve more attention. An in-depth review of state-of-the-art in silico investigations leads to the conclusion that there is still room for improving accuracy and speed. This paper proposes a novel method for identifying LPI by employing Kernel Ridge Regression based on Fast Kernel Learning (LPI-FKLKRR). The approach uses four distinct similarity measures for the lncRNA space and the protein space, respectively. Notably, we extract Gene Ontology (GO) information for proteins in order to improve the quality of information in the protein space. To integrate the heterogeneous kernels, Fast Kernel Learning (FastKL) is applied to optimize the kernel weights. The final predicted associations are then obtained with Kernel Ridge Regression (KRR). Experimental results show that LPI-FKLKRR performs well compared with other LPI prediction schemes. On the benchmark dataset, the proposed model LPI-FKLKRR obtains the best Area Under the Precision-Recall Curve (AUPR) of 0.6950, outperforming the integrated LPLNP (AUPR: 0.4584), RWR (AUPR: 0.2827), CF (AUPR: 0.2357), LPIHN (AUPR: 0.2299), and LPBNI (AUPR: 0.3302). Combined with the experimental results of a case study on a novel dataset, it is anticipated that LPI-FKLKRR will be a useful tool for LPI prediction.
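
The pipeline in the abstract above (several similarity kernels per space, a weighted combination, then kernel ridge regression) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the uniform weights stand in for the FastKL weights, the random RBF kernels stand in for the real similarity measures, and the two-step kernel ridge regression formula is one common choice for bipartite prediction, not necessarily the paper's exact formulation.

```python
import numpy as np

def combine_kernels(kernels, weights):
    """Weighted sum of same-shaped kernel matrices; weights are normalized to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * K for wi, K in zip(w, kernels))

def two_step_krr(K_rna, K_prot, Y, lam=1.0):
    """Two-step kernel ridge regression (an assumed formulation):
    F = K_rna (K_rna + lam*I)^-1 Y (K_prot + lam*I)^-1 K_prot
    """
    n, m = Y.shape
    A = np.linalg.solve(K_rna + lam * np.eye(n), Y)        # (K_rna + lam*I)^-1 Y
    B = np.linalg.solve(K_prot + lam * np.eye(m), A.T).T   # A (K_prot + lam*I)^-1, using symmetry
    return K_rna @ B @ K_prot

def rbf_kernel(X, gamma=0.5):
    """Gaussian similarity between the rows of X (hypothetical stand-in for real similarity measures)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

# Toy data: 5 lncRNAs, 4 proteins, four kernels per space (all synthetic).
rng = np.random.default_rng(0)
n_rna, n_prot = 5, 4
rna_kernels = [rbf_kernel(rng.normal(size=(n_rna, 3))) for _ in range(4)]
prot_kernels = [rbf_kernel(rng.normal(size=(n_prot, 3))) for _ in range(4)]
Y = (rng.random((n_rna, n_prot)) > 0.7).astype(float)      # known interactions (0/1)

# Uniform weights are a placeholder for weights learned by a kernel-learning step.
K_rna = combine_kernels(rna_kernels, [1, 1, 1, 1])
K_prot = combine_kernels(prot_kernels, [1, 1, 1, 1])
scores = two_step_krr(K_rna, K_prot, Y, lam=0.5)            # higher score = more likely interaction
print(scores.round(3))
```

In this sketch the regularization strength `lam` and the kernel weights are the only tunable parts; a learned weighting would simply replace the uniform list passed to `combine_kernels`.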

    FKRR-MVSF: A Fuzzy Kernel Ridge Regression Model for Identifying DNA-Binding Proteins by Multi-View Sequence Features via Chou's Five-Step Rule

    DNA-binding proteins play an important role in cell metabolism. In biological laboratories, the detection methods for DNA-binding proteins include yeast one-hybrid methods, bacterial singles, X-ray crystallography methods and others, but these methods involve a lot of labor, material and time. In recent years, many computation-based approaches have been proposed to detect DNA-binding proteins. In this paper, a machine learning-based method, called the Fuzzy Kernel Ridge Regression model based on Multi-View Sequence Features (FKRR-MVSF), is proposed to identify DNA-binding proteins. First, multi-view sequence features are extracted from protein sequences. Next, a Multiple Kernel Learning (MKL) algorithm is employed to combine the multiple features. Finally, a Fuzzy Kernel Ridge Regression (FKRR) model is built to detect DNA-binding proteins. Compared with other methods, our model achieves good results. Our method obtains accuracies of 83.26% and 81.72% on two benchmark datasets (PDB1075 and PDB186), respectively.

    MIPS-Fusion: Multi-Implicit-Submaps for Scalable and Robust Online Neural RGB-D Reconstruction

    We introduce MIPS-Fusion, a robust and scalable online RGB-D reconstruction method based on a novel neural implicit representation -- multi-implicit-submap. Different from existing neural RGB-D reconstruction methods lacking either flexibility with a single neural map or scalability due to extra storage of feature grids, we propose a pure neural representation tackling both difficulties with a divide-and-conquer design. In our method, neural submaps are incrementally allocated alongside the scanning trajectory and efficiently learned with local neural bundle adjustments. The submaps can be refined individually in a back-end optimization and optimized jointly to realize submap-level loop closure. Meanwhile, we propose a hybrid tracking approach combining randomized and gradient-based pose optimizations. For the first time, randomized optimization is made possible in neural tracking with several key designs to the learning process, enabling efficient and robust tracking even under fast camera motions. The extensive evaluation demonstrates that our method attains higher reconstruction quality than the state of the art for large-scale scenes and under fast camera motions.

    MDA-SKF: Similarity Kernel Fusion for Accurately Discovering miRNA-Disease Association

    Identifying accurate associations between miRNAs and diseases is beneficial for diagnosis and treatment of human diseases. It is especially important to develop an efficient method to detect the association between miRNA and disease. Traditional experimental methods have high precision, but the process is complicated and time-consuming. Various computational methods have been developed to uncover potential associations based on an assumption that similar miRNAs are always related to similar diseases. In this paper, we propose an accurate method, MDA-SKF, to uncover potential miRNA-disease associations. We first extract three miRNA similarity kernels (miRNA functional similarity, miRNA sequence similarity, Hamming profile similarity for miRNA) and three disease similarity kernels (disease semantic similarity, disease functional similarity, Hamming profile similarity for disease) in two subspaces, respectively. Then, because some initial information may be lost in the process and some noise may exist in the integrated similarity kernel, we propose a novel Similarity Kernel Fusion (SKF) method to integrate multiple similarity kernels. Finally, we utilize the Laplacian Regularized Least Squares (LapRLS) method on the integrated kernel to find potential associations. MDA-SKF is evaluated with three methods: global leave-one-out cross validation (LOOCV), local LOOCV, and 5-fold cross validation (CV), achieving AUCs of 0.9576, 0.8356, and 0.9557, respectively. Compared with seven existing methods, MDA-SKF has outstanding performance on global LOOCV and 5-fold CV. We also conduct case studies to further analyze the performance of MDA-SKF on 32 diseases. Furthermore, 3200 candidate associations are obtained and a majority of them can be confirmed. This demonstrates that MDA-SKF is an accurate and efficient computational tool for guiding traditional experiments.

    Discovering Cancer Subtypes via an Accurate Fusion Strategy on Multiple Profile Data

    Discovering cancer subtypes is useful for guiding clinical treatment of multiple cancers. Progressive profile technologies for tissue have accumulated diverse types of data. Based on these types of expression data, various computational methods have been proposed to predict cancer subtypes. It is crucial to study how to better integrate these multiple profiles of data. In this paper, we collect multiple profiles of data for five cancers from The Cancer Genome Atlas (TCGA). Then, we construct three similarity kernels for all patients of the same cancer from gene expression, miRNA expression and isoform expression data. We also propose a novel unsupervised multiple kernel fusion method, Similarity Kernel Fusion (SKF), in order to integrate the three similarity kernels into one combined kernel. Finally, we apply spectral clustering to the integrated kernel to predict cancer subtypes. In the experimental results, the P-values from the Cox regression model and survival curve analysis are used to evaluate the performance of predicted subtypes on three datasets. Our kernel fusion method, SKF, has outstanding performance compared with single kernel and other multiple kernel fusion strategies. This demonstrates that our method can identify more accurate subtypes across various kinds of cancer. Our cancer subtype prediction method can identify essential genes and biomarkers for disease diagnosis and prognosis, and we also discuss the possible side effects of therapies and treatment.
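
The last two steps described above (fuse several patient similarity kernels, then cluster on the fused kernel) can be sketched as follows. This is a minimal illustration under stated assumptions: a plain average stands in for SKF, which is not reproduced here, the three toy kernels are synthetic, and the clustering shown is the two-cluster special case of spectral clustering based on the Fiedler vector of the normalized Laplacian.

```python
import numpy as np

def average_fusion(kernels):
    """Placeholder fusion: element-wise mean of the kernels (NOT the SKF method)."""
    return np.mean(np.stack(kernels), axis=0)

def spectral_bipartition(K):
    """Split samples into two groups using the normalized graph Laplacian.

    K is a symmetric, non-negative similarity matrix with positive row sums.
    The sign pattern of the eigenvector with the second-smallest eigenvalue
    (the Fiedler vector) gives the two clusters.
    """
    d = K.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(K)) - d_inv_sqrt[:, None] * K * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]
    return (fiedler > 0).astype(int)

def block_kernel(between):
    """Toy similarity for 6 patients: two groups of 3, weak similarity between groups."""
    K = np.full((6, 6), between)
    K[:3, :3] = 1.0
    K[3:, 3:] = 1.0
    return K

# Three synthetic "views" standing in for gene, miRNA and isoform expression kernels.
views = [block_kernel(0.01), block_kernel(0.05), block_kernel(0.10)]
fused = average_fusion(views)
labels = spectral_bipartition(fused)
print(labels)
```

Which group receives label 0 or 1 is arbitrary, since the sign of an eigenvector is not determined; only the grouping itself is meaningful.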

    Generation of 100 m, Hourly Land Surface Temperature Based on Spatio-Temporal Fusion

    Land surface temperature (LST) is an important physical quantity for global climate change monitoring. Over the past decades, several LST products have been produced from satellite thermal infrared (TIR) bands or land surface models (LSMs). Recent research has increased the spatio-temporal resolution of LST products to 2 km, hourly, based on Geostationary Operational Environmental Satellites (GOES)-R Advanced Baseline Imager (ABI) LST data. The spatial resolution of 2 km, however, is insufficient for monitoring at the regional scale. This paper investigates the feasibility of applying spatio-temporal fusion to generate reliable 100 m, hourly LST data based on fusion of the newly released 2 km, hourly GOES-16 ABI LST and 100 m Landsat LST data. The most accurate fusion method was identified through a comparison between several popular methods. Furthermore, a comprehensive comparison was performed between fusion (with Landsat LST) involving satellite-derived LST (i.e., GOES) and model-derived LSMs (i.e., European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5)-Land). The spatial and temporal adaptive reflectance fusion model (STARFM) method was demonstrated to be an appropriate method to generate 100 m, hourly data, producing an average root mean square error (RMSE) of 2.640 K, mean absolute error (MAE) of 2.159 K and average coefficient of determination (R²) of 0.982 with reference to the in situ time-series. Furthermore, inheriting the advantages of direct observation, the fusion of Landsat and GOES for the generation of 100 m, hourly LST produced greater accuracy than the fusion of Landsat and ERA5-Land LST in the experiments. The generated 100 m, hourly LST can provide important diurnal data with fine spatial resolution for various monitoring applications.
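
The three accuracy measures quoted above (RMSE, MAE and the coefficient of determination) can be computed as in the sketch below. The function name and the toy temperature values are hypothetical, and the coefficient of determination is taken here as 1 - SS_res / SS_tot; the paper may instead use the squared correlation coefficient, which can give a different value.

```python
import numpy as np

def lst_error_metrics(observed, predicted):
    """Return (RMSE, MAE, R^2) between in situ observations and fused predictions.

    R^2 is computed as 1 - SS_res / SS_tot (an assumed definition).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = predicted - observed
    rmse = np.sqrt(np.mean(residual ** 2))
    mae = np.mean(np.abs(residual))
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2

# Toy hourly temperatures in kelvin (synthetic values, not from the paper).
in_situ = [290.0, 292.0, 295.0, 299.0, 301.0]
fused = [291.0, 291.5, 296.0, 298.0, 302.5]
rmse, mae, r2 = lst_error_metrics(in_situ, fused)
print(f"RMSE={rmse:.3f} K, MAE={mae:.3f} K, R2={r2:.3f}")
```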

    Identification of wounding and topping responsive small RNAs in tobacco (Nicotiana tabacum)

    Background: MicroRNAs (miRNAs) and short interfering RNAs (siRNAs) are two major classes of small RNAs. They play important regulatory roles in plants and animals by regulating transcription, stability and/or translation of target genes in a sequence-complementary dependent manner. Over 4,000 miRNAs and several classes of siRNAs have been identified in plants, but in tobacco only computational prediction has been performed and no tobacco-specific miRNA has been experimentally identified. Wounding is believed to induce defensive response in tobacco, but the mechanism responsible for this response is yet to be uncovered. Results: To get insight into the role of small RNAs in damage-induced responses, we sequenced and analysed small RNA populations in roots and leaves from wounding or topping treated tobacco plants. In addition to confirmation of expression of 27 known miRNA families, we identified 59 novel tobacco-specific miRNA members of 38 families and a large number of loci generating phased 21- or 24-nt small RNAs (including ta-siRNAs). A number of miRNAs and phased small RNAs were found to be responsive to wounding or topping treatment. Targets of small RNAs were further surveyed by degradome sequencing. Conclusions: The expression changes of miRNAs and phased small RNAs responsive to wounding or topping and identification of defense related targets for these small RNAs suggest that the inducible defense response in tobacco might be controlled by pathways involving small RNAs.
