48 research outputs found

    Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme

    BACKGROUND: Gene expression profiling has become a useful biological resource in recent years and plays an important role in a broad range of areas in biology. Raw gene expression data, usually in the form of a large matrix, may contain missing values, so downstream analysis methods that require a complete matrix as input cannot be applied directly. Several methods have been developed to solve this problem, such as K-nearest-neighbor imputation and Bayesian principal component analysis imputation. In this paper, we introduce a novel imputation approach based on Support Vector Regression (SVR). The proposed approach uses an orthogonal input coding scheme, which exploits multiple missing values in one row of a gene expression profile and casts the imputation into a much higher-dimensional space, to obtain better performance. RESULTS: We present a comparative study of our method against previously developed methods for estimating missing values on six gene expression data sets. Among the three input-vector coding schemes we tried, the orthogonal input coding scheme gives the best estimates, with the minimum Normalized Root Mean Squared Error (NRMSE). The results also show that the SVR method estimates well on different kinds of data sets, with relatively small NRMSE. CONCLUSION: The SVR imputation method performs better than, or at least comparably to, the previously developed methods examined in this study. Its estimation ability is partly due to the orthogonal input coding scheme, which makes the most use of missing-value information. In addition, the solid theoretical foundation of SVR contributes to estimation performance together with the orthogonal input coding scheme. The estimation ability demonstrated in the results suggests that the proposed approach provides a suitable solution to the missing value estimation problem. The source code of the SVR method is available for non-commercial use.
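The abstract above ranks imputation methods by NRMSE but does not define it. As a minimal sketch, assuming one common definition from the microarray imputation literature (root mean squared error over the imputed entries, divided by the standard deviation of the true values), it can be computed as follows. The function name `nrmse` and the choice of the population standard deviation are assumptions for illustration, not taken from the paper.

```python
import math
import statistics


def nrmse(true_values, imputed_values):
    """Normalized RMSE: RMSE of the imputed entries divided by the
    population standard deviation of the true entries (assumed definition)."""
    if len(true_values) != len(imputed_values):
        raise ValueError("inputs must have the same length")
    if len(true_values) < 2:
        raise ValueError("need at least two values")
    # Mean squared difference between imputed and true entries.
    mse = sum((g - t) ** 2 for g, t in zip(imputed_values, true_values)) / len(true_values)
    # Spread of the true entries, used as the normalizer.
    spread = statistics.pstdev(true_values)
    if spread == 0:
        raise ValueError("true values have zero variance; NRMSE is undefined")
    return math.sqrt(mse) / spread


# Entries that were artificially removed, and two candidate imputations.
truth = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]
mean_fill = [2.5, 2.5, 2.5, 2.5]  # every entry replaced by the mean of truth

print(nrmse(truth, perfect))
print(nrmse(truth, mean_fill))
```

Under this definition a perfect imputation scores 0 and filling every entry with the mean of the true values scores 1, so values well below 1 indicate that a method recovers structure beyond the mean.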

    PPSP: prediction of PK-specific phosphorylation site with Bayesian decision theory

    BACKGROUND: As a reversible and dynamic post-translational modification (PTM) of proteins, phosphorylation plays essential regulatory roles in a broad spectrum of biological processes. Although many studies have addressed the molecular mechanism of phosphorylation dynamics, the intrinsic features of substrate specificity are still elusive and remain to be delineated. RESULTS: In this work, we present a novel, versatile and comprehensive program, PPSP (Prediction of PK-specific Phosphorylation site), built on Bayesian decision theory (BDT). PPSP can accurately predict potential phosphorylation sites for ~70 PK (Protein Kinase) groups. PPSP is more accurate and powerful than four existing tools: Scansite, NetPhosK, KinasePhos and GPS. Moreover, PPSP also provides predictions for many novel PKs, such as TRK, mTOR, SyK and MET/RON. The prediction accuracy for these novel PKs is also satisfactory. CONCLUSION: Taken together, we propose that PPSP could be a powerful tool for experimentalists who focus on phosphorylation substrates and the identification of their PK-specific sites. Moreover, the BDT strategy could also be a general approach for other PTMs, such as sumoylation and ubiquitination.

    LOCSVMPSI: a web server for subcellular localization of eukaryotic proteins using SVM and profile of PSI-BLAST

    The subcellular location of a protein is one of its key functional characteristics, as proteins must be localized correctly at the subcellular level to function normally. In this paper, we introduce a novel method named LOCSVMPSI, which is based on the support vector machine (SVM) and the position-specific scoring matrix generated from PSI-BLAST profiles. In a jackknife test on the RH2427 data set, LOCSVMPSI achieved an overall prediction accuracy of 90.2%, which is higher than the results of SubLoc and ESLpred on this data set. In addition, the prediction performance of LOCSVMPSI was evaluated with a 5-fold cross-validation test on the PK7579 data set, and the results were consistently better than those of the previous method based on several SVMs using the composition of both amino acids and amino acid pairs. A further test on the SWISSPROT new-unique data set showed that LOCSVMPSI also performed better than some widely used prediction methods, such as PSORTII, TargetP and LOCnet. All these results indicate that LOCSVMPSI is a powerful tool for the prediction of eukaryotic protein subcellular localization. An online web server (current version 1.3) based on this method has been developed and is freely available to both academic and commercial users.

    Ghrelin Regulates Cyclooxygenase-2 Expression and Promotes Gastric Cancer Cell Progression

    Aim. To investigate the molecular mechanism by which ghrelin affects apoptosis, migration, and invasion of gastric cancer (GC) cells. Methods. After GC AGS cells were treated with ghrelin (10⁻⁸ M), the cyclooxygenase-2 inhibitor NS398 (100 μM), and the Akt inhibitor perifosine (10 μM), the rates of apoptosis were measured by TUNEL assay and flow cytometry. The expression of PI3K, p-Akt, and COX-2 proteins was assessed by Western blot analysis. Cell migration and invasion were measured by wound-healing and transwell assays. Results. Migration and invasion were increased in ghrelin-treated cells, while the rates of apoptosis were decreased. GC AGS cells treated with ghrelin showed an increase in protein expression of p-Akt, PI3K, and COX-2. After cells were treated with the Akt inhibitor perifosine, the protein expression of p-Akt, PI3K, and COX-2 and the levels of cell migration, invasion, and apoptosis were partly restored. After cells were treated with the cyclooxygenase-2 inhibitor NS398, the protein expression of COX-2 and cell migration and invasion were decreased, while the rates of apoptosis were increased. Conclusion. Ghrelin regulates cell migration, invasion, and apoptosis in GC cells by targeting PI3K/Akt/COX-2. Ghrelin increases the expression of COX-2 in GC cells by targeting PI3K/Akt. Ghrelin is suggested to be one of the molecular targets in GC.

    Reconstruction and analysis of transcription factor-miRNA co-regulatory feed-forward loops in human cancers using filter-wrapper feature selection

    BACKGROUND: As one of the most common types of co-regulatory motifs, feed-forward loops (FFLs) control many cell functions and play an important role in human cancers. It is therefore crucial to reconstruct and analyze cancer-related FFLs that are controlled simultaneously by a transcription factor (TF) and a microRNA (miRNA), in order to find out how miRNAs and TFs cooperate in cancer cells and how they contribute to carcinogenesis. Current FFL studies rely on predicted regulation information and therefore suffer from false positives in their prediction results. More critically, FFLs generated by existing approaches cannot represent the dynamic and conditional regulatory relationships that hold under different experimental conditions. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we propose a novel filter-wrapper feature selection method that accurately identifies co-regulatory mechanisms by combining prior information from predicted regulatory interactions with parallel miRNA/mRNA expression datasets. By applying this method, we reconstructed 208 and 110 TF-miRNA co-regulatory FFLs from human pan-cancer and prostate datasets, respectively. Further analysis of these cancer-related FFLs showed that the top-ranking TF STAT3 and miRNA hsa-let-7e are key regulators implicated in human cancers, and that their regulated targets are significantly enriched in cellular process regulation and signaling pathways involved in carcinogenesis. CONCLUSIONS/SIGNIFICANCE: In this study, we introduce an efficient computational approach to reconstruct co-regulatory FFLs by accurately identifying gene co-regulatory interactions. The strength of the proposed feature selection method lies in its ability to filter out false positives in predicted regulatory interactions by quantitatively modeling the complex co-regulation of target genes mediated simultaneously by TFs and miRNAs. Moreover, the proposed feature selection method can be applied generally to other gene regulation studies that use parallel expression data in different biological contexts.
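To make the motif in the abstract above concrete, here is a minimal sketch of enumerating TF-miRNA feed-forward loops from two regulatory edge maps. It only illustrates the loop structure (a TF and a miRNA share a target gene, and at least one of the two regulates the other); it is not the paper's filter-wrapper feature selection method. The function `find_ffls`, the loop labels, and all node names are hypothetical.

```python
def find_ffls(tf_targets, mirna_targets):
    """Enumerate TF-miRNA feed-forward loops.

    tf_targets:    dict mapping each TF to the set of nodes it regulates
                   (genes and/or miRNAs).
    mirna_targets: dict mapping each miRNA to the set of nodes it regulates
                   (genes, possibly including TFs).
    Returns a list of (tf, mirna, target_gene, kind) tuples.
    """
    loops = []
    for tf, tf_regs in tf_targets.items():
        for mirna, mir_regs in mirna_targets.items():
            tf_to_mirna = mirna in tf_regs
            mirna_to_tf = tf in mir_regs
            # A loop needs a regulatory edge between the TF and the miRNA.
            if not (tf_to_mirna or mirna_to_tf):
                continue
            if tf_to_mirna and mirna_to_tf:
                kind = "composite"
            elif tf_to_mirna:
                kind = "TF-driven"
            else:
                kind = "miRNA-driven"
            # Shared targets close the loop; exclude the two regulators.
            shared = (tf_regs & mir_regs) - {tf, mirna}
            for gene in sorted(shared):
                loops.append((tf, mirna, gene, kind))
    return loops


# Hypothetical toy network, not data from the paper.
tf_edges = {"TF_A": {"miR_X", "gene1", "gene2"}}
mirna_edges = {"miR_X": {"gene1", "gene3"}}

for loop in find_ffls(tf_edges, mirna_edges):
    print(loop)
```

In practice the edge maps would come from predicted interactions, which is where the false positives discussed in the abstract enter; a sketch like this would enumerate candidate loops before any expression-based filtering.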

    TAFFYS: An Integrated Tool for Comprehensive Analysis of Genomic Aberrations in Tumor Samples

    BACKGROUND: The tumor single nucleotide polymorphism (SNP) array is a common platform for investigating cancer genomic aberrations and functionally important altered genes. Original SNP array signals are usually corrupted by noise and need to be de-convoluted into an absolute copy number profile by analytical methods. Unfortunately, despite the popularity of the tumor Affymetrix SNP array, methods specifically designed for this platform are still limited. The complicated characteristics of noise in the signals are one of the difficulties in dissecting tumor Affymetrix SNP array data, as the noise inevitably blurs the distinction between aberrations and creates an obstacle to copy number aberration (CNA) identification. RESULTS: We propose a tool named TAFFYS for comprehensive analysis of tumor Affymetrix SNP array data. TAFFYS introduces a wavelet-based de-noising approach and a copy number-specific signal variance model for suppressing and modelling the noise in the signals. A hidden Markov model is then employed for copy number inference. Finally, using the absolute copy number profile, the statistical significance of each aberration region is calculated in terms of different aberration types, including amplification, deletion and loss of heterozygosity (LOH). The results show that the copy number-specific variance model and the wavelet de-noising algorithm fit the Affymetrix SNP array signals well, leading to more accurate estimation for diluted tumor samples (even with only 30% cancer cells) than other existing methods. The results also demonstrate good compatibility and extensibility across different Affymetrix SNP array platforms. Application to 35 breast tumor samples shows that TAFFYS can automatically dissect the tumor samples and reveal statistically significant aberration regions where cancer-related genes are located. CONCLUSIONS: TAFFYS provides an efficient and convenient tool for identifying copy number alterations and allelic imbalance and for assessing recurrent aberrations in tumor Affymetrix SNP array data.

    Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme-0

    Copyright information: Taken from "Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme". BMC Bioinformatics 2006;7:32. Published online 22 Jan 2006. PMCID: PMC1403803. l axes, respectively