47 research outputs found

    Word correlation matrices for protein sequence analysis and remote homology detection

    Background: Classification of protein sequences is a central problem in computational biology. Currently, among computational methods discriminative kernel-based approaches provide the most accurate results. However, kernel-based methods often lack an interpretable model for analysis of discriminative sequence features, and predictions on new sequences usually are computationally expensive.
    Results: In this work we present a novel kernel for protein sequences based on average word similarity between two sequences. We show that this kernel gives rise to a feature space that allows analysis of discriminative features and fast classification of new sequences. We demonstrate the performance of our approach on a widely-used benchmark setup for protein remote homology detection.
    Conclusion: Our word correlation approach provides highly competitive performance as compared with state-of-the-art methods for protein remote homology detection. The learned model is interpretable in terms of biologically meaningful features. In particular, analysis of discriminative words allows the identification of characteristic regions in biological sequences. Because of its high computational efficiency, our method can be applied to ranking of potential homologs in large databases.

    Physicochemical property distributions for accurate and rapid pairwise protein homology detection

    Background: The challenge of remote homology detection is that many evolutionarily related sequences have very little similarity at the amino acid level. Kernel-based discriminative methods, such as support vector machines (SVMs), that use vector representations of sequences derived from sequence properties have been shown to have superior accuracy when compared to traditional approaches for the task of remote homology detection.
    Results: We introduce a new method for feature vector representation based on the physicochemical properties of the primary protein sequence. A distribution of physicochemical property scores is assembled from 4-mers of the sequence and normalized based on the null distribution of the property over all possible 4-mers. With this approach there is little computational cost associated with the transformation of the protein into feature space, and overall performance in terms of remote homology detection is comparable with current state-of-the-art methods. We demonstrate that the features can be used for the task of pairwise remote homology detection with improved accuracy versus sequence-based methods such as BLAST and other feature-based methods of similar computational cost.
    Conclusions: A protein feature method based on physicochemical properties is a viable approach for extracting features in a computationally inexpensive manner while retaining the sensitivity of SVM protein homology detection. Furthermore, identifying features that can be used for generic pairwise homology detection in lieu of family-based homology detection is important for applications such as large database searches and comparative genomics.

    A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis

    Background: Protein remote homology detection and fold recognition are central problems in bioinformatics. Currently, discriminative methods based on support vector machines (SVMs) are the most effective and accurate methods for solving these problems. A key step in improving the performance of SVM-based methods is to find a suitable representation of protein sequences.
    Results: In this paper, a novel building block of proteins called Top-n-grams is presented, which contains the evolutionary information extracted from the protein sequence frequency profiles. The protein sequence frequency profiles are calculated from the multiple sequence alignments output by PSI-BLAST and converted into Top-n-grams. The protein sequences are transformed into fixed-dimension feature vectors by the number of occurrences of each Top-n-gram. The training vectors are used to train SVM classifiers, which are then used to classify the test protein sequences. We demonstrate that the prediction performance of remote homology detection and fold recognition can be improved by combining Top-n-grams and latent semantic analysis (LSA), which is an efficient feature extraction technique from natural language processing. When tested on superfamily and fold benchmarks, the method combining Top-n-grams and LSA gives significantly better results compared to related methods.
    Conclusion: The method based on Top-n-grams significantly outperforms the methods based on many other building blocks, including N-grams, patterns, motifs and binary profiles. Therefore, the Top-n-gram is a good building block of protein sequences and can be widely used in many tasks in computational biology, such as sequence alignment, the prediction of domain boundaries, the design of knowledge-based potentials and the prediction of protein binding sites.
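    The counting step described in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: that a Top-n-gram at a profile position is the n most frequent amino acids at that position, ordered by decreasing frequency; that ties are broken alphabetically; and that the vocabulary is every ordered tuple of n distinct residues. The names `top_n_grams`, `vocabulary`, `feature_vector` and `example_profile` are hypothetical, and the toy profile is made up for illustration rather than taken from PSI-BLAST. The LSA and SVM steps are not shown.

```python
from collections import Counter
from itertools import permutations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def top_n_grams(profile, n=2):
    """Return one Top-n-gram per profile position.

    `profile` is a list of dicts, one per sequence position, mapping an
    amino acid letter to its frequency in the alignment column.
    Assumption: the Top-n-gram is the n most frequent residues, in
    descending order of frequency, ties broken alphabetically.
    """
    grams = []
    for column in profile:
        ranked = sorted(AMINO_ACIDS, key=lambda aa: (-column.get(aa, 0.0), aa))
        grams.append("".join(ranked[:n]))
    return grams


def vocabulary(n=2):
    """All ordered tuples of n distinct residues (an assumed vocabulary)."""
    return ["".join(p) for p in permutations(AMINO_ACIDS, n)]


def feature_vector(profile, n=2):
    """Fixed-dimension vector of Top-n-gram occurrence counts."""
    counts = Counter(top_n_grams(profile, n))
    return [counts[gram] for gram in vocabulary(n)]


# Toy three-position profile (illustrative values only).
example_profile = [
    {"A": 0.6, "G": 0.3, "S": 0.1},
    {"L": 0.5, "I": 0.4, "V": 0.1},
    {"A": 0.7, "G": 0.2, "T": 0.1},
]

if __name__ == "__main__":
    print(top_n_grams(example_profile, n=2))
    print(len(feature_vector(example_profile, n=2)))
```

    Under these assumptions every sequence maps to a vector of the same length regardless of sequence length, which is what lets the vectors be passed to a standard classifier.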

    Periostin Responds to Mechanical Stress and Tension by Activating the MTOR Signaling Pathway

    Current knowledge about Periostin biology has expanded from its recognized functions in embryogenesis and bone metabolism to its roles in tissue repair and remodeling and its clinical implications in cancer. Emerging evidence suggests that Periostin plays a critical role in the mechanism of wound healing; however, the paracrine effect of Periostin in epithelial cell biology is still poorly understood. We found that epithelial cells are capable of producing endogenous Periostin that, unlike in mesenchymal cells, cannot be secreted. Epithelial cells responded to Periostin paracrine stimuli by enhancing cellular migration and proliferation and by activating the mTOR signaling pathway. Interestingly, biomechanical stimulation of epithelial cells, which simulates tension forces that occur during initial steps of tissue healing, induced Periostin production and mTOR activation. The molecular association of Periostin and mTOR signaling was further dissected by administering rapamycin, a selective pharmacological inhibitor of mTOR, and by disruption of Raptor and Rictor scaffold proteins implicated in the regulation of mTORC1 and mTORC2 complex assembly. Both strategies resulted in ablation of Periostin-induced mitogenic and migratory activity. These results indicate that Periostin-induced epithelial migration and proliferation require mTOR signaling. Collectively, our findings identify Periostin as a mechanical stress-responsive molecule that is primarily secreted by fibroblasts during wound healing and expressed endogenously in epithelial cells, resulting in the control of cellular physiology through a mechanism mediated by the mTOR signaling cascade. This work was funded by the National Institutes of Health (NIH/NCI) P50-CA97248 (University of Michigan Head and Neck SPORE).

    RNA-directed epigenetic silencing of Periostin inhibits cell motility


    Solvability of the matrix inequality


    The Tumor Microenvironment and Immune Milieu of Cholangiocarcinoma

    The tumor microenvironment is a complex, multicellular functional compartment that, particularly when assembled as an abundant desmoplastic reaction, may profoundly affect the proliferative and invasive abilities of epithelial cancer cells. The tumor microenvironment comprises not only stromal cells, mainly cancer-associated fibroblasts, but also immune cells of both the innate and adaptive systems (tumor-associated macrophages, neutrophils, natural killer cells, and T and B lymphocytes), and endothelial cells. This results in an intricate web of mutual communications regulated by an extensively remodeled extracellular matrix, where the tumor cells are centrally engaged. In this regard, cholangiocarcinoma, in particular the intrahepatic variant, has become the focus of mounting interest in recent years, largely due to the lack of effective therapies despite its rising incidence and high mortality rates worldwide. On the other hand, recent studies in pancreatic cancer, which, similarly to cholangiocarcinoma, is highly desmoplastic, have argued against a tumor-promoting function of the tumor microenvironment. In this review, we will discuss recent developments concerning the role of each cellular population and their multifaceted interplay with the malignant biliary epithelial counterpart. We ultimately hope to provide the working knowledge on how their manipulation may lead to a therapeutic gain in cholangiocarcinoma.