
    Application of regulatory sequence analysis and metabolic network analysis to the interpretation of gene expression data

    We present two complementary approaches for the interpretation of clusters of co-regulated genes, such as those obtained from DNA chips and related methods. Starting from a cluster of genes with similar expression profiles, two basic questions can be asked: 1. Which mechanism is responsible for the coordinated transcriptional response of the genes? This question is approached by extracting motifs that are shared between the upstream sequences of these genes. The motifs extracted are putative cis-acting regulatory elements. 2. What is the physiological significance, for the cell, of expressing these genes together? One way to answer this question is to search for potential metabolic pathways that could be catalyzed by the products of the genes. This can be done by selecting the genes from the cluster that code for enzymes, and trying to assemble the catalyzed reactions to form metabolic pathways. We present tools to answer these two questions, and we illustrate their use with selected examples in the yeast Saccharomyces cerevisiae. The tools are available on the web (http://ucmb.ulb.ac.be/bioinformatics/rsa-tools/; http://www.ebi.ac.uk/research/pfbp/; http://www.soi.city.ac.uk/~msch/)

    Semi-supervised prediction of protein interaction sentences exploiting semantically encoded metrics

    Protein-protein interaction (PPI) identification is an integral component of many biomedical research and database curation tools. Automation of this task through classification is one of the key goals of text mining (TM). However, labelled PPI corpora required to train classifiers are generally small. In order to overcome this sparsity in the training data, we propose a novel method of integrating corpora that do not contain relevance judgements. Our approach uses a semantic language model to gather word similarity from a large unlabelled corpus. This additional information is integrated into the sentence classification process using kernel transformations and has a re-weighting effect on the training features that leads to an 8% improvement in F-score over the baseline results. Furthermore, we discover that some words which are generally considered indicative of interactions are actually neutralised by this process

    Infinite-Order Percolation and Giant Fluctuations in a Protein Interaction Network

    We investigate a model protein interaction network whose links represent interactions between individual proteins. This network evolves by the functional duplication of proteins, supplemented by random link addition to account for mutations. When link addition is dominant, an infinite-order percolation transition arises as a function of the addition rate. In the opposite limit of high duplication rate, the network exhibits giant structural fluctuations in different realizations. For biologically-relevant growth rates, the node degree distribution has an algebraic tail with a peculiar rate dependence for the associated exponent.
    Comment: 4 pages, 2 figures, 2 column revtex format, to be submitted to PRL 1; reference added and minor rewording of the first paragraph; Title change and major reorganization (but no result changes) in response to referee comments; to be published in PR
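    The growth rule described in this abstract (duplication of a protein plus random link addition) could be simulated roughly as follows. This is only a sketch under stated assumptions: the abstract does not give the exact update rule, so the function name `grow_network` and the parameters `delta` (probability that a copied link is lost) and `beta` (expected number of random links per new node) are hypothetical choices for illustration, not the authors' definitions.

```python
import random


def grow_network(n_nodes, delta, beta, seed=0):
    """Grow an undirected graph by node duplication plus random link addition.

    Assumed rule (not taken from the abstract): each new node picks a random
    existing node, inherits each of its links with probability 1 - delta, and
    additionally links to every existing node with probability beta / N,
    where N is the current number of nodes.
    """
    rng = random.Random(seed)
    # Start from a single linked pair of nodes.
    adj = {0: {1}, 1: {0}}
    for new in range(2, n_nodes):
        target = rng.randrange(new)  # node to duplicate
        neighbours = set()
        # Duplication step: copy each link of the target, dropping it with
        # probability delta.
        for nb in adj[target]:
            if rng.random() >= delta:
                neighbours.add(nb)
        # Mutation step: random link addition to any existing node.
        for other in range(new):
            if rng.random() < beta / new:
                neighbours.add(other)
        adj[new] = set()
        for nb in neighbours:
            adj[new].add(nb)
            adj[nb].add(new)
    return adj


def degree_histogram(adj):
    """Return a dict mapping degree -> number of nodes with that degree."""
    hist = {}
    for nbrs in adj.values():
        hist[len(nbrs)] = hist.get(len(nbrs), 0) + 1
    return hist
```

    Running `degree_histogram(grow_network(10000, 0.5, 0.1))` for several seeds gives a quick way to compare realizations; the parameter values here are arbitrary.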

    Classification of protein interaction sentences via gaussian processes

    The increase in the availability of protein interaction studies in textual format, coupled with the demand for easier access to the key results, has led to a need for text mining solutions. In the text processing pipeline, classification is a key step for extraction of small sections of relevant text. Consequently, for the task of locating protein-protein interaction sentences, we examine the use of a classifier which has rarely been applied to text, Gaussian processes (GPs). GPs are a non-parametric probabilistic analogue to the more popular support vector machines (SVMs). We find that GPs outperform the SVM and naïve Bayes classifiers on binary sentence data, whilst showing equivalent performance on abstract and multiclass sentence corpora. In addition, the lack of the margin parameter, which requires costly tuning, along with the principled multiclass extensions enabled by the probabilistic framework make GPs an appealing alternative worthy of further adoption

    But what does that gene do?


    A role for central spindle proteins in cilia structure and function

    Cytokinesis and ciliogenesis are fundamental cellular processes that require strict coordination of microtubule organization and directed membrane trafficking. These processes have been intensely studied, but there has been little indication that regulatory machinery might be extensively shared between them. Here, we show that several central spindle/midbody proteins (PRC1, MKLP-1, INCENP, centriolin) also localize in specific patterns at the basal body complex in vertebrate ciliated epithelial cells. Moreover, bioinformatic comparisons of midbody and cilia proteomes reveal a highly significant degree of overlap. Finally, we used temperature-sensitive alleles of PRC1/spd-1 and MKLP-1/zen-4 in C. elegans to assess ciliary functions while bypassing these proteins' early role in cell division. These mutants displayed defects in both cilia function and cilia morphology. Together, these data suggest the conserved reuse of a surprisingly large number of proteins in the cytokinetic apparatus and in cilia

    Finding all common intervals of k permutations

    1 Introduction. Let $\Pi = (\pi_1, \ldots, \pi_k)$ be a family of $k$ permutations of $N = \{1, 2, \ldots, n\}$. A $k$-tuple of intervals of these permutations consisting of the same set of elements is called a common interval
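    The definition above can be illustrated with a brute-force sketch. It is not the algorithm of the paper, whose text is cut off here; it simply enumerates every interval of the first permutation and checks whether the same elements occupy consecutive positions in all the others. The function name `common_intervals` and the choice to skip single-element intervals are assumptions made for this example.

```python
def common_intervals(perms):
    """Return all common intervals (of size >= 2) of a family of permutations.

    perms is a sequence of k permutations of the same n elements. A set of
    elements is reported when it is contiguous in every permutation. This is
    a naive O(k * n^2) enumeration, intended only to illustrate the
    definition.
    """
    k = len(perms)
    n = len(perms[0])
    # pos[p][v] is the index of element v in permutation p.
    pos = [{v: i for i, v in enumerate(p)} for p in perms]
    first = perms[0]
    found = []
    for i in range(n):
        # Track the smallest and largest position of the current element
        # set in each permutation while the interval grows to the right.
        lo = [n] * k
        hi = [-1] * k
        for j in range(i, n):
            v = first[j]
            for p in range(k):
                idx = pos[p][v]
                lo[p] = min(lo[p], idx)
                hi[p] = max(hi[p], idx)
            size = j - i + 1
            # The set is contiguous in permutation p exactly when its
            # positions span as many slots as it has elements.
            if size >= 2 and all(hi[p] - lo[p] + 1 == size for p in range(k)):
                found.append(frozenset(first[i:j + 1]))
    return found
```

    For the two permutations `(1, 2, 3, 4)` and `(2, 1, 4, 3)`, the sets `{1, 2}`, `{3, 4}` and `{1, 2, 3, 4}` are contiguous in both, while `{2, 3}` is not.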