133 research outputs found

    Exact score distribution computation for ontological similarity searches

    BACKGROUND: Semantic similarity searches in ontologies are an important component of many bioinformatic algorithms, e.g., finding functionally related proteins with the Gene Ontology or phenotypically similar diseases with the Human Phenotype Ontology (HPO). We have recently shown that the performance of semantic similarity searches can be improved by ranking results according to the probability of obtaining a given score at random rather than by the scores themselves. However, to date, there are no algorithms for computing the exact distribution of semantic similarity scores, which is necessary for computing the exact P-value of a given score. RESULTS: In this paper we consider the exact computation of score distributions for similarity searches in ontologies, and introduce a simple null hypothesis which can be used to compute a P-value for the statistical significance of similarity scores. We concentrate on measures based on Resnik's definition of ontological similarity. A new algorithm is proposed that collapses subgraphs of the ontology graph and thereby allows fast score distribution computation. The new algorithm is several orders of magnitude faster than the naive approach, as we demonstrate by computing score distributions for similarity searches in the HPO. It is shown that exact P-value calculation improves clinical diagnosis using the HPO compared to approaches based on sampling. CONCLUSIONS: The new algorithm enables, for the first time, exact P-value calculation via exact score distribution computation for ontology similarity searches. The approach is applicable to any ontology for which the annotation-propagation rule holds and can improve any bioinformatic method that uses only the raw similarity scores. The algorithm was implemented in Java, supports any ontology in OBO format, and is available for non-commercial and academic use at https://compbio.charite.de/svn/hpo/trunk/src/tools/significance/
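
    To make the idea of an exact score distribution concrete, the following is a minimal sketch of the naive enumeration that the abstract contrasts its algorithm with; it is not the subgraph-collapsing algorithm itself. The toy ontology, the information-content values, and all function names are hypothetical and were invented for illustration. The null hypothesis assumed here is that a query is a uniformly random set of non-root terms of fixed size.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical toy ontology: term -> set of parent terms (a DAG rooted at "root").
PARENTS = {
    "root": set(),
    "a": {"root"},
    "b": {"root"},
    "c": {"a"},
    "d": {"a", "b"},
}

# Hypothetical information content (IC) per term, invented for illustration.
IC = {"root": 0, "a": 1, "b": 2, "c": 3, "d": 4}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    return max(IC[t] for t in ancestors(t1) & ancestors(t2))


def score(query, target):
    """Average, over query terms, of the best Resnik match in the target."""
    total = sum(max(resnik(q, t) for t in target) for q in query)
    return Fraction(total, len(query))


def exact_distribution(target, query_size):
    """Naively enumerate every query of the given size and tally the scores."""
    terms = sorted(t for t in PARENTS if t != "root")
    counts = Counter(score(q, target) for q in combinations(terms, query_size))
    n_queries = sum(counts.values())
    return {s: Fraction(c, n_queries) for s, c in counts.items()}


def p_value(observed, distribution):
    """Probability of a score at least as large as the observed one."""
    return sum(p for s, p in distribution.items() if s >= observed)
```

    The enumeration visits every possible query, so its cost grows combinatorially with ontology size and query size, which is why it is only practical on toy inputs like this one.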

    Risk of infection and adverse outcomes among pregnant working women in selected occupational groups: A study in the Danish National Birth Cohort

    BACKGROUND: Exposure to infectious pathogens is a frequent occupational hazard for women who work with patients, children, animals or animal products. The purpose of the present study is to investigate whether women working in occupations where exposure to infectious agents is common have a high risk of infections and adverse pregnancy outcomes. METHODS: We used data from the Danish National Birth Cohort, a population-based cohort study, and studied the risk of infection and adverse outcomes in pregnant women working with patients, with children, with food products or with animals. The regression analyses were adjusted for the following covariates: maternal age, parity, history of miscarriage, socio-occupational status, pre-pregnancy body mass index, smoking habit, and alcohol consumption. RESULTS: Pregnant women who worked with patients, children or food products had an excess risk of sick leave during pregnancy for more than three days. Most negative reproductive outcomes were not increased in these occupations, but the prevalence of congenital anomalies (CAs) was slightly higher in children of women who worked with patients. The prevalence of small for gestational age infants was higher among women who worked with food products. There was no association between occupation infections during pregnancy and the risk of reproductive failures in the exposed groups. However, the prevalence of CAs was slightly higher among children of women who suffered some infection during pregnancy, but the numbers were small. CONCLUSION: Despite preventive strategies, working in specific jobs during pregnancy may impose a higher risk of infections, and working in some of these occupations may impose a slightly higher risk of CAs in their offspring. Most other reproductive failures were not increased in these occupations.

    A new measure for functional similarity of gene products based on Gene Ontology

    BACKGROUND: Gene Ontology (GO) is a standard vocabulary of functional terms and allows for coherent annotation of gene products. These annotations provide a basis for new methods that compare gene products regarding their molecular function and biological role. RESULTS: We present a new method for comparing sets of GO terms and for assessing the functional similarity of gene products. The method relies on two semantic similarity measures: sim(Rel) and funSim. One measure (sim(Rel)) is applied in the comparison of the biological processes found in different groups of organisms. The other measure (funSim) is used to find functionally related gene products within the same or between different genomes. Results indicate that the method, in addition to being in good agreement with established sequence similarity approaches, also provides a means for the identification of functionally related proteins independent of evolutionary relationships. The method is also applied to estimating functional similarity between all proteins in Saccharomyces cerevisiae and to visualizing the molecular function space of yeast in a map of the functional space. A similar approach is used to visualize the functional relationships between protein families. CONCLUSION: The approach enables the comparison of the underlying molecular biology of different taxonomic groups and provides a new comparative genomics tool identifying functionally related gene products independent of homology. The proposed map of the functional space provides a new global view on the functional relationships between gene products or protein families.

    Impact Factor: outdated artefact or stepping-stone to journal certification?

    A review of Garfield's journal impact factor and its specific implementation as the Thomson Reuters Impact Factor reveals several weaknesses in this commonly-used indicator of journal standing. Key limitations include the mismatch between citing and cited documents, the deceptive display of three decimals that belies the real precision, and the absence of confidence intervals. These are minor issues that are easily amended and should be corrected, but more substantive improvements are needed. There are indications that the scientific community seeks and needs better certification of journal procedures to improve the quality of published science. Comprehensive certification of editorial and review procedures could help ensure adequate procedures to detect duplicate and fraudulent submissions. Comment: 25 pages, 12 figures, 6 tables

    Retroviral expression of a kinase-defective IGF-I receptor suppresses growth and causes apoptosis of CHO and U87 cells in-vivo

    BACKGROUND: Phosphatidylinositol-3,4,5-triphosphate (PtdInsP3) signaling is elevated in many tumors due to loss of the tumor suppressor PTEN, and leads to constitutive activation of Akt, a kinase involved in cell survival. Reintroduction of PTEN in cells suppresses transformation and tumorigenicity. While this approach works in vitro, it may prove difficult to achieve in vivo. In this study, we investigated whether inhibition of growth factor signaling would have the same effect as re-expression of PTEN. METHODS: Dominant negative IGF-I receptors were expressed in CHO and U87 cells by retroviral infection. Cell proliferation, transformation and tumor formation in athymic nude mice were assessed. RESULTS: Inhibition of IGF-IR signaling in a CHO cell model system by expression of a kinase-defective IGF-IR impairs proliferation, transformation and tumor growth. Reduction in tumor growth is associated with an increase in apoptosis in vivo. The dominant-negative IGF-IRs also prevented growth of U87 PTEN-negative glioblastoma cells when injected into nude mice. Injection of an IGF-IR blocking antibody αIR3 into mice harboring parental U87 tumors inhibits tumor growth and increases apoptosis. CONCLUSION: Inhibition of an upstream growth factor signal prevents tumor growth of the U87 PTEN-deficient glioma to the same extent as re-introduction of PTEN. This result suggests that growth factor receptor inhibition may be an effective alternative therapy for PTEN-deficient tumors.

    Inferring predominant pathways in cellular models of breast cancer using limited sample proteomic profiling

    BACKGROUND: Molecularly targeted drugs inhibit aberrant signaling within oncogenic pathways. Identifying the predominant pathways at work within a tumor is a key step towards tailoring therapies to the patient. Clinical samples pose significant challenges for proteomic profiling, an attractive approach for identifying predominant pathways. The objective of this study was to determine if information obtained from a limited sample (i.e., a single gel replicate) can provide insight into the predominant pathways in two well-characterized breast cancer models. METHODS: A comparative proteomic analysis of total cell lysates was obtained from two cellular models of breast cancer, BT474 (HER2+/ER+) and SKBR3 (HER2+/ER-), using two-dimensional electrophoresis and MALDI-TOF mass spectrometry. Protein interaction networks and canonical pathways were extracted from the Ingenuity Pathway Knowledgebase (IPK) based on association with the observed pattern of differentially expressed proteins. RESULTS: Of the 304 spots that were picked, 167 protein spots were identified. A threshold of 1.5-fold was used to select 62 proteins used in the analysis. IPK analysis suggested that metabolic pathways were highly associated with protein expression in SKBR3 cells while cell motility pathways were highly associated with BT474 cells. Inferred protein networks were confirmed by observing an up-regulation of IGF-1R and profilin in BT474 and up-regulation of Ras and enolase in SKBR3 using western blot. CONCLUSION: When interpreted in the context of prior information, our results suggest that the overall patterns of differential protein expression obtained from limited samples can still aid in clinical decision making by providing an estimate of the predominant pathways that underpin cellular phenotype.

    Metrics for GO based protein semantic similarity: a systematic evaluation

    BACKGROUND: Several semantic similarity measures have been applied to gene products annotated with Gene Ontology terms, providing a basis for their functional comparison. However, it is still unclear which is the best approach to semantic similarity in this context, since there is no conclusive evaluation of the various measures. Another issue is whether or not electronic annotations should be used in semantic similarity calculations. RESULTS: We conducted a systematic evaluation of GO-based semantic similarity measures using the relationship with sequence similarity as a means to quantify their performance, and assessed the influence of electronic annotations by testing the measures in the presence and absence of these annotations. We verified that the relationship between semantic and sequence similarity is not linear, but can be well approximated by a rescaled Normal cumulative distribution function. Given that the majority of the semantic similarity measures capture an identical behaviour, but differ in resolution, we used the latter as the main criterion of evaluation. CONCLUSIONS: This work has provided a basis for the comparison of several semantic similarity measures, and can aid researchers in choosing the most adequate measure for their work. We have found that the hybrid simGIC was the measure with the best overall performance, followed by Resnik's measure using a best-match average combination approach. We have also found that the average and maximum combination approaches are problematic since both are inherently influenced by the number of terms being combined. We suspect that there may be a direct influence of data circularity in the behaviour of the results including electronic annotations, as a result of functional inference from sequence similarity.
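
    The two best-performing measures named in this abstract can be illustrated with a short sketch. It assumes the commonly used definitions: simGIC as the IC-weighted Jaccard index of the ancestor-closed annotation sets, and the best-match average as the mean of the row-wise and column-wise best Resnik matches. The toy ontology, the IC values, and the function names are hypothetical and are not taken from the paper.

```python
from fractions import Fraction

# Hypothetical toy ontology: term -> set of parent terms (a DAG rooted at "root").
PARENTS = {
    "root": set(),
    "a": {"root"},
    "b": {"root"},
    "c": {"a"},
    "d": {"a", "b"},
}

# Hypothetical information content (IC) per term, invented for illustration.
IC = {"root": 0, "a": 1, "b": 2, "c": 3, "d": 4}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def closure(terms):
    """Ancestor-closed set: the given terms plus every ancestor of each."""
    closed = set()
    for term in terms:
        closed |= ancestors(term)
    return closed


def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    return max(IC[t] for t in ancestors(t1) & ancestors(t2))


def sim_gic(terms_a, terms_b):
    """IC-weighted Jaccard index of the two ancestor-closed term sets."""
    closed_a, closed_b = closure(terms_a), closure(terms_b)
    shared = sum(IC[t] for t in closed_a & closed_b)
    combined = sum(IC[t] for t in closed_a | closed_b)
    return Fraction(shared, combined) if combined else Fraction(0)


def best_match_average(terms_a, terms_b):
    """Mean of the average best match in each direction (Resnik pairwise)."""
    rows = Fraction(
        sum(max(resnik(a, b) for b in terms_b) for a in terms_a), len(terms_a)
    )
    cols = Fraction(
        sum(max(resnik(a, b) for a in terms_a) for b in terms_b), len(terms_b)
    )
    return (rows + cols) / 2
```

    Because the best-match average takes one best match per term before averaging, it is less sensitive to the number of terms being combined than a plain average or maximum over all pairs, which is the concern the abstract raises about those two approaches.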

    Incorporating functional inter-relationships into protein function prediction algorithms

    BACKGROUND: Functional classification schemes (e.g. the Gene Ontology) that serve as the basis for annotation efforts in several organisms are often the source of gold standard information for computational efforts at supervised protein function prediction. While successful function prediction algorithms have been developed, few previous efforts have utilized more than the protein-to-functional class label information provided by such knowledge bases. For instance, the Gene Ontology not only captures protein annotations to a set of functional classes, but it also arranges these classes in a DAG-based hierarchy that captures rich inter-relationships between different classes. These inter-relationships present both opportunities, such as the potential for additional training examples for small classes from larger related classes, and challenges, such as a harder to learn distinction between similar GO terms, for standard classification-based approaches. RESULTS: We propose a method to enhance the performance of classification-based protein function prediction algorithms by addressing the issue of using these inter-relationships between the functional classes that constitute functional classification schemes. Using a standard measure for evaluating the semantic similarity between nodes in an ontology, we quantify and incorporate these inter-relationships into the k-nearest neighbor classifier. We present experiments on several large genomic data sets, each of which is used for the modeling and prediction of over a hundred classes from the GO Biological Process ontology. The results show that this incorporation produces more accurate predictions for a large number of the functional classes considered, and also that the classes that benefit most from this approach are those containing the fewest members. In addition, we show how our proposed framework can be used for integrating information from the entire GO hierarchy for improving the accuracy of predictions made over a set of base classes. Finally, we provide qualitative and quantitative evidence that this incorporation of functional inter-relationships enables the discovery of interesting biology in the form of novel functional annotations for several yeast proteins, such as Sna4, Rtn1 and Lin1. CONCLUSION: We implemented and evaluated a methodology for incorporating inter-relationships between functional classes into a standard classification-based protein function prediction algorithm. Our results show that this incorporation can help improve the accuracy of such algorithms, and help uncover novel biology in the form of previously unknown functional annotations. The complete source code, a sample data set and the additional files for this paper are available free of charge for non-commercial use at http://www.cs.umn.edu/vk/gaurav/functionalsimilarity/.

    IGF-I induced genes in stromal fibroblasts predict the clinical outcome of breast and lung cancer patients

    BACKGROUND: Insulin-like growth factor-1 (IGF-I) signalling is important for cancer initiation and progression. Given the emerging evidence for the role of the stroma in these processes, we aimed to characterize the effects of IGF-I on cancer cells and stromal cells separately. METHODS: We used an ex vivo culture model and measured gene expression changes after IGF-I stimulation with cDNA microarrays. In vitro data were correlated with in vivo findings by comparing the results with published expression datasets on human cancer biopsies. RESULTS: Upon stimulation with IGF-I, breast cancer cells and stromal fibroblasts show some common and some distinct response patterns. Among the up-regulated genes in the stromal fibroblasts we observed a significant enrichment in proliferation-associated genes. The expression of the IGF-I induced genes was coherent and provided a basis for the segregation of the patients into two groups. Patients with tumours with highly expressed IGF-I induced genes had a significantly lower survival rate than patients whose tumours showed lower levels of IGF-I induced gene expression (P = 0.029 for the Norway/Stanford dataset and P = 7.96e-09 for the NKI dataset). Furthermore, based on an IGF-I induced gene expression signature derived from primary lung fibroblasts, a separation of prognostically different lung cancers was possible (P = 0.007 for the Bhattacharjee dataset and P = 0.008 for the Garber dataset). CONCLUSION: Expression patterns of genes induced by IGF-I in primary breast and lung fibroblasts accurately predict outcomes in breast and lung cancer patients. Furthermore, these IGF-I induced gene signatures derived from stromal fibroblasts might be promising predictors for the response to IGF-I targeted therapies. See the related commentary by Werner and Bruchim: http://www.biomedcentral.com/1741-7015/8/2