77 research outputs found

    Scientific knowledge in the age of computation

    With increasing publication and data production, scientific knowledge presents not simply an achievement but also a challenge. Scientific publications and data are increasingly treated as resources that need to be digitally ‘managed.’ This gives rise to scientific Knowledge Management (KM): second-order scientific work aiming to systematically collect, take care of and mobilise first-hand disciplinary knowledge and data in order to provide new first-order scientific knowledge. We follow the work of Leonelli, Efstathiou and Hislop in our analysis of the use of KM in semantic systems biology. Through an empirical philosophical account of KM-enabled biological research, we argue that KM helps produce new first-order biological knowledge that did not exist before, and which could not have been produced by traditional means. KM work is enabled by conceiving of ‘knowledge’ as an object for computational science: as explicated in the text of biological articles and computable via appropriate data and metadata. However, these founded knowledge concepts enabling computational KM risk focusing on only computationally tractable data as knowledge, underestimating practice-based knowing and its significance in ensuring the validity of ‘manageable’ knowledge as knowledge.

    A Comprehensive Analysis of the Structure-Function Relationship in Proteins Based on Local Structure Similarity

    BACKGROUND: Sequence similarity to characterized proteins provides testable functional hypotheses for less than 50% of the proteins identified by genome sequencing projects. With structural genomics it is believed that structural similarities may give functional hypotheses for many of the remaining proteins. METHODOLOGY/PRINCIPAL FINDINGS: We provide a systematic analysis of the structure-function relationship in proteins using the novel concept of local descriptors of protein structure. A local descriptor is a small substructure of a protein which includes both short- and long-range interactions. We employ a library of commonly reoccurring local descriptors general enough to assemble most existing protein structures. We then model the relationship between these local shapes and Gene Ontology using rule-based learning. Our IF-THEN rule model offers legible, high resolution descriptions that combine local substructures and is able to discriminate functions even for functionally versatile folds such as the frequently occurring TIM barrel and Rossmann fold. By evaluating the predictive performance of the model, we provide a comprehensive quantification of the structure-function relationship based only on local structure similarity. Our findings are, among others, that conserved structure is a stronger prerequisite for enzymatic activity than for binding specificity, and that structure-based predictions complement sequence-based predictions. The model is capable of generating correct hypotheses, as confirmed by a literature study, even when no significant sequence similarity to characterized proteins exists. CONCLUSIONS/SIGNIFICANCE: Our approach offers a new and complete description and quantification of the structure-function relationship in proteins. By demonstrating how our predictions offer higher sensitivity than using global structure, and complement the use of sequence, we show that the presented ideas could advance the development of meta-servers in function prediction.

    The gastrin and cholecystokinin receptors mediated signaling network: a scaffold for data analysis and new hypotheses on regulatory mechanisms

    BACKGROUND: The gastrointestinal peptide hormones cholecystokinin and gastrin exert their biological functions via cholecystokinin receptors CCK1R and CCK2R respectively. Gastrin, a central regulator of gastric acid secretion, is involved in growth and differentiation of gastric and colonic mucosa, and there is evidence that it is pro-carcinogenic. Cholecystokinin is implicated in digestion, appetite control and body weight regulation, and may play a role in several digestive disorders. RESULTS: We performed a detailed analysis of the literature reporting experimental evidence on signaling pathways triggered by CCK1R and CCK2R, in order to create a comprehensive map of gastrin and cholecystokinin-mediated intracellular signaling cascades. The resulting signaling map captures 413 reactions involving 530 molecular species, and incorporates the currently available knowledge into one integrated signaling network. The decomposition of the signaling map into sub-networks revealed 18 modules that represent higher-level structures of the signaling map. These modules allow a more compact mapping of intracellular signaling reactions to known cell behavioral outcomes such as proliferation, migration and apoptosis. The integration of large-scale protein-protein interaction data to this literature-based signaling map in combination with topological analyses allowed us to identify 70 proteins able to increase the compactness of the map. These proteins represent experimentally testable hypotheses for gaining new knowledge on gastrin- and cholecystokinin receptor signaling. The CCKR map is freely available both in a downloadable, machine-readable SBML-compatible format and as a web resource through PAYAO (http://sblab.celldesigner.org:18080/Payao11/bin/). CONCLUSION: We have demonstrated how a literature-based CCKR signaling map together with its protein interaction extensions can be analyzed to generate new hypotheses on molecular mechanisms involved in gastrin- and cholecystokinin-mediated regulation of cellular processes.

    Functional studies on transfected cell microarray analysed by linear regression modelling

    Transfected cell microarray is a promising method for accelerating the functional exploration of the genome, giving information about protein function in the living cell. The microarrays consist of clusters of cells (spots) overexpressing or silencing a particular gene product. The phenotypic consequences of such perturbations can then be detected using cell-based assays. The focus in the present study was to establish an experimental design and a robust analysis approach for fluorescence intensity data, and to address the use of replicates for studying regulation of gene expression with varying complexity and effect size. Our analysis pipeline includes measurement of fluorescence intensities, normalization strategies using negative control spots and internal control plasmids, and linear regression (ANOVA) modelling for estimating biological effects and calculating P-values for comparisons of interest. Our results show the potential of transfected cell microarrays in studying complex regulation of gene expression by enabling measurement of biological responses in cells with overexpression and downregulation of specific gene products, combined with the possibility of assaying the effects of external stimuli. Simulation experiments show that transfected cell microarrays can be used to reliably detect even quantitatively minor biological effects by including several technical and experimental replicates.
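
    The pipeline described in this abstract (normalize against negative control spots, then estimate an effect with a linear model) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the log2 transform, and the two-group design are not taken from the study, which may use a richer ANOVA model. For two groups, the ordinary least-squares slope on a 0/1 treatment indicator equals the difference in group means, so the sketch computes that difference and its t statistic directly.

```python
from math import log2, sqrt
from statistics import mean, variance


def normalize(spot_intensities, negative_controls):
    """Log2-transform spot intensities and subtract the mean log2
    intensity of the negative control spots (hypothetical scheme)."""
    baseline = mean(log2(x) for x in negative_controls)
    return [log2(x) - baseline for x in spot_intensities]


def two_group_effect(control, treated):
    """Fit y = b0 + b1 * is_treated by least squares.

    With a single 0/1 predictor, b1 is the difference in group means.
    Returns (b1, t statistic) using the pooled residual variance.
    Each group needs at least two replicates and some spread,
    otherwise the standard error is undefined or zero.
    """
    n0, n1 = len(control), len(treated)
    effect = mean(treated) - mean(control)
    pooled_var = ((n0 - 1) * variance(control)
                  + (n1 - 1) * variance(treated)) / (n0 + n1 - 2)
    std_err = sqrt(pooled_var * (1 / n0 + 1 / n1))
    return effect, effect / std_err
```

    A P-value would follow from comparing the t statistic with a t distribution on n0 + n1 - 2 degrees of freedom; that step is left out here because it needs a statistics library beyond the Python standard library.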

    GeneTools – application for functional annotation and statistical hypothesis testing

    BACKGROUND: Modern biology has shifted from "one gene" approaches to methods for genomic-scale analysis like microarray technology, which allow simultaneous measurement of thousands of genes. This has created a need for tools facilitating interpretation of biological data in "batch" mode. However, such tools often leave the investigator with large volumes of apparently unorganized information. To meet this interpretation challenge, gene-set, or cluster testing has become a popular analytical tool. Many gene-set testing methods and software packages are now available, most of which use a variety of statistical tests to assess the genes in a set for biological information. However, the field is still evolving, and there is a great need for "integrated" solutions. RESULTS: GeneTools is a web-service providing access to a database that brings together information from a broad range of resources. The annotation data are updated weekly, so that users get the most recently available data. Data submitted by the user are stored in the database, where they can easily be updated, shared between users and exported in various formats. GeneTools provides three different tools: i) NMC Annotation Tool, which offers annotations from several databases like UniGene, Entrez Gene, SwissProt and GeneOntology, in both single- and batch search mode. ii) GO Annotator Tool, where users can add new gene ontology (GO) annotations to genes of interest. These user defined GO annotations can be used in further analysis or exported for public distribution. iii) eGOn, a tool for visualization and statistical hypothesis testing of GO category representation. As the first GO tool, eGOn supports hypothesis testing for three different situations (master-target situation, mutually exclusive target-target situation and intersecting target-target situation). An important additional function is an evidence-code filter that allows users to select the GO annotations for the analysis. CONCLUSION: GeneTools is the first "all in one" annotation tool, providing users with a rapid extraction of highly relevant gene annotation data for e.g. thousands of genes or clones at once. It allows a user to define and archive new GO annotations and it supports hypothesis testing related to GO category representations. GeneTools is freely available through www.genetools.n
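
    To make the master-target situation concrete: when a target gene list is a subset of a master list, one common way to test whether a GO category is over-represented in the target is a one-sided hypergeometric test. The sketch below illustrates that general technique only. The abstract does not state which test eGOn applies, and the function name and arguments are hypothetical.

```python
from math import comb


def go_overrepresentation_pvalue(k, K, n, N):
    """One-sided hypergeometric P-value, P(X >= k).

    N: genes in the master list
    K: master-list genes annotated to the GO category
    n: genes in the target list (a subset of the master list)
    k: target-list genes annotated to the GO category
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probabilities of seeing k, k+1, ..., upper annotated genes
    # when n genes are drawn without replacement from the master list.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

    For example, `go_overrepresentation_pvalue(5, 5, 5, 10)` asks how likely it is that all five target genes carry the annotation when five of ten master genes do. The two target-target situations named in the abstract need different tests and are not covered by this sketch.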

    Platelet activating factor stimulates arachidonic acid release in differentiated keratinocytes via arachidonyl non-selective phospholipase A2

    Platelet activating factor (PAF, 1-O-alkyl-2-acetyl-sn-glycero-3-phosphocholine) is known to be present in excess in psoriatic skin, but its exact role is uncertain. In the present study we demonstrate for the first time the role of group VI PLA2 in PAF-induced arachidonic acid release in highly differentiated human keratinocytes. The group IVα PLA2 also participates in the release, while secretory PLA2s play a minor role. Two anti-inflammatory synthetic fatty acids, tetradecylthioacetic acid and tetradecylselenoacetic acid, are shown to interfere with signalling events upstream of group IVα PLA2 activation. In summary, our major novel finding is the involvement of the arachidonyl non-selective group VI PLA2 in PAF-induced inflammatory responses.

    Finding gene regulatory network candidates using the gene expression knowledge base

    BACKGROUND: Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of ‘omics’ data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. RESULTS: We have developed the Gene eXpression Knowledge Base (GeXKB), a Semantic Web technology-based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. CONCLUSIONS: Semantic Web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0386-y) contains supplementary material, which is available to authorized users.