75,679 research outputs found

    Representing and analysing molecular and cellular function in the computer

    Determining the biological function of a myriad of genes, and understanding how they interact to yield a living cell, is the major challenge of the post genome-sequencing era. The complexity of biological systems is such that this cannot be envisaged without the help of powerful computer systems capable of representing and analysing the intricate networks of physical and functional interactions between the different cellular components. In this review we try to provide the reader with an appreciation of where we stand in this regard. We discuss some of the inherent problems in describing the different facets of biological function, give an overview of how information on function is currently represented in the major biological databases, and describe different systems for organising and categorising the functions of gene products. In a second part, we present a new general data model, currently under development, which describes information on molecular function and cellular processes in a rigorous manner. The model is capable of representing a large variety of biochemical processes, including metabolic pathways, regulation of gene expression and signal transduction. It also incorporates taxonomies for categorising molecular entities, interactions and processes, and it offers means of viewing the information at different levels of resolution, and dealing with incomplete knowledge. The data model has been implemented in the database on protein function and cellular processes 'aMAZE' (http://www.ebi.ac.uk/research/pfbp/), which presently covers metabolic pathways and their regulation. Several tools for querying, displaying, and performing analyses on such pathways are briefly described in order to illustrate the practical applications enabled by the model

    ProtNN: Fast and Accurate Nearest Neighbor Protein Function Prediction based on Graph Embedding in Structural and Topological Space

    Studying the function of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown very large. Still, determining the function of a protein structure remains a difficult, costly, and time-consuming task. The difficulties are often due to the essential role of spatial and topological structures in the determination of protein functions in living cells. In this paper, we propose ProtNN, a novel approach for protein function prediction. Given an unannotated protein structure and a set of annotated proteins, ProtNN finds the nearest neighbor annotated structures based on protein-graph pairwise similarities. Given a query protein, ProtNN finds the nearest neighbor reference proteins based on a graph representation model and a pairwise similarity between the vector embeddings of the query and reference protein-graphs in structural and topological spaces. ProtNN assigns to the query protein the function with the highest number of votes across the set of k nearest neighbor reference proteins, where k is a user-defined parameter. Experimental evaluation demonstrates that ProtNN is able to accurately classify several datasets with an extremely fast runtime compared to state-of-the-art approaches. We further show that ProtNN is able to scale up to a whole PDB dataset in a single-process mode with no parallelization, with a runtime gain on the order of thousands-fold compared to state-of-the-art approaches.
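The voting step described in this abstract can be illustrated with a minimal sketch. It assumes that each protein-graph has already been embedded as a numeric vector and uses cosine similarity as the pairwise measure; the abstract does not say which similarity ProtNN uses, and the names `cosine_similarity` and `knn_vote` are hypothetical, not taken from the ProtNN software.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_vote(query, references, k=3):
    """Return the most frequent label among the k most similar references.

    `references` is a list of (embedding, function_label) pairs.
    Ties are broken in favour of the label seen first, i.e. the nearer one.
    """
    ranked = sorted(
        references,
        key=lambda ref: cosine_similarity(query, ref[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy reference set with made-up embeddings and labels.
references = [
    ([1.0, 0.0], "kinase"),
    ([0.9, 0.1], "kinase"),
    ([0.0, 1.0], "transporter"),
    ([0.1, 0.9], "transporter"),
]
predicted = knn_vote([1.0, 0.05], references, k=3)
```

Here `k` plays the role of the user-defined parameter mentioned in the abstract; the embeddings and labels are invented purely for illustration.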

    Lipid storage and autophagy in melanoma cancer cells

    Cancer stem cells (CSC) represent a key cellular subpopulation controlling biological features such as cancer progression in all cancer types. By using melanospheres established from human melanoma patients, we compared less differentiated melanosphere-derived CSC to differentiating melanosphere-derived cells. Increased lipid uptake was found in melanosphere-derived CSC vs. differentiating melanosphere-derived cells, paralleled by strong expression of the lipogenic factors Sterol Regulatory Element-Binding Protein-1 (SREBP-1) and Peroxisome Proliferator-Activated Receptor-γ (PPAR-γ). An inverse relation between the lipid-storing phenotype and autophagy was also found, since microtubule-associated protein 1A/1B-Light Chain 3 (LC3) lipidation is reduced in melanosphere-derived CSC. To investigate upstream autophagy regulators, Phospho-AMP activated Protein Kinase (P-AMPK) and Phospho-mammalian Target of Rapamycin (P-mTOR) were analyzed; lower P-AMPK and higher P-mTOR expression were found in melanosphere-derived CSC, thus explaining, at least in part, their lower autophagic activity. In addition, co-localization of LC3-stained autophagosome spots and perilipin-stained lipid droplets was demonstrated mainly in differentiating melanosphere-derived cells, further supporting the role of autophagy in lipid droplet clearance. The present manuscript demonstrates an inverse relationship between the lipid-storing phenotype and melanoma stem cell differentiation, providing novel indications involving autophagy in melanoma stem cell biology.

    Computational analysis of a plant receptor interaction network

    Master's thesis in Bioinformatics and Computational Biology. In all organisms, complex protein-protein interaction (PPI) networks control major biological functions, yet studying their structural features presents a major analytical challenge. In plants, leucine-rich-repeat receptor kinases (LRR-RKs) are key in sensing and transmitting non-self as well as self signals from the cell surface. As such, LRR-RKs have both developmental and immune functions that allow plants to make the most of their environments. In Arabidopsis thaliana, the model organism of plant molecular biology, most LRR-RKs are still biochemically and genetically uncharacterized receptors. To address this, an LRR-based Cell Surface Interaction (CSI LRR) network was obtained in 2018: a protein-protein interaction network of the extracellular domains of 170 LRR-RKs that contains 567 bidirectional interactions. Several network analyses have been performed with CSI LRR. However, these analyses have so far not considered the spatial and temporal expression of its proteins, nor has the role of extracellular domain (ECD) size in the network structure been characterized in detail. The objective of the present work is therefore to continue with more in-depth analyses of the CSI LRR network, which should provide important insights that will facilitate the functional characterization of LRR-RKs. The first aim of this work is to test the fit of the CSI LRR network to a scale-free topology. To accomplish that, the degree distribution of the CSI LRR network was compared with the degree distributions of the scale-free and random network models. Additionally, three network attack algorithms were implemented and applied to these two network models and the CSI LRR network to compare their behavior. However, since the CSI LRR interaction data come from an in vitro screening, there is no direct evidence that its protein-protein interactions occur inside plant cells. To gain insight into how the network composition changes depending on transcriptional regulation, the CSI LRR interaction data were integrated with four different RNA-Seq datasets related to the biological functions of the network. A Python script was written to automate this task. Furthermore, the role of the LRR-RKs in the network structure was evaluated according to the size of their extracellular domain (large or small). For that, centrality parameters were measured and size-targeted attacks were performed. Finally, gene regulatory information was integrated into CSI LRR to classify the network proteins according to the function of the transcription factors that regulate their expression. The results show that CSI LRR fits a power-law degree distribution and approximates a scale-free topology. Moreover, CSI LRR displays high resistance to random attacks and reduced resistance to hub/bottleneck-directed attacks, similarly to the scale-free network model. Also, the integration of CSI LRR interaction data and RNA-Seq data suggests that transcriptional regulation of the network is more relevant for developmental programs than for defense responses. Another result was that LRR-RKs with a small ECD play a major role in maintaining the integrity of CSI LRR. Lastly, it was hypothesized that integrating CSI LRR interaction data with predicted gene regulatory networks could shed light on the functioning of growth-immunity signaling crosstalk.
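The contrast between random and hub-directed attacks mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the network is an undirected adjacency dictionary without self-loops, robustness is measured as the size of the largest connected component after each removal, and all function names are hypothetical. The abstract mentions three attack algorithms; only two simple strategies are shown here.

```python
import random


def largest_component_size(adj):
    """Size of the largest connected component (iterative depth-first search)."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


def attack(adj, n_removals, pick):
    """Remove nodes one at a time; return the largest-component size after each step."""
    adj = {node: set(nbs) for node, nbs in adj.items()}  # work on a copy
    sizes = [largest_component_size(adj)]
    for _ in range(min(n_removals, len(adj))):
        victim = pick(adj)
        for neighbour in adj.pop(victim):
            if neighbour in adj:
                adj[neighbour].discard(victim)
        sizes.append(largest_component_size(adj))
    return sizes


def pick_hub(adj):
    """Hub-directed strategy: choose the node with the highest current degree."""
    return max(adj, key=lambda node: len(adj[node]))


def make_random_picker(seed=0):
    """Random strategy: choose a uniformly random remaining node (seeded)."""
    rng = random.Random(seed)
    return lambda adj: rng.choice(sorted(adj))


# Toy star network: one hub connected to four leaves.
star = {"h": {"a", "b", "c", "d"}, "a": {"h"}, "b": {"h"}, "c": {"h"}, "d": {"h"}}
hub_curve = attack(star, 1, pick_hub)
random_curve = attack(star, 1, make_random_picker(seed=0))
```

On this toy star network, removing the hub fragments the graph into isolated leaves, whereas removing a randomly chosen node usually removes a leaf and leaves the rest connected. This is the qualitative behavior the abstract reports for scale-free-like networks.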

    Open source bioimage informatics for cell biology

    Significant technical advances in imaging, molecular biology and genomics have fueled a revolution in cell biology, in that the molecular and structural processes of the cell are now visualized and measured routinely. Driving much of this recent development has been the advent of computational tools for the acquisition, visualization, analysis and dissemination of these datasets. These tools collectively make up a new subfield of computational biology called bioimage informatics, which is facilitated by open source approaches. We discuss why open source tools for image informatics in cell biology are needed, describe some of the key general attributes that make an open source imaging application successful, and point to opportunities for further operability that should greatly accelerate future cell biology discovery.

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is strongly motivated by the need to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space built from the genomic data of Plasmodium falciparum, but also using the millions of genomic data records from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journa

    Complex Systems Science: Dreams of Universality, Reality of Interdisciplinarity

    Using a large database (~215,000 records) of relevant articles, we empirically study the "complex systems" field and its claims to find universal principles applying to systems in general. The study of references shared by the papers allows us to obtain a global point of view on the structure of this highly interdisciplinary field. We show that its overall coherence does not arise from a universal theory but instead from computational techniques and fruitful adaptations of the idea of self-organization to specific systems. We also find that communication between different disciplines goes through specific "trading zones", i.e., sub-communities that create an interface around specific tools (a DNA microchip) or concepts (a network). Comment: Journal of the American Society for Information Science and Technology (2012) 10.1002/asi.2264

    TumorML: Concept and requirements of an in silico cancer modelling markup language

    This paper describes the initial groundwork carried out as part of the European Commission funded Transatlantic Tumor Model Repositories project, to develop a new markup language for computational cancer modelling, TumorML. In this paper we describe the motivations for such a language, arguing that current state-of-the-art biomodelling languages are not suited to the cancer modelling domain. We go on to describe the work that needs to be done to develop TumorML, the conceptual design, and a description of what existing markup languages will be used to compose the language specification