A search engine to identify pathway genes from expression data on multiple organisms
Background: The completion of several genome projects showed that most genes have not yet been characterized, especially in multicellular organisms. Although most genes have unknown functions, a large collection of data is available describing their transcriptional activities under many different experimental conditions. In many cases, the coregulation of a set of genes across a set of conditions can be used to infer roles for genes of unknown function. Results: We developed a search engine, the Multiple-Species Gene Recommender (MSGR), which scans gene expression datasets from multiple organisms to identify genes that participate in a genetic pathway. The MSGR takes a query consisting of a list of genes that function together in a genetic pathway from one of six organisms: Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Arabidopsis thaliana, and Helicobacter pylori. Using a probabilistic method to merge searches, the MSGR identifies genes that are significantly coregulated with the query genes in one or more of those organisms. The MSGR achieves its highest accuracy for many human pathways when searches are combined across species. We describe specific examples in which new genes were found to be involved in a neuromuscular signaling pathway and a cell-adhesion pathway. Conclusion: The search engine can scan large collections of gene expression data for new genes that are significantly coregulated with a pathway of interest. By integrating searches across organisms, the MSGR can identify pathway members whose coregulation is either ancient or newly evolved.
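As a rough illustration of what "coregulated with the query genes" can mean, the sketch below ranks candidate genes by their mean Pearson correlation with a set of query genes in a single dataset. This is a simplified stand-in, not the MSGR's probabilistic method, and it does not merge searches across species. The gene names, the expression values, and the function names `pearson` and `rank_candidates` are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A flat profile has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0

def rank_candidates(expression, query):
    """Rank non-query genes by mean correlation with the query genes."""
    scores = {}
    for gene, profile in expression.items():
        if gene in query:
            continue
        total = sum(pearson(profile, expression[q]) for q in query)
        scores[gene] = total / len(query)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: gene -> expression level under four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.1, 2.1, 2.9, 4.2],
    "geneD": [4.0, 3.0, 2.0, 1.0],
}
ranking = rank_candidates(expression, query={"geneA", "geneB"})
```

Here `geneC` follows the query profiles closely and ranks first, while `geneD` is anti-correlated and ranks last. A real search would also need a significance estimate, which this sketch omits.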
Metabolic and Chaperone Gene Loss Marks the Origin of Animals: Evidence for Hsp104 and Hsp78 Sharing Mitochondrial Clients
The evolution of animals involved acquisition of an emergent gene repertoire
for gastrulation. Whether loss of genes also co-evolved with this developmental
reprogramming has not yet been addressed. Here, we identify twenty-four genetic
functions that are retained in fungi and choanoflagellates but undetectable in
animals. These lost genes encode: (i) sixteen distinct biosynthetic functions;
(ii) the two ancestral eukaryotic ClpB disaggregases, Hsp78 and Hsp104, which
function in the mitochondria and cytosol, respectively; and (iii) six other
assorted functions. We present computational and experimental data that are
consistent with a joint function for the differentially localized ClpB
disaggregases, and with the possibility of a shared client/chaperone
relationship between the mitochondrial Fe/S homoaconitase encoded by the lost
LYS4 gene and the two ClpBs. Our analyses lead to the hypothesis that the
evolution of gastrulation-based multicellularity in animals led to efficient
extraction of nutrients from dietary sources, loss of natural selection for
maintenance of energetically expensive biosynthetic pathways, and subsequent
loss of their attendant ClpB chaperones.
Comment: This is a reformatted version of the recent official publication in
PLoS ONE (2015). This version differs substantially from the first three arXiv
versions. It uses a fixed-width font for DNA sequences, as was done
in the earlier arXiv versions but is missing from the official PLoS ONE
publication. The title has also been shortened slightly from the official
publication
BioCloud Search EnGene: Surfing Biological Data on the Cloud
The massive production and spread of biomedical data around the web introduces new challenges in identifying computational approaches that provide quality search and browsing of web resources. This paper presents BioCloud Search EnGene (BSE), a cloud application that facilitates searching and integration of the many layers of biological information offered by large-scale public genomic repositories. Building on the concept of a dataspace, BSE runs on top of a cloud platform that severely curtails issues associated with scalability and performance. Like popular online gene portals, BSE adopts a gene-centric approach: researchers can find the information they need through a simple "Google-like" query interface that accepts standard gene identifiers as keywords. We present the BSE architecture and functionality and discuss how our strategies contribute to successfully tackling big data problems in querying gene-based web resources. BSE is publicly available at: http://biocloud-unica.appspot.com/
ProteoClade: A taxonomic toolkit for multi-species and metaproteomic analysis
We present ProteoClade, a Python toolkit that performs taxa-specific peptide assignment, protein inference, and quantitation for multi-species proteomics experiments. ProteoClade scales to hundreds of millions of protein sequences, requires minimal computational resources, and is open source, multi-platform, and accessible to non-programmers. We demonstrate its utility for processing quantitative proteomic data derived from patient-derived xenografts, and show that its speed and scalability enable a novel de novo proteomic workflow for complex microbiota samples.
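To make "taxa-specific peptide assignment" concrete, the sketch below digests protein sequences in silico and keeps only peptides that occur in exactly one taxon. It does not use ProteoClade's API. The function names, the toy sequences, and the simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages) are assumptions made for illustration.

```python
import re
from collections import defaultdict

def tryptic_digest(sequence, min_len=5):
    """Split after K or R unless the next residue is P (simplified rule)."""
    peptides = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in peptides if len(p) >= min_len]

def taxon_specific_peptides(proteins_by_taxon, min_len=5):
    """Map each peptide found in exactly one taxon to that taxon."""
    peptide_taxa = defaultdict(set)
    for taxon, sequences in proteins_by_taxon.items():
        for seq in sequences:
            for pep in tryptic_digest(seq, min_len):
                peptide_taxa[pep].add(taxon)
    return {
        pep: next(iter(taxa))
        for pep, taxa in peptide_taxa.items()
        if len(taxa) == 1
    }

# Hypothetical sequences differing by one residue (E vs Q).
proteins_by_taxon = {
    "human": ["MSTAGKVLDEERSPEPTIDEK"],
    "mouse": ["MSTAGKVLDQERSPEPTIDEK"],
}
specific = taxon_specific_peptides(proteins_by_taxon)
```

In this toy case the peptides `MSTAGK` and `SPEPTIDEK` are shared and therefore uninformative, while `VLDEER` and `VLDQER` each identify a single taxon. This is the kind of distinction that matters when human and mouse proteins are mixed, as in a xenograft sample.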
A Semantic Problem Solving Environment for Integrative Parasite Research: Identification of Intervention Targets for Trypanosoma cruzi
Effective research in parasite biology requires analyzing experimental lab data in the context of constantly expanding public data resources. Integrating lab data with public resources is particularly difficult for biologists who may not possess the computational skills needed to acquire and process heterogeneous data stored at different locations. We therefore develop a semantic problem solving environment (SPSE) that allows parasitologists to query their lab data integrated with public resources using ontologies. An ontology specifies a common vocabulary and formal relationships among the terms that describe, in this case, an organism and the associated experimental data and processes. SPSE supports capturing and querying provenance information, which is metadata on the experimental processes and data recorded for reproducibility, and includes a visual query-processing tool for formulating complex queries without learning the query language syntax. We demonstrate the significance of SPSE in identifying gene knockout targets for T. cruzi. The overall goal of SPSE is to help researchers discover new or existing knowledge that is implicitly present in the data but not always easily detected. Results demonstrate improved usefulness of SPSE over existing lab systems and approaches, and support for complex query design that is otherwise difficult to achieve without knowledge of the query language syntax.
The BioGRID Interaction Database: 2011 update
The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein
interaction data from model organisms and humans
(http://www.thebiogrid.org). BioGRID currently holds 347 966
interactions (170 162 genetic, 177 804 protein) curated from both
high-throughput data sets and individual focused studies, as derived
from over 23 000 publications in the primary literature. Complete
coverage of the entire literature is maintained for budding yeast
(Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe)
and thale cress (Arabidopsis thaliana), and efforts to expand curation
across multiple metazoan species are underway. The BioGRID houses
48 831 human protein interactions that have been curated from 10 247
publications. Current curation drives are focused on particular areas
of biology to enable insights into conserved networks and pathways that
are relevant to human health. The BioGRID 3.0 web interface contains
new search and display features that enable rapid queries across
multiple data types and sources. An automated Interaction Management
System (IMS) is used to prioritize, coordinate and track curation
across international sites and projects. BioGRID provides interaction
data to several model organism databases, resources such as Entrez-Gene
and other interaction meta-databases. The entire BioGRID 3.0 data
collection may be downloaded in multiple file formats, including PSI MI
XML. Source code for BioGRID 3.0 is freely available without any
restrictions.
Web-based metabolic network visualization with a zooming user interface
Background: Displaying complex metabolic-map diagrams in Web browsers, and letting users interact with them to run queries and overlay expression data, is challenging. Description: We present the Cellular Overview, a Web-based metabolic-map diagram that the user can explore interactively. Its main characteristic is a zooming user interface that lets the user focus on the appropriate granularity of the network at will. Various search commands are available to visually highlight sets of reactions, pathways, enzymes, metabolites, and so on. Expression data from single or multiple experiments can be overlaid on the diagram, a capability we call the Omics Viewer. The application provides Web services to highlight the diagram and to invoke the Omics Viewer. It is written entirely in JavaScript for client browsers and connects to a Pathway Tools Web server to retrieve data and diagrams. It uses the OpenLayers library to display tiled diagrams. Conclusions: This online tool can display large and complex metabolic-map diagrams in a highly interactive manner. It is available as part of the Pathway Tools software, which powers multiple metabolic databases including Biocyc.org; the Cellular Overview is accessible under the Tools menu.
Blueprint: describing the complexity of metabolic regulation through the reconstruction of integrated metabolic and regulatory models
Doctoral thesis in Biomedical Engineering.
A metabolic model can predict the phenotype of an organism. However, these models can yield incorrect predictions, because some metabolic processes are controlled by regulatory mechanisms. Several methodologies have therefore been developed to improve metabolic models through the integration of regulatory networks. Still, reconstructing genome-scale regulatory and metabolic models for diverse organisms poses several challenges.
In this work, we propose the development of several tools for the reconstruction and analysis of genome-scale metabolic and regulatory models. First, we describe Biological networks constraint-based In Silico Optimization (BioISO), a new tool to assist the manual curation of metabolic models. BioISO uses a recursive relation algorithm to guide phenotype predictions. This tool can thus reduce the number of artifacts in metabolic models, lowering the chance of errors during the curation phase.
In the second part of this work, a repository of regulatory networks for prokaryotes was developed to support their integration into metabolic models. The Prokaryotic Transcriptional Regulatory Network Database (ProTReND) includes several tools to extract and process regulatory information from external resources. It contains a data integration system that converts scattered regulatory data into integrated regulatory networks. In addition, ProTReND provides an application that gives full access to the regulatory data.
Finally, a computational tool was developed in MEWpy to simulate and analyze regulatory and metabolic models. This tool can read a metabolic model and/or a regulatory network in several formats. The framework can build an integrated regulatory and metabolic model using the regulatory interactions and the gene-protein links encoded in the metabolic model and the regulatory network. In addition, it supports several phenotype prediction methods implemented specifically for the analysis of regulatory-metabolic models.
Genome-Scale Metabolic (GEM) models can predict the phenotypic behavior of organisms. However,
these models can lead to incorrect predictions, as certain metabolic processes are controlled by regulatory
mechanisms. Accordingly, many methodologies have been developed to extend the reconstruction and
analysis of GEM models via the integration of Transcriptional Regulatory Networks (TRNs). Nevertheless,
reconstructing integrated genome-scale regulatory and metabolic models for diverse
prokaryotes is still an open challenge.
In this work, we propose several tools to assist the reconstruction and analysis of regulatory and
metabolic models. We start by describing BioISO, a novel tool to assist the manual curation of GEM
models. BioISO uses a recursive relation-like algorithm and Flux Balance Analysis (FBA) to evaluate and
guide debugging of in silico phenotype predictions. Hence, this tool can reduce the number of artifacts in
GEM models, decreasing the burdens of model refinement and curation.
A state-of-the-art repository of TRNs for prokaryotes was implemented to support the reconstruction
and integration of TRNs into GEM models. The ProTReND repository comprises several tools to extract
and process regulatory information available in several resources. More importantly, this repository contains a data integration system to unify the regulatory data into standardized TRNs at the genome scale.
In addition, ProTReND contains a web application with full access to the regulatory data.
Finally, we have developed a new modeling framework to define, simulate and analyze GEnome-scale
Regulatory and Metabolic (GERM) models in MEWpy. The GERM model framework can read a GEM
model, as well as a TRN from different file formats. This framework assembles a GERM model using
the regulatory interactions and Genes-Proteins-Reactions (GPR) rules encoded into the GEM model and
TRN. In addition, this modeling framework supports several methods of phenotype prediction designed
for regulatory-metabolic models.
I would like to thank Fundação para a Ciência e Tecnologia for the Ph.D. studentship that I was awarded
(SFRH/BD/139198/2018)
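The abstract above describes assembling an integrated model from regulatory interactions and GPR rules. The sketch below is one minimal way to picture that link, and it is not MEWpy's API. Every name in it is hypothetical, and regulation is reduced to a single assumed rule: a gene is switched off when its repressor is active. In approaches of this kind, a reaction whose GPR rule evaluates to false is typically constrained to zero flux before phenotype simulation; that step is not shown.

```python
def gpr_active(rule, active_genes):
    """Evaluate a GPR rule against the set of expressed genes.

    A rule is either a gene name or a tuple ("and" | "or", subrule, ...).
    """
    if isinstance(rule, str):
        return rule in active_genes
    op, *subrules = rule
    results = (gpr_active(r, active_genes) for r in subrules)
    if op == "and":
        return all(results)
    if op == "or":
        return any(results)
    raise ValueError(f"unknown operator: {op!r}")

def apply_regulation(genes, repressor_of, active_regulators):
    """Keep only genes whose (hypothetical) repressor is not active."""
    return {g for g in genes if repressor_of.get(g) not in active_regulators}

# Hypothetical toy model.
gpr_rules = {
    "R1": ("and", "g1", "g2"),  # enzyme complex: both subunits required
    "R2": ("or", "g3", "g4"),   # isozymes: either gene suffices
}
repressor_of = {"g2": "tfX", "g3": "tfX"}  # gene -> its repressor
all_genes = {"g1", "g2", "g3", "g4"}

expressed = apply_regulation(all_genes, repressor_of, active_regulators={"tfX"})
reaction_state = {r: gpr_active(rule, expressed) for r, rule in gpr_rules.items()}
```

With `tfX` active, `g2` and `g3` are repressed. `R1` loses a required subunit and is switched off, while `R2` stays on through the isozyme `g4`.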
Global proteomics analysis of the response to starvation in C. elegans
Periodic starvation of animals induces large shifts in metabolism but may also influence many other cellular systems and can lead to adaptation to prolonged starvation conditions. To date, there is limited understanding of how starvation affects gene expression, particularly at the protein level. Here, we have used mass-spectrometry-based quantitative proteomics to identify global changes in the Caenorhabditis elegans proteome due to acute starvation of young adult animals. Measuring changes in the abundance of over 5,000 proteins, we show that acute starvation rapidly alters the levels of hundreds of proteins, many involved in central metabolic pathways, highlighting key regulatory responses. Surprisingly, we also detect changes in the abundance of chromatin-associated proteins, including specific linker histones, histone variants, and histone posttranslational modifications associated with the epigenetic control of gene expression. To maximize community access to these data, they are presented in an online searchable database, the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd/).