Final Report on MITRE Evaluations for the DARPA Big Mechanism Program
This report presents the evaluation approach developed for the DARPA Big
Mechanism program, which aimed at developing computer systems that will read
research papers, integrate the information into a computer model of cancer
mechanisms, and frame new hypotheses. We employed an iterative, incremental
approach to the evaluation of the three phases of the program. In Phase I, we
evaluated the ability of system and human teams to read-with-a-model to
capture mechanistic information from the biomedical literature, integrated with
information from expert-curated biological databases. In Phase II, we evaluated
the ability of systems to assemble fragments of information into a mechanistic
model. The Phase III evaluation focused on the ability of systems to provide
explanations of experimental observations based on models assembled (largely
automatically) by the Big Mechanism process. The evaluation for each phase
built on earlier evaluations and guided developers towards creating
capabilities for the new phase. The report describes our approach, including
innovations such as a reference set (a curated data set limited to major
findings of each paper) to assess the accuracy of systems in extracting
mechanistic findings in the absence of a gold standard, and a method to
evaluate model-based explanations of experimental data. Results of the
evaluation and supporting materials are included in the appendices.
Comment: 46 pages, 8 figures
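The reference-set idea described above can be illustrated with a minimal sketch. This is not the MITRE scoring procedure: the `Finding` triple, the normalization, and the metric names are assumptions made for illustration, and the example findings are placeholders rather than real biology. The point it shows is that a reference set limited to major findings supports a recall-style measure, while extractions outside the set cannot simply be counted as errors.

```python
from typing import Iterable, NamedTuple


class Finding(NamedTuple):
    """Hypothetical representation of one mechanistic finding."""
    subject: str
    relation: str
    obj: str


def normalize(finding: Finding) -> Finding:
    # Compare findings case-insensitively and ignore surrounding whitespace.
    return Finding(
        finding.subject.strip().upper(),
        finding.relation.strip().lower(),
        finding.obj.strip().upper(),
    )


def score_against_reference(extracted: Iterable[Finding],
                            reference: Iterable[Finding]) -> dict:
    """Score system output against a reference set of major findings.

    Unmatched extractions are reported separately instead of being treated
    as false positives, because the reference set is not exhaustive.
    """
    ext = {normalize(f) for f in extracted}
    ref = {normalize(f) for f in reference}
    matched = ext & ref
    recall = len(matched) / len(ref) if ref else 0.0
    return {
        "matched": len(matched),
        "reference_recall": recall,
        "unmatched_extractions": len(ext - ref),
    }


# Placeholder findings, for illustration only.
reference_set = [
    Finding("PROTEIN_A", "phosphorylates", "PROTEIN_B"),
    Finding("PROTEIN_C", "activates", "PROTEIN_D"),
]
system_output = [
    Finding("protein_a", "Phosphorylates", "protein_b"),
    Finding("PROTEIN_E", "activates", "PROTEIN_F"),
]
scores = score_against_reference(system_output, reference_set)
```

Here the unmatched extraction would need human review before it could be called wrong, which is why the sketch does not compute a precision value.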
Blueprint: describing the complexity of metabolic regulation through the reconstruction of integrated metabolic and regulatory models
Doctoral thesis in Biomedical Engineering.
A metabolic model can predict the phenotype of an organism. However, these models can yield incorrect predictions, because some metabolic processes are controlled by regulatory mechanisms. Several methodologies have therefore been developed to improve metabolic models through the integration of regulatory networks. Nevertheless, reconstructing genome-scale regulatory and metabolic models for diverse organisms poses several challenges.
In this work, we propose the development of several tools for the reconstruction and analysis of genome-scale metabolic and regulatory models. First, we describe Biological networks constraint-based In Silico Optimization (BioISO), a new tool to assist the manual curation of metabolic models. BioISO uses a recursive-relation algorithm to guide phenotype predictions. This tool can thus reduce the number of artifacts in metabolic models, lowering the chance of errors during the curation phase.
In the second part of this work, we developed a repository of regulatory networks for prokaryotes that supports their integration into metabolic models. The Prokaryotic Transcriptional Regulatory Network Database (ProTReND) includes several tools to extract and process regulatory information from external resources. It contains a data integration system that converts scattered regulatory data into integrated regulatory networks. In addition, ProTReND provides an application that gives full access to the regulatory data.
Finally, we developed a computational tool in MEWpy to simulate and analyze regulatory and metabolic models. This tool can read a metabolic model and/or a regulatory network in several formats. The framework builds an integrated regulatory and metabolic model using the regulatory interactions and the gene-protein links encoded in the metabolic model and the regulatory network. In addition, the framework supports several phenotype prediction methods implemented specifically for the analysis of regulatory-metabolic models.
Genome-Scale Metabolic (GEM) models can predict the phenotypic behavior of organisms. However,
these models can lead to incorrect predictions, as certain metabolic processes are controlled by regulatory
mechanisms. Accordingly, many methodologies have been developed to extend the reconstruction and
analysis of GEM models via the integration of Transcriptional Regulatory Networks (TRNs). Nevertheless,
the prospect of reconstructing integrated genome-scale regulatory and metabolic models for diverse
prokaryotes is still an open challenge.
In this work, we propose several tools to assist the reconstruction and analysis of regulatory and
metabolic models. We start by describing BioISO, a novel tool to assist the manual curation of GEM
models. BioISO uses a recursive relation-like algorithm and Flux Balance Analysis (FBA) to evaluate and
guide debugging of in silico phenotype predictions. Hence, this tool can reduce the number of artifacts in
GEM models, decreasing the burdens of model refinement and curation.
A state-of-the-art repository of TRNs for prokaryotes was implemented to support the reconstruction
and integration of TRNs into GEM models. The ProTReND repository comprises several tools to extract
and process regulatory information available in several resources. More importantly, this repository contains a data integration system to unify the regulatory data into standardized TRNs at the genome scale.
In addition, ProTReND contains a web application with full access to the regulatory data.
Finally, we have developed a new modeling framework to define, simulate and analyze GEnome-scale
Regulatory and Metabolic (GERM) models in MEWpy. The GERM model framework can read a GEM
model, as well as a TRN from different file formats. This framework assembles a GERM model using
the regulatory interactions and Genes-Proteins-Reactions (GPR) rules encoded into the GEM model and
TRN. In addition, this modeling framework supports several methods of phenotype prediction designed
for regulatory-metabolic models.
I would like to thank Fundação para a Ciência e Tecnologia for the Ph.D. studentship I was awarded (SFRH/BD/139198/2018).
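How regulatory interactions and GPR rules combine in an integrated regulatory-metabolic model can be sketched in a few lines. The code below is an illustrative assumption, not the MEWpy API: the function names, the rule syntax (Python-style `and`/`or`/`not`), and the toy identifiers are all hypothetical. It shows the two-layer Boolean evaluation in which environmental signals determine gene states and gene states determine which reactions are available.

```python
import ast


def eval_rule(rule: str, state: dict) -> bool:
    """Evaluate a Boolean rule such as 'g1 and (g2 or not g3)'.

    Only names, and/or, and not are accepted; anything else is rejected.
    """
    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.BoolOp):
            values = [visit(v) for v in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not visit(node.operand)
        if isinstance(node, ast.Name):
            return bool(state[node.id])
        raise ValueError(f"unsupported syntax in rule: {rule!r}")

    return visit(ast.parse(rule, mode="eval"))


def reaction_activity(environment: dict,
                      regulatory_rules: dict,
                      gpr_rules: dict) -> dict:
    """Propagate signals through the regulatory layer, then the GPR layer."""
    # Regulatory layer: environmental signals -> gene on/off states.
    gene_state = {gene: eval_rule(rule, environment)
                  for gene, rule in regulatory_rules.items()}
    # GPR layer: gene states -> reaction availability.
    return {rxn: eval_rule(rule, gene_state)
            for rxn, rule in gpr_rules.items()}


# Toy model with hypothetical identifiers.
environment = {"glucose": True, "oxygen": False}
regulatory_rules = {"geneA": "glucose", "geneB": "not oxygen", "geneC": "oxygen"}
gpr_rules = {"R1": "geneA and geneB", "R2": "geneC or (geneA and geneC)"}

activity = reaction_activity(environment, regulatory_rules, gpr_rules)
```

In a constraint-based setting, a reaction evaluated as inactive would typically have its flux bounds set to zero before a phenotype simulation is run; that step is omitted here to keep the sketch dependency-free.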
Implementation of new tools and approaches for the reconstruction of genome-scale metabolic models
Master's dissertation in Bioinformatics.
The reconstruction of high-quality genome-scale metabolic (GSM) models can have a relevant role in the investigation and study of an organism, since these mathematical models can
be used to phenotypically manipulate an organism and predict its response, in silico, under
different environmental conditions or genetic modifications. Several bioinformatics tools
and software have been developed since then to facilitate and accelerate the reconstruction of
these models by automating some steps that compose the traditional reconstruction process.
“Metabolic Models Reconstruction Using Genome-Scale Information” (merlin) is a free,
user-friendly Java™ application that automates the main stages of the reconstruction of
a GSM model for any microorganism. Although it has already been used successfully in
several works, many plugins are still being developed to improve its resources and make it
more accessible to any user. In this work, the new tools integrated in merlin will be described
in detail, as well as the improvement of other features present on the platform. The general
improvements performed and the implementation of the new tools improve the overall user
experience during the process of reconstructing GSM models in merlin.
The main feature implemented in this work is the incorporation of the BiGG Integration Tool
(BIT) in merlin. This plugin collects the metabolic data that make up the models
present in the BiGG Models database and associates them with the genome of the organism
under study, by homology, creating, if possible, the boolean rule for each BiGG reaction in the
model under construction. All the computation required to execute merlin’s BIT takes place
remotely, to accelerate the process. Within a few minutes, the results are returned by the
server and imported into the user’s workspace. Running the tool outside the user’s machine
also brings advantages in terms of information storage, since the BiGG data structure that
supports the entire tool is available remotely. The implementation of this tool provides an
alternative to obtaining metabolic information from the KEGG database, the only option
available in merlin so far. To test the implemented tool, several draft genome-scale metabolic
networks were generated and analyzed.
The reconstruction of high-quality genome-scale metabolic (GSM) models can play a relevant role in the investigation and study of an organism, since these mathematical models can be used to manipulate an organism phenotypically and predict its response, in silico, under different environmental conditions or genetic modifications. Several bioinformatics tools and software packages have been developed since then to facilitate and accelerate the reconstruction of these models by automating some of the steps of the traditional reconstruction process.
“Metabolic Models Reconstruction Using Genome-Scale Information” (merlin) is a free, easy-to-use Java™ application that automates the main stages of reconstructing a GSM model for any microorganism. Although it has already been used successfully in several works, many plugins are still being developed to improve its resources and make it more accessible to any user. In this work, the new tools integrated into merlin are described in detail, along with improvements to other features of the platform. The general improvements and the implementation of the new tools improve the overall user experience during the reconstruction of GSM models in merlin.
The main feature implemented in this work is the integration of the BiGG Integration Tool (BIT) into merlin. This plugin collects the metabolic data that make up the models in the BiGG Models database and associates them, by homology, with the genome of the organism under study, creating, if possible, the boolean rule for each BiGG reaction in the model under construction. All the processing required to run merlin's BIT takes place remotely, to accelerate the process. Within a few minutes, the results are returned by the server and imported into the user's workspace. Running the tool outside the user's machine also brings advantages for information storage, since the BiGG data structure that supports the whole tool is available remotely. The implementation of this tool provides an alternative to obtaining metabolic information from the KEGG database, the only option offered by merlin so far. To test the implemented tool, several draft genome-scale metabolic networks were generated and analyzed.
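The phrase "creating, if possible, the boolean rule for each BiGG reaction" can be made concrete with a small sketch of homology-based rule transfer. This is not merlin's implementation: the nested-tuple rule encoding, the function names, and the gene identifiers are assumptions for illustration. The assumed logic is that an AND group (an enzyme complex) transfers only if every subunit has a homolog, while an OR group (isoenzymes) keeps whichever alternatives have one.

```python
def transfer_rule(rule, homologs):
    """Map a template gene rule onto a target genome via a homolog table.

    `rule` is a gene id (str) or a tuple ("and" | "or", operand, ...).
    Returns the transferred rule, or None if no valid rule can be built.
    """
    if isinstance(rule, str):
        return homologs.get(rule)
    op, *operands = rule
    mapped = [transfer_rule(operand, homologs) for operand in operands]
    if op == "and":
        # A complex needs all of its subunits.
        if any(m is None for m in mapped):
            return None
    elif op == "or":
        # Isoenzymes: keep the alternatives that could be mapped.
        mapped = [m for m in mapped if m is not None]
        if not mapped:
            return None
    else:
        raise ValueError(f"unknown operator: {op!r}")
    return mapped[0] if len(mapped) == 1 else (op, *mapped)


def to_string(rule):
    """Render a nested rule as a parenthesized boolean expression."""
    if isinstance(rule, str):
        return rule
    op, *operands = rule
    return "(" + f" {op} ".join(to_string(o) for o in operands) + ")"


# Hypothetical template rule and homolog tables.
template = ("or", ("and", "b0001", "b0002"), "b0003")
full_map = {"b0001": "T1", "b0002": "T2", "b0003": "T3"}
partial_map = {"b0001": "T1", "b0003": "T3"}

full_rule = to_string(transfer_rule(template, full_map))
partial_rule = to_string(transfer_rule(template, partial_map))
no_rule = transfer_rule(template, {})
```

With the partial table, the complex is dropped because one subunit has no homolog, and the rule reduces to the single remaining isoenzyme. With an empty table, no rule can be created, which corresponds to the "if possible" qualifier.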
Generation and Applications of Knowledge Graphs in Systems and Networks Biology
The acceleration in the generation of data in the biomedical domain has necessitated the use of computational approaches to assist in its interpretation. However, these approaches rely on the availability of high-quality, structured, formalized biomedical knowledge. This thesis has two goals: to improve methods for curation and semantic data integration in order to generate high-granularity biological knowledge graphs, and to develop novel methods for using prior biological knowledge to propose new biological hypotheses. The first two publications describe an ecosystem for handling biological knowledge graphs encoded in the Biological Expression Language throughout the stages of curation, visualization, and analysis. The next two publications describe the reproducible acquisition and integration of high-granularity knowledge with low contextual specificity from structured biological data sources on a massive scale, and support the semi-automated curation of new content at high speed and precision. After building the ecosystem and acquiring content, the last three publications in this thesis demonstrate three different applications of biological knowledge graphs in modeling and simulation. The first demonstrates the use of agent-based modeling for simulation of neurodegenerative disease biomarker trajectories using biological knowledge graphs as priors. The second applies network representation learning to prioritize nodes in biological knowledge graphs based on corresponding experimental measurements to identify novel targets. Finally, the third uses biological knowledge graphs and develops algorithms to deconvolute the mechanism of action of drugs, which could also serve to identify drug repositioning candidates. Ultimately, this thesis lays the groundwork for production-level applications of drug repositioning algorithms and other knowledge-driven approaches to analyzing biomedical experiments.
Knowledge Management Approaches for predicting Biomarker and Assessing its Impact on Clinical Trials
The recent success of companion diagnostics, along with increasing regulatory pressure for better identification of the target population, has created an unprecedented incentive for drug discovery companies to invest in novel strategies for stratified biomarker discovery. In keeping with this trend, trials with stratified biomarkers in drug development have quadrupled in the last decade, but they still represent a small part of all interventional trials, reflecting multiple co-development challenges of therapeutic compounds and companion diagnostics. To overcome this challenge, varied knowledge management and systems biology approaches are adopted in the clinic to analyze and interpret an ever-increasing collection of omics data. By semi-automatic screening of more than 150,000 trials, we filtered trials with stratified biomarkers to analyze their therapeutic focus and major drivers, and elucidated the impact of stratified biomarker programs on trial duration and completion. The analysis clearly shows that cancer is the major focus for trials with stratified biomarkers. However, targeted therapies in cancer require more accurate stratification of the patient population. This can be augmented by a fresh approach: selecting a new class of biomolecules, miRNAs, as candidate stratification biomarkers. miRNAs play an important role in tumorigenesis by regulating the expression of oncogenes and tumor suppressors, thus affecting cell proliferation, differentiation, apoptosis, invasion, and angiogenesis. miRNAs are potential biomarkers in different cancers. However, the relationship between the response of cancer patients to targeted therapy and the resulting modifications of the miRNA transcriptome in pathway regulation is poorly understood.
The ever-growing number of pathway and miRNA-mRNA interaction databases, together with freely available mRNA and miRNA expression data for multiple cancer therapies, has created an unprecedented opportunity to decipher the role of miRNAs in the early prediction of therapeutic efficacy in diseases. We present a novel algorithm, SMARTmiR, to predict the role of miRNAs as therapeutic biomarkers for treatment with an anti-EGFR monoclonal antibody, cetuximab, in colorectal cancer (CRC). An optimised and fully automated version of the algorithm has the potential to be used as a clinical decision support tool. Moreover, this research will provide the scientific community with a comprehensive and valuable knowledge map of functional biomolecular interactions in colorectal cancer. This research also identified seven miRNAs, hsa-miR-145, hsa-miR-27a, hsa-miR-155, hsa-miR-182, hsa-miR-15a, hsa-miR-96 and hsa-miR-106a, as top stratified biomarker candidates for cetuximab therapy in CRC that had not been reported previously. Finally, a prospective plan for the future of biomarker research in cancer drug development is outlined, focusing on reducing the risk of the most expensive Phase III drug failures.
Reconstruction, Reconciliation, and Validation of Metabolic Networks
University of Minnesota Ph.D. dissertation. May 2018. Major: Plant Biological Sciences. Advisor: Igor Libourel. 1 computer file (PDF); ix, 120 pages.
Metabolic networks are rigorous and computable representations of metabolism that describe the connections between genes, enzymes, reactions, and metabolites. The comprehensive nature of metabolic networks has allowed them to become the first truly “genome-scale” models, and they have served as a foundational framework for the broader effort of systems biology, which aims to model all aspects of cellular function. A more thorough and accurate understanding of metabolism has the potential to improve the synthesis of important biological compounds, better model metabolic diseases, and progress towards simulations of entire cells. The thesis research presented here focuses on the reconstruction of organism-specific metabolic networks from genome annotations and methods for improving metabolic networks by reconciling them with observed phenotypes, specifically the synthesis of essential cellular metabolites such as DNA, amino acids, and other small molecules. Gene sequence similarity and estimations of thermodynamic reaction parameters are used to guide network reconciliation through the use of numerical optimization algorithms. Particular attention is devoted to the validation of metabolic networks using experimental data, such as gene essentiality, and the development of computational controls using parameter randomization.