31 research outputs found

    Previsão e análise da estrutura e dinâmica de redes biológicas

    Increasing knowledge about the biological processes that govern the dynamics of living organisms has fostered a better understanding of the origin of many diseases as well as the identification of potential therapeutic targets. Biological systems can be modeled through biological networks, which allows methods from graph theory to be applied and explored in their investigation and characterization. The main motivation of this work was the inference of the patterns and rules that underlie the organization of biological networks. Through the integration of different types of data, such as gene expression, protein interactions, and other biomedical concepts, computational methods were developed that can be used to predict and study diseases. The first contribution was a method for characterizing a subsystem of the human protein interactome through the topological properties of the networks that model it. As a second contribution, an unsupervised method using biological criteria and network topology was applied to co-expression networks to improve the understanding of the genetic mechanisms and risk factors of a disease. As a third contribution, a methodology was developed to remove noise (denoise) from protein networks using the network topology, in order to obtain more accurate models. As a fourth contribution, a supervised methodology was proposed to model the dynamics of the protein interactome, using exclusively the topology of the protein interaction networks that are part of the dynamic model of the system. The proposed methodologies contribute to the creation of more precise static and dynamic biological models through the identification and use of topological patterns of protein interaction networks, which can be used to predict and study diseases.
    Programa Doutoral em Engenharia Informática

    Genetic and environmental prediction of opioid cessation using machine learning, GWAS, and a mouse model

    The United States is currently experiencing an epidemic of opioid use, use disorder, and overdose-related deaths. While studies have identified several loci that are associated with opioid use disorder (OUD) risk, the genetic basis for the ability to discontinue opioid use has not been investigated. Furthermore, very few studies have investigated the non-genetic factors that are predictive of opioid cessation or their predictive ability. In this thesis, I studied a novel phenotype, opioid cessation, defined by the time since last use of illicit opioids (last use more than 1 year ago classified as cessation) among persons meeting lifetime DSM-5 criteria for OUD. In chapter two, I identified novel genetic variants and biological pathways that potentially regulate opioid cessation success through a genome-wide study, as well as genetic overlap between opioid cessation and other substance cessation traits. In chapter three, I identified multiple non-genetic risk factors, specific to each racial group, that are predictive of opioid cessation in the same individuals analyzed in chapter two, by applying several linear and non-linear machine learning techniques to a set of more than 3,000 variables assessed by a structured psychiatric interview. Factors identified by this atheoretical approach can be grouped into opioid use activities, other drug use, health conditions, and demographics, and predictive accuracy reached nearly 80%. The findings from this research generated further hypotheses for future studies. In chapter four, I performed differential gene expression and network analysis on mice with different behaviors induced by oxycodone (an opioid receptor agonist) and compared the significantly associated genes and network modules with top-ranked genes identified in humans. The pathway cross-talk and gene homologs identified in both species illuminate the potential molecular mechanism of opioid behaviors. In summary, this thesis used statistical genetics, machine learning, and a computational biology framework to address factors that are associated with opioid cessation in humans, and cross-referenced the genetic findings in a mouse model. These findings serve as references for future studies and provide a framework for personalizing the treatment of OUD.

    Multi-omic biomarker discovery and network analyses to elucidate the molecular mechanisms of lung cancer premalignancy

    Lung cancer (LC) is the leading cause of cancer death in the US, claiming over 160,000 lives annually. Although CT screening has been shown to be efficacious in reducing mortality, the limited access to screening programs among high-risk individuals and the high number of false positives contribute to low survival rates and increased healthcare costs. As a result, there is an urgent need for preventative therapeutics and novel interception biomarkers that would enhance current methods for detection of early-stage LC. This thesis addresses this challenge by examining the hypothesis that transcriptomic changes preceding the onset of LC can be identified by studying bronchial premalignant lesions (PMLs) and the normal-appearing airway epithelial cells altered in their presence (i.e., the PML-associated airway field of injury). PMLs are the presumed precursors of lung squamous cell carcinoma (SCC), and their presence indicates an increased risk of developing SCC and other subtypes of LC. Here, I leverage high-throughput mRNA and miRNA sequencing data from bronchial brushings and lesion biopsies to develop biomarkers of PML presence and progression, and to understand regulatory mechanisms driving early carcinogenesis. First, I used mRNA sequencing data from normal-appearing airway brushings to build a biomarker predictive of PML presence. After verifying the power of the 200-gene biomarker to detect the presence of PMLs, I evaluated its capacity to predict PML progression and detect the presence of LC (Aim 1). Next, I identified likely regulatory mechanisms associated with PML severity and progression by evaluating miRNA expression and gene coexpression modules containing their targets in bronchial lesion biopsies (Aim 2). Lastly, I investigated the preservation of the PML-associated miRNAs and gene modules in the airway field of injury, highlighting an emergent link between the airway field and the PMLs (Aim 3). Overall, this thesis suggests a multi-faceted utility of PML-associated genomic signatures as markers for stratification of high-risk smokers in chemoprevention trials, markers for early detection of lung cancer, and novel chemopreventive targets, and yields valuable insights into early lung carcinogenesis by characterizing mRNA and miRNA expression alterations that contribute to premalignant disease progression towards LC.
    2020-01-2

    Transcriptomic Profiling in Mild Cognitive Impairment and Alzheimer's Disease Using Neuroimaging Endophenotypes

    Indiana University-Purdue University Indianapolis (IUPUI)
    Alzheimer’s disease (AD) is a devastating neurodegenerative disease that currently affects more than 6 million Americans and 50 million people worldwide. It is irreversible and causes decline in memory, cognition, personality, and other functions, eventually leading to death due to complete brain failure. Much recent research has focused on enabling early intervention and disease prevention in AD, which could have a significant impact on this disease and be crucial for life management, assessment of risk for future generations, and assistance in end-of-life preparation. For a late-life complex multifactorial disease such as AD, where both genetic and environmental factors are involved, integrating multiple layers of genetic, imaging, and other biomarker data is a critical step for therapeutic discovery and for building predictive risk assessment tools. The multifactorial nature of AD suggests that multiple therapeutic targets need to be identified and tested together. Hence, we need a systems-level approach to build biomarker profiles that can be used for drug discovery and for screening and risk assessment. The research presented in this dissertation uses a systems-level approach to identify promising imaging genetics biomarkers that provide insight into dysregulated biological pathways in AD pathogenesis, and to identify critical mRNA measures that can be investigated further as targets for novel therapeutics and as input variables in predictive models for AD risk, screening, and diagnosis. The overall research goal was the development of systems-level imaging genetics biomarker signatures to serve as tools for risk analysis and therapeutic discovery in AD. The specific outcomes of the analyses were the characterization of systems-level patterns in gene expression using neuroimaging endophenotypes, and the identification of specific driver genes and genotypic variants, which can inform predictive modeling for diagnosis, risk, and pathogenic profiling in AD.

    Identifying therapeutic targets in glioma using integrated network analysis

    Gliomas are the most common brain tumours in the adult population, with rapid progression and poor prognosis. Survival among patients diagnosed with glioblastoma, the most aggressive histopathological subtype of glioma, is a mere 12.6 months given the current standard of care. While glioblastomas mostly occur in people over 60, lower-grade gliomas affect individuals in their third and fourth decades of life. Collectively, gliomas are one of the major causes of cancer-related death in individuals under forty in the UK. Over the past twenty years, little has changed in the standard of glioma treatment and the disease has remained incurable. This study focuses on identifying potential therapeutic targets in gliomas using systems-level approaches and large-scale data integration. I used publicly available transcriptomic data to identify gene co-expression networks associated with the progression of IDH1-mutant 1p/19q euploid astrocytomas from grade II to grade III, and highlighted hub genes of these networks that could be targeted to modulate their biological function. I also studied the changes in co-expression patterns between grade II and grade III gliomas and identified a cluster of genes with differential co-expression in different disease states (module M2). By data integration and adaptation of reverse-engineering methods, I elucidated master regulators of module M2. I then sought to counteract their regulatory activity by using a drug-induced gene expression dataset to find compounds that induce gene expression in the opposite direction to the disease signature. I proposed resveratrol as a potentially disease-modifying compound which, when administered to patients with low-grade disease, could potentially delay glioma progression. Finally, I applied an ensemble-learning algorithm to a large-scale loss-of-function viability screen in cancer cell lines with different genetic backgrounds to identify gene dependencies associated with chromosomal copy-number losses common in glioblastomas. I propose five novel target predictions to be validated in future experiments.
    Open access

    Functional Analysis of Human Long Non-coding RNAs and Their Associations with Diseases

    In this study, we sought to leverage knowledge from well-characterized protein-coding genes to characterize the lesser-known long non-coding RNA (lncRNA) genes, using computational methods to find functional annotations and disease associations. Functional genome annotation is an essential step towards a systems-level view of the human genome. With this knowledge, we can gain a deeper understanding of how humans develop and function, and a better understanding of human disease. LncRNAs are transcripts longer than 200 nucleotides that do not code for proteins. LncRNAs have been found to regulate development, tissue and cell differentiation, and organ formation. Their dysregulation has been linked to several diseases, including autism spectrum disorder (ASD) and cancer. While a great deal of research has been dedicated to protein-coding genes, the relatively recently discovered lncRNA genes have yet to be characterized. LncRNA function is tied closely to when and where they are expressed. Co-expression network analysis offers a means of functionally annotating uncharacterized genes through a guilt-by-association approach. We constructed two co-expression networks using known disease-associated protein-coding genes and lncRNA genes. Through clustering of the networks, gene set enrichment analysis, and centrality measures, we found enrichment for disease associations and functions, and identified high-confidence lncRNA disease gene targets. We present a novel approach to the identification of disease state associations by demonstrating that genes associated with the same disease states share patterns that can be discerned from transcriptomes of healthy tissues. Using a machine learning algorithm, we built a model to classify ASD versus non-ASD genes using their expression profiles from healthy developing human brain tissues. Feature selection during the model-building process also identified critical temporospatial points for the determination of ASD genes. We constructed a webserver tool for the prioritization of genes for ASD association. The webserver tool has a database containing prioritization and co-expression information for nearly every gene in the human genome.
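
    The guilt-by-association idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene names, expression values, annotation labels, the Pearson correlation measure, and the 0.9 threshold are all assumptions chosen for illustration. Genes whose expression profiles correlate strongly are linked, and an uncharacterized gene inherits the most common annotation among its neighbours.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_network(expr, threshold=0.9):
    """Link every pair of genes whose |correlation| meets the (assumed) threshold."""
    genes = sorted(expr)
    neighbors = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                neighbors[g].add(h)
                neighbors[h].add(g)
    return neighbors


def guilt_by_association(gene, neighbors, annotations):
    """Return the most common annotation among a gene's annotated neighbours, or None."""
    votes = Counter(annotations[n] for n in neighbors[gene] if n in annotations)
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy data: three annotated genes and one uncharacterized lncRNA.
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0, 10.5],
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],
    "LNC_X": [1.1, 2.0, 3.2, 3.9, 5.1],
}
annotations = {
    "GENE_A": "neuronal development",
    "GENE_B": "neuronal development",
    "GENE_C": "immune response",
}
network = coexpression_network(expr)
predicted = guilt_by_association("LNC_X", network, annotations)
```

    Real analyses typically work with thousands of genes and samples and use more robust network construction and clustering; the majority vote here only stands in for that step.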

    An Integrated, Module-based Biomarker Discovery Framework

    Identification of biomarkers that contribute to complex human disorders is a principal and challenging task in computational biology. Prognostic biomarkers are useful for risk assessment of disease progression and patient stratification. Since treatment plans often hinge on patient stratification, better disease subtyping has the potential to significantly improve survival for patients. Additionally, a thorough understanding of the roles of biomarkers in cancer pathways facilitates insights into complex disease formation and provides potential druggable targets in the pathways. Many statistical methods have been applied to biomarker discovery, often combining feature selection with classification methods. Traditional approaches are mainly concerned with statistical significance and fail to consider the clinical relevance of the selected biomarkers. Two additional problems impede meaningful biomarker discovery: gene multiplicity (several maximally predictive solutions exist) and instability (inconsistent gene sets from different experiments or cross-validation runs). Motivated by the need for a more biologically informed, stable biomarker discovery method, I introduce an integrated module-based biomarker discovery framework for analyzing high-throughput genomic disease data. The proposed framework addresses the aforementioned challenges in three components. First, a recursive spectral clustering algorithm specifically tailored toward high-dimensional, heterogeneous data (ReKS) is developed to partition genes into clusters that are treated as single entities for subsequent analysis. Next, the problems of gene multiplicity and instability are addressed through a group variable selection algorithm (T-ReCS) based on local causal discovery methods. Guided by the tree-like partition created by the clustering algorithm, this algorithm selects gene clusters that are predictive of a clinical outcome. We demonstrate that the group feature selection method facilitates the discovery of biologically relevant genes through their association with a statistically predictive driver. Finally, we elucidate the biological relevance of the biomarkers by leveraging available prior information to identify regulatory relationships between genes and between clusters, and deliver the information in the form of a user-friendly web server, mirConnX.

    Ant Colony Optimization

    Ant Colony Optimization (ACO) is the best example of how studies aimed at understanding and modeling the behavior of ants and other social insects can inspire the development of computational algorithms for solving difficult mathematical problems. Introduced by Marco Dorigo in his PhD thesis (1992) and initially applied to the travelling salesman problem, the ACO field has experienced tremendous growth and stands today as an important nature-inspired stochastic metaheuristic for hard optimization problems. This book presents state-of-the-art ACO methods and is divided into two parts: (I) Techniques, which includes parallel implementations, and (II) Applications, which presents recent contributions of ACO to diverse fields such as traffic congestion and control, structural optimization, manufacturing, and genomics.
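
    To make the travelling salesman application concrete, here is a minimal sketch of a basic Ant System, not one of the state-of-the-art methods the book covers. The function names and all parameter defaults (number of ants, iterations, `alpha`, `beta`, evaporation rate `rho`, deposit constant `q`) are illustrative assumptions, and the cities are assumed to have distinct coordinates. Each ant builds a tour by choosing the next city with probability proportional to pheromone and inverse distance; pheromone then evaporates and is reinforced along the tours found, with shorter tours depositing more.

```python
import math
import random


def tour_length(tour, dist):
    """Total length of a closed tour that returns to its starting city."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def ant_colony_tsp(coords, n_ants=10, n_iters=50, alpha=1.0, beta=2.0,
                   rho=0.5, q=1.0, seed=0):
    """Basic Ant System for the symmetric TSP (illustrative defaults)."""
    rng = random.Random(seed)
    n = len(coords)
    dist = [[math.dist(a, b) for b in coords] for a in coords]
    pher = [[1.0] * n for _ in range(n)]  # uniform initial pheromone
    best_tour, best_len = None, math.inf

    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            start = rng.randrange(n)
            tour, unvisited = [start], set(range(n)) - {start}
            while unvisited:
                cur = tour[-1]
                cand = sorted(unvisited)
                # Attractiveness of each candidate: pheromone^alpha * (1/distance)^beta.
                weights = [pher[cur][j] ** alpha * (1.0 / dist[cur][j]) ** beta
                           for j in cand]
                nxt = rng.choices(cand, weights=weights)[0]
                tour.append(nxt)
                unvisited.remove(nxt)
            tours.append((tour, tour_length(tour, dist)))

        # Evaporate pheromone on every edge.
        for i in range(n):
            for j in range(n):
                pher[i][j] *= 1.0 - rho

        # Each ant deposits pheromone on its tour; shorter tours deposit more.
        for tour, length in tours:
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                pher[a][b] += q / length
                pher[b][a] += q / length
            if length < best_len:
                best_tour, best_len = tour, length

    return best_tour, best_len


# Usage: four cities on the corners of a unit square.
square = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
best_tour, best_len = ant_colony_tsp(square)
```

    The fixed seed makes runs reproducible; in practice the parameters are tuned per problem, and later variants change how pheromone is deposited and bounded.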