9 research outputs found

    Generalized adjacency and the conservation of gene clusters in genetic networks defined by synthetic lethals

    Background: Given genetic networks derived from two genomes, it may be difficult to decide if their local structures are similar enough in both genomes to infer some ancestral configuration or some conserved functional relationships. Current methods all depend on searching for identical substructures. Methods: We explore a generalized vertex proximity criterion, and present analytic and probability results for the comparison of random lattice networks. Results: We apply this criterion to the comparison of the genetic networks of two evolutionarily divergent yeasts, Saccharomyces cerevisiae and Schizosaccharomyces pombe, derived using the Synthetic Genetic Array screen. We show that the overlapping parts of the networks of the two yeasts share a common structure beyond the shared edges. This may be due to their conservation of redundant pathways containing many synthetic lethal pairs of genes. Conclusions: Detecting the shared generalized adjacency clusters in the genetic networks of the two yeasts shows that this analytical construct can be a useful tool in probing conserved network structure across divergent genomes.

    Using interspecies biological networks to guide drug therapy

    The use of drug combinations (DCs) in cancer therapy can prevent the development of drug resistance and decrease the severity and number of side effects. Synthetic lethality (SL), a genetic interaction wherein two nonessential genes cause cell death when knocked out simultaneously, has been suggested as a method of identifying novel DCs. A combination of two drugs that mimic genetic knockout may cause cellular death through a synthetic lethal pathway. Because SL can be context-specific, it may be possible to find DCs that target SL pairs in tumours while leaving healthy cells unscathed. However, elucidating all synthetic lethal pairs in humans would take more than 200 million experiments in a single biological context – an unmanageably large search space. It is thus necessary to develop computational methods to predict human SL. In this thesis, we develop connectivity homology, a novel measure of network similarity that allows for the comparison of interspecies protein-protein interaction networks. We then use this principle to develop Species-INdependent TRAnslation (SINaTRA), an algorithm that allows us to predict SL between species using protein-protein interaction networks. We validate it by predicting SL in S. pombe from S. cerevisiae, then generate over 100 million SINaTRA scores for putative human SL pairs. We use these data to predict new areas of cancer combination therapy, and then test fifteen of these predictions across several cell lines. Finally, in order to better understand synergy, we develop DAVISS (Data-driven Assessment of Variability In Synergy Scores), a novel way to statistically evaluate the significance of a drug interaction
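
    The "more than 200 million experiments" figure in the abstract above is consistent with counting unordered gene pairs. The sketch below is only an illustration: the gene count of roughly 20,000 is an assumption, not a number given in the abstract, and the function names are hypothetical.

```python
from math import comb

# Assumption (not stated in the abstract): roughly 20,000 human genes to pair up.
ASSUMED_GENE_COUNT = 20_000


def pairwise_experiments(gene_count: int) -> int:
    """Number of unordered gene pairs, i.e. double-knockout tests in one context."""
    return comb(gene_count, 2)


def smallest_gene_count_exceeding(threshold: int) -> int:
    """Smallest gene count whose number of unordered pairs exceeds `threshold`."""
    n = 2
    while comb(n, 2) <= threshold:
        n += 1
    return n


if __name__ == "__main__":
    print(pairwise_experiments(ASSUMED_GENE_COUNT))
    print(smallest_gene_count_exceeding(200_000_000))
```

    Under this assumption the pair count comes out just under 200 million, and it passes 200 million once the gene count exceeds 20,000. Each additional biological context would multiply the total.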

    Exploiting patterns in genomic data for personalised cancer treatment and new target discovery

    In response to a global requirement for improved cancer treatments, a number of promising novel targeted cancer therapies are being developed that exploit vulnerabilities in cancer cells that are not present in healthy cells. In this thesis I explore different ways of identifying the vulnerabilities of cancer cells, with the ultimate aim of providing personalised therapies to cancer patients on an individual basis. I first investigate approaches that utilise the concept of synthetic lethality. Therapies that exploit synthetic lethality are suitable where a specific tumour suppressor has been inactivated by a cancer and an identified synthetic lethal (SSL) pair for that gene may be therapeutically targeted. Mainly due to the constraints of the experimental procedures, relatively few human SSL interactions have been identified. Here I describe computational systems approaches for predicting human SSL interactions by identifying and exploiting conserved patterns in protein-protein interaction (PPI) network topology both within and across model species. I report that my classifiers out-perform previous attempts to classify human SSL interactions. Experimental validation of my predictions suggests they may provide useful guidance for future SSL screenings and ultimately aid targeted cancer therapy development. All predictions from this study have been made available via a new online database that I designed, built and published. As an extension to this approach, I used similar network features to predict gene dependencies, otherwise known as acquired essential genes, in specific cancer cell lines. Genetic alterations found in each individual cell line were modelled using the novel approach of removing protein nodes to reflect loss of function mutations and changing the weights of edges in each protein-protein interaction network to reflect gain of function mutations and gene expression changes. 
I report that base PPI networks can be used to successfully classify human cell line specific gene dependencies within individual cell lines, between cell lines and even across tissue types. Furthermore, my personalised PPI network models further improve prediction power and show improved sensitivity to rarer gene dependencies, an improvement which offers opportunities for personalised therapy. In a therapeutic context these essential genes would be suitable as individual drug targets for each specific patient. Finally, I analyse copy number variance and ploidy in a set of cancers from kidney patients. Using clustering algorithms I investigate patterns in cancer cell line arm-wise ploidy and identify factors that may be driving this genomic instability

    Knowledge derivation and data mining strategies for probabilistic functional integrated networks

    One of the fundamental goals of systems biology is the experimental verification of the interactome: the entire complement of molecular interactions occurring in the cell. Vast amounts of high-throughput data have been produced to aid this effort. However, these data are incomplete and contain high levels of both false positives and false negatives. In order to combat these limitations in data quality, computational techniques have been developed to evaluate the datasets and integrate them in a systematic fashion using graph theory. The result is an integrated network which can be analysed using a variety of network analysis techniques to draw new inferences about biological questions and to guide laboratory experiments. Individual research groups are interested in specific biological problems and, consequently, network analyses are normally performed with regard to a specific question. However, the majority of existing data integration techniques are global and do not focus on specific areas of biology. Currently this issue is addressed by using known annotation data (such as that from the Gene Ontology) to produce process-specific subnetworks. However, this approach discards useful information and is of limited use in poorly annotated areas of the interactome. Therefore, there is a need for network integration techniques that produce process-specific networks without loss of data. The work described here addresses this requirement by extending one of the most powerful integration techniques, probabilistic functional integrated networks (PFINs), to incorporate a concept of biological relevance. Initially, the available functional data for the baker’s yeast Saccharomyces cerevisiae was evaluated to identify areas of bias and specificity which could be exploited during network integration. This information was used to develop an integration technique which emphasises interactions relevant to specific biological questions, using yeast ageing as an exemplar. 
The integration method improves performance during network-based protein functional prediction in relation to this process. Further, the process-relevant networks complement classical network integration techniques and significantly improve network analysis in a wide range of biological processes. The method developed has been used to produce novel predictions for 505 Gene Ontology biological processes. Of these predictions 41,610 are consistent with existing computational annotations, and 906 are consistent with known expert-curated annotations. The approach significantly reduces the hypothesis space for experimental validation of genes hypothesised to be involved in the oxidative stress response. Therefore, incorporation of biological relevance into network integration can significantly improve network analysis with regard to individual biological questions

    Phylogenetics in the Genomic Era

    Molecular phylogenetics was born in the middle of the 20th century, when the advent of protein and DNA sequencing offered a novel way to study the evolutionary relationships between living organisms. The first 50 years of the discipline can be seen as a long quest for resolving power. The goal – reconstructing the tree of life – seemed to be unreachable, the methods were heavily debated, and the data limiting. Maybe for these reasons, even the relevance of the whole approach was repeatedly questioned, as part of the so-called molecules versus morphology debate. Controversies often crystalized around long-standing conundrums, such as the origin of land plants, the diversification of placental mammals, or the prokaryote/eukaryote divide. Some of these questions were resolved as gene and species samples increased in size. Over the years, molecular phylogenetics has gradually evolved from a brilliant, revolutionary idea to a mature research field centred on the problem of reliably building trees. This logical progression was abruptly interrupted in the late 2000s. High-throughput sequencing arose and the field suddenly moved into something entirely different. Access to genome-scale data profoundly reshaped the methodological challenges, while opening an amazing range of new application perspectives. Phylogenetics left the realm of systematics to occupy a central place in one of the most exciting research fields of this century – genomics. This is what this book is about: how we do trees, and what we do with trees, in the current phylogenomic era. One obvious, practical consequence of the transition to genome-scale data is that the most widely used tree-building methods, which are based on probabilistic models of sequence evolution, require intensive algorithmic optimization to be applicable to current datasets. 
This problem is considered in Part 1 of the book, which includes a general introduction to Markov models (Chapter 1.1) and a detailed description of how to optimally design and implement Maximum Likelihood (Chapter 1.2) and Bayesian (Chapter 1.4) phylogenetic inference methods. The importance of the computational aspects of modern phylogenomics is such that efficient software development is a major activity of numerous research groups in the field. We acknowledge this and have included seven "How to" chapters presenting recent updates of major phylogenomic tools – RAxML (Chapter 1.3), PhyloBayes (Chapter 1.5), MACSE (Chapter 2.3), Bgee (Chapter 4.3), RevBayes (Chapter 5.2), Beagle (Chapter 5.4), and BPP (Chapter 5.6). Genome-scale data sets are so large that statistical power, which had been the main limiting factor of phylogenetic inference during previous decades, is no longer a major issue. Massive data sets instead tend to amplify the signal they deliver – be it biological or artefactual – so that bias and inconsistency, instead of sampling variance, are the main problems with phylogenetic inference in the genomic era. Part 2 covers the issues of data quality and model adequacy in phylogenomics. Chapter 2.1 provides an overview of current practice and makes recommendations on how to avoid the more common biases. Two chapters review the challenges and limitations of two key steps of phylogenomic analysis pipelines, sequence alignment (Chapter 2.2) and orthology prediction (Chapter 2.4), which largely determine the reliability of downstream inferences. The performance of tree building methods is also the subject of Chapter 2.5, in which a new approach is introduced to assess the quality of gene trees based on their ability to correctly predict ancestral gene order. Analyses of multiple genes typically recover multiple, distinct trees. 
Maybe the biggest conceptual advance induced by the phylogenetic to phylogenomic transition is the suggestion that one should not simply aim to reconstruct “the” species tree, but rather to be prepared to make sense of forests of gene trees. Chapter 3.1 reviews the numerous reasons why gene trees can differ from each other and from the species tree, and what the implications are for phylogenetic inference. Chapter 3.2 focuses on gene trees/species trees reconciliation methods that account for gene duplication/loss and horizontal gene transfer among lineages. Incomplete lineage sorting is another major source of phylogenetic incongruence among loci, which recently gained attention and is covered by Chapter 3.3. Chapter 3.4 concludes this part by taking a user’s perspective and examining the pros and cons of concatenation versus separate analysis of gene sequence alignments. Modern genomics is comparative and phylogenetic methods are key to a wide range of questions and analyses relevant to the study of molecular evolution. This is covered by Part 4. We argue that genome annotation, either structural or functional, can only be properly achieved in a phylogenetic context. Chapters 4.1 and 4.2 review the power of these approaches and their connections with the study of gene function. Molecular substitution rates play a key role in our understanding of the prevalence of nearly neutral versus adaptive molecular evolution, and the influence of species traits on genome dynamics (Chapter 4.4). The analysis of substitution rates, and particularly the detection of positive selection, requires sophisticated methods and models of coding sequence evolution (Chapter 4.5). Phylogenomics also offers a unique opportunity to explore evolutionary convergence at a molecular level, thus addressing the long-standing question of predictability versus contingency in evolution (Chapter 4.6). 
The development of phylogenomics, as reviewed in Parts 1 through 4, has resulted in a powerful conceptual and methodological corpus, which is often reused for addressing problems of interest to biologists from other fields. Part 5 illustrates this application potential via three selected examples. Chapter 5.1 addresses the link between phylogenomics and palaeontology; i.e., how to optimally combine molecular and fossil data for estimating divergence times. Chapter 5.3 emphasizes the importance of the phylogenomic approach in virology and its potential to trace the origin and spread of infectious diseases in space and time. Finally, Chapter 5.5 recalls why phylogenomic methods and the multi-species coalescent model are key in addressing the problem of species delimitation – one of the major goals of taxonomy. It is hard to predict where phylogenomics as a discipline will stand in even 10 years. Maybe a novel technological revolution will bring it to yet another level? We strongly believe, however, that tree thinking will remain pivotal in the treatment and interpretation of the deluge of genomic data to come. Perhaps a prefiguration of the future of our field is provided by the daily monitoring of the current Covid-19 outbreak via the phylogenetic analysis of coronavirus genomic data in quasi real time – a topic of major societal importance, contemporary to the publication of this book, in which phylogenomics is instrumental in helping to fight disease

    Ohio State University Bulletin

    Classes available for students to enroll in during the 1974-1975 academic year for The Ohio State University

    Ohio State University Bulletin

    Classes available for students to enroll in during the 1968-1969 academic year for The Ohio State University