168 research outputs found

    Extracting information from high-throughput gene expression data with pathway analysis and deconvolution

    Modern technologies allow for the collection of large biological datasets that can be utilised for diverse health-related applications. However, to extract useful information from such data, computational methods are needed. The field that develops and explores methods to analyse biological data is called bioinformatics. In this thesis I evaluate different bioinformatic methods and introduce novel ones related to processing gene expression data. Gene expression data reflects how active different genes are in a set of measured biological samples. These samples can be for example blood from human individuals, tissue samples from tumours and the corresponding healthy tissue, or brain samples from mice with different neural diseases. This thesis covers two topics, pathway analysis and deconvolution, related to downstream analysis of gene expression data. Notably, this summary does not repeat in detail the same points made in the original publications, but aims to provide a comprehensive overview of the current knowledge of the two wider topics. The original publications focus on comparing and evaluating the available methods as well as presenting new ones that cover some previously untouched features. While the terms ‘pathway analysis’ and ‘deconvolution’ have been used with alternative definitions in other fields, in the context of this thesis, pathway analysis refers to estimating the activity of pathways, i.e. interaction networks the body uses to react to different signals, based on given gene expression data and structural information of the relevant pathways. I focus on different types of analysis methods and their varying goals, requirements, and underlying statistical approaches. In addition, the strengths and weaknesses of the concept of pathway analysis are briefly discussed. The first two original publications I and II empirically compare different types of pathway methods and introduce a novel one.
In paper I, the tested methods are evaluated from different perspectives, and in paper II, a novel method is introduced and its performance demonstrated against alternative tools. Many biological samples contain a variety of cell types and here, deconvolution means computationally extracting cell type composition or cell type specific expression from bulk samples. The deconvolution sections of this thesis also focus on a general overview of the topic and the available computational methodology. As deconvolution is challenging, I discuss the factors affecting its accuracy as well as alternative wet lab approaches to obtain cell type specific information. The first original publication about deconvolution (publication III) introduces a novel method and evaluates it against the other available tools. The second (publication IV) focuses on identifying cell type specific differences between sample groups, which is a particularly difficult task.

    Why Mosquitoes Bite Some People More Than Others: Metabolic Correlates of Human Attraction in AEDES AEGYPTI

    Aedes aegypti mosquitoes are the principal vectors of two major infectious diseases that plague the developing world today: dengue fever and chikungunya, with dengue fever alone resulting in ~400 million total yearly infections, and ~24,000 deaths (Bhatt et al., 2013). Understanding the biology behind Ae. aegypti attraction to humans is critical for developing novel strategies to combat these diseases. Yet, even the basic act of how mosquitoes choose one human host over another is poorly understood. Many previous studies on differential attraction have focused on small, homogeneous subject populations and addressed a single hypothesis. We took the opposite strategy and studied a large, diverse 150-subject cohort, capturing a multitude of variables that may be involved in host selection. Importantly, our study examined the previously unexplored possibility that mosquito preference may be correlated with differences in blood metabolites between subjects. We developed the uniport olfactometer as a method for discriminating subject attraction. Within our study population we distinguished three clusters of subjects who were differentially attractive to mosquitoes. We performed metabolic profiling with subject plasma samples and acquired relative concentrations of 613 different metabolites. We also collected information pertaining to 41 other variables including demographic information, self-reported lifestyle factors, self-reported reaction to mosquito bites, vital signs, blood type, a complete blood count panel, and clinical blood analysis. Using a variety of statistical methods for feature selection, we narrowed this list of variables and arrived at two preliminary models for mosquito attraction. These models explain 24.1% of subject variation in mosquito attraction, and approximately 19.7% of this explanatory power is due to blood metabolites alone.
Metabolites within the amino acid superpathway, and specifically the histidine subpathway, were negatively correlated with mosquito attraction. Conversely, molecules within the lipid metabolism superpathway, specifically long chain fatty acids and monoacylglycerols, were positively correlated with mosquito attraction. This is the first study to correlate human blood metabolomic components with selective attraction of mosquitoes to hosts. Our work establishes a framework to study the causality of these correlates, and determine the mechanisms underlying their effect on mosquito choice.

    Integrated Proteomic and Genomic Analysis of Gastric Cancer Patient Tissues

    V-erb-b2 erythroblastic leukemia viral oncogene homologue 2, known as ERBB2, is an important oncogene in the development of certain cancers. It can form a heterodimer with other epidermal growth factor receptor family members and activate kinase-mediated downstream signaling pathways. The ERBB2 gene is located on chromosome 17 and is amplified in a subset of cancers, such as breast, gastric, and colon cancer. Of particular interest to the Chromosome-Centric Human Proteome Project (C-HPP) initiative is the amplification mechanism that typically results in overexpression of a set of genes adjacent to ERBB2, which provides evidence of a linkage between gene location and expression. In this report we studied ERBB2-positive patient tumor samples together with adjacent control nontumor tissues. In addition, non-ERBB2-expressing patient samples were selected as a comparison to study the effect of expression of this oncogene. We detected 196 proteins in ERBB2-positive patient tumor samples that had minimal overlap (29 proteins) with the non-ERBB2 tumor samples. Interaction and pathway analysis identified the extracellular signal regulated kinase (ERK) cascade and actin polymerization and actin-myosin assembly contraction as pathways of importance in ERBB2+ and ERBB2- gastric cancer samples, respectively. The raw data files are deposited at ProteomeXchange (identifier: PXD002674) as well as GPMDB.

    Global analysis of HBV-mediated changes to the primary hepatocyte transcriptome and metabolome

    Chronic infection with hepatitis B virus (HBV) remains a significant health concern, with between 350 and 500 million people chronically infected worldwide. Approximately 25% of chronically infected individuals will go on to develop HBV-associated hepatocellular carcinoma (HCC), making chronic infection with HBV the leading risk factor for developing HCC. With the high incidence and mortality of HCC, it is important to fully understand the mechanisms that lead to the development of HBV-associated HCC. HBV requires a complex network of host-virus interactions to meet its requirements for successful replication. Each of these host-virus interactions, over a decades-long chronic infection, could significantly alter the physiology of an infected hepatocyte, the target of HBV infection, and ultimately contribute to the oncogenic potential of HBV. Typically, studies of these host-virus interactions have focused on a single factor or pathway to characterize contributions to HBV replication or HBV-associated disease, but these studies have generally not considered hepatocyte physiology as a whole. We hypothesized that using broad, transcriptomic- and metabolomic-based technologies would allow us to establish a better understanding of the complexity of the host-virus interaction mediated by HBV and how an HBV infection affects overall hepatocyte physiology. To achieve this, we utilized an ex-vivo primary rat hepatocyte model and defined transcriptome-wide, HBV-mediated changes to gene expression. We also utilized metabolomic profiling to assess the impact of HBV, and the HBV X protein (HBx), on overall hepatocyte metabolism and correlated these changes to HBV-mediated changes in gene expression. Using this approach, we identified significant alterations of many genes and pathways central to hepatocyte physiology, including cell cycle regulation, lipid metabolism, and energy metabolism.
Our results simultaneously identified multiple HBV-mediated changes to hepatocyte physiology, increasing our understanding of the complex relationship between HBV and an infected hepatocyte. Additionally, the identification of HBV-regulated genes will serve as the basis for future studies in understanding the physiological impact of an HBV infection. Together, these findings will allow a better understanding of HBV-mediated effects on hepatocyte physiology that could ultimately contribute to the development of HBV-associated disease, potentially guiding the generation of novel therapeutics and strategies to prevent HBV-associated HCC. Ph.D., Microbiology and Immunology -- Drexel University, 201

    DNA POLYMERASE θ (POLQ) AND THE CELLULAR DEFENSE AGAINST DNA DAMAGE

    In mammalian cells, DNA polymerase θ (POLQ) is an unusual specialized DNA polymerase whose in vivo function is under active investigation. The protein is comprised of an N-terminal helicase-like domain, a C-terminal DNA polymerase domain, and a large central domain that spans between the two. This arrangement is also found in the Drosophila Mus308 protein, which helps confer resistance to DNA interstrand crosslinking agents. Homologs of POLQ and Mus308 are found in eukaryotes, including plants, but a comparison of phenotypes suggests that not all of these genes are functional orthologs. Flies with defective Mus308 are sensitive to DNA interstrand crosslinking agents, while mammalian cells with defective POLQ are primarily sensitive to DNA double-strand breaking agents. Cells from Polq-null mice are hypersensitive to radiation and the peripheral blood cells of these mice display increased spontaneous and ionizing radiation-induced levels of micronuclei (a hallmark of gross chromosomal aberrations), though the mice apparently develop normally. Although a defect in the DNA polymerase POLQ leads to ionizing radiation sensitivity in mammalian cells, the relevant enzymatic pathway has not been identified. Here we define the specific mechanism by which POLQ restricts harmful DNA instability. Our experiments show that Polq-null murine cells are selectively hypersensitive to DNA strand-breaking agents, and that damage resistance requires the DNA polymerase activity of POLQ. Using a DNA break end joining assay in cells, the repair of DNA ends with long 3′ single-stranded overhangs was monitored. End joining events that retained much of the overhang were dependent on POLQ, and independent of Ku70. To analyze this repair function in more detail, immunoglobulin class switch joining between DNA segments in antibody genes was examined. 
POLQ participates in the end joining of a DNA break during immunoglobulin class-switching, producing insertions of base pairs at the joins with homology to IgH switch-region sequences. Biochemical experiments with purified human POLQ protein revealed the mechanism generating the insertions during DNA end joining, relying on the unique ability of POLQ to extend DNA from minimally paired primers. DNA breaks at the IgH locus can sometimes join with breaks in Myc, creating a chromosome translocation. A marked increase in Myc/IgH translocations was observed in Polq-defective mice, showing that POLQ suppresses genomic instability and genome rearrangements originating at DNA double-strand breaks. This work clearly defines a role and mechanism for mammalian POLQ in an alternative end joining pathway (termed synthesis-dependent end joining) that suppresses the formation of chromosomal translocations. Our findings depart from the prevailing view that alternative end joining processes are generically translocation-prone. Class switch and junction analysis was also performed in mice lacking POLN, another DNA polymerase related to POLQ. I observed that POLN does not operate in the same alternative end joining pathway as does POLQ. Loss of Poln does not enhance the DNA damage hypersensitivity seen in cells lacking Polq. These findings suggest that while these two polymerases are structurally related, they appear to have distinct functions in the cell. Analysis of the Poln phenotype is still ongoing. Further analysis of POLN and POLQ is required to clarify the mechanism by which they function in the cell.

    Metabolomic Biomarkers of Prostate Cancer: Prediction, Diagnosis, Progression, Prognosis, and Recurrence

    Metabolite profiling is being increasingly employed in the study of prostate cancer as a means of identifying predictive, diagnostic, and prognostic biomarkers. This review provides a summary and critique of the current literature. Thirty-three human case-control studies of prostate cancer exploring disease prediction, diagnosis, progression, or treatment response were identified. All but one demonstrated the ability of metabolite profiling to distinguish cancer from benign, tumor aggressiveness, cases who recurred, and those who responded well to therapy. In the subset of studies where biomarker discriminatory ability was quantified, high AUCs were reported that would potentially outperform the current gold standards in diagnosis, prognosis, and disease recurrence, including PSA testing. There were substantial similarities between the metabolites and the associated pathways reported as significant by independent studies, and important roles for abnormal cell growth, intensive cell proliferation, and dysregulation of lipid metabolism were highlighted. The weight of the evidence therefore suggests metabolic alterations specific to prostate carcinogenesis and progression that may represent potential metabolic biomarkers. However, replication and validation of the most promising biomarkers is currently lacking and a number of outstanding methodologic issues remain to be addressed to maximize the utility of metabolomics in the study of prostate cancer. National Institutes of Health (U.S.) (Grant P01 CA055075); National Institutes of Health (U.S.) (Grant CA133891); National Institutes of Health (U.S.) (Grant CA141298); National Institutes of Health (U.S.) (Grant CA136578); National Institutes of Health (U.S.) (Grant UM1 CA167552)

    Drosophila, metabolomics and insecticide action

    The growing problem of insecticide resistance is jeopardising current pest control strategies and current insecticide development pipelines are failing to provide new alternatives quickly enough. Metabolomics offers a potential solution to the bottleneck in insecticide target discovery. As a proof of concept, metabolomics data for permethrin-exposed Drosophila melanogaster was analysed and interpreted. Changes in the metabolism of amino acids, glycogen, glycolysis, energy, nitrogen, NAD+, purine, pyrimidine, lipids and carnitine were observed along with markers for acidosis, ammonia stress, oxidative stress and detoxification responses. Many of the changed metabolites and pathways had never been linked to permethrin exposure before. A model for the interaction of the observed changes in metabolites was proposed. From the metabolic pathways with the largest changes, candidate genes from tryptophan catabolism were selected to determine if the perturbed pathways had an effect on survival when exposed to permethrin. Using qPCR it was found that all genes in the entire pathway were downregulated by permethrin exposure, with the exception of vermilion, suggesting an active response to try and limit flux through tryptophan catabolism during permethrin exposure. Knockdown of the tryptophan catabolising genes vermilion, cinnabar and CG6950 in Drosophila using whole fly RNAi resulted in changes in susceptibility to permethrin for both topical and oral routes of exposure. Knockdown of the candidate genes also caused changes in susceptibility when the insecticides fenvalerate, DDT, chlorpyriphos and hydramethylnon were orally administered. These results show that tryptophan catabolism knockdown has an effect on surviving insecticides with a broad range in mode of action. Symptoms that occur in Drosophila during exposure to the different insecticides were also noted.
To gain further understanding into the mechanisms affecting survival, tissue-specific knockdown was performed, revealing tissue- and gender-specific changes in survival when vermilion, cinnabar and CG6950 are knocked down. Metabolomics was performed on the knockdown strains to determine the efficacy of the knockdowns on tryptophan catabolism and to identify any knock-on effects. The results indicate that tryptophan-metabolite-induced perturbations to energy metabolism and glycosylation also occur in Drosophila along with apparent changes in the absorption of ectometabolites. As the knockdown of vermilion, cinnabar and CG6950 tended to result in reduced susceptibility to insecticides, they would make poor targets for insecticidal compounds; however, they may be the first examples of genes that are not directly involved in insecticide metabolism or cuticle synthesis that increase insecticide tolerance in Drosophila. As the first metabolomics data set showed evidence for oxidative stress during permethrin exposure, preliminary work was begun for identifying the tissue specificity and timing of oxidative stress in both Dipterans and Lepidopterans using Drosophila and Bombyx mori as models. In Drosophila oxidative stress did not begin immediately, suggesting that the insecticide itself is not a cause; however, a rapid increase in oxidative stress occurred over a six hour period after a day of oral exposure, implicating catabolites of permethrin. Bombyx were highly susceptible to permethrin, showing oxidative stress in the Malpighian tubule and silk gland when exposed. This study has shown that metabolomics is highly effective at identifying pathways which modulate survival to insecticide exposure. It has also brought insight into how insecticide-induced pathology may cause death. Data has also been generated which could help characterize the putative transaminase CG6950.

    Understanding pathways

    The challenge with today's microarray experiments is to infer biological conclusions from them. There are two crucial difficulties to be surmounted in this challenge: (1) A lack of a suitable biological repository that can be easily integrated into computational algorithms. (2) Contemporary algorithms used to analyze microarray data are unable to draw consistent biological results from diverse datasets of the same disease. To deal with the first difficulty, we believe a core database that unifies available biological repositories is important. Towards this end, we create a unified biological database from three popular biological repositories (KEGG, Ingenuity and WikiPathways). This database provides computer scientists the flexibility of easily integrating biological information using simple API calls or SQL queries. To deal with the second difficulty of deriving consistent biological results from the experiments, we first conceptualize the notion of “subnetworks”, which refers to a connected portion in a biological pathway. Then we propose a method that identifies subnetworks that are consistently expressed by patients of the same disease phenotype. We test our technique on independent datasets of several diseases, including ALL, DMD and lung cancer. For each of these diseases, we obtain two independent microarray datasets produced by distinct labs on distinct platforms. In each case, our technique consistently produces overlapping lists of significant nontrivial subnetworks from two independent sets of microarray data. The gene-level agreement of these significant subnetworks is between 66.67% and 91.87%. In contrast, when the same pairs of microarray datasets were analysed using GSEA and t-test, this percentage fell between 37% and 55.75% (GSEA) and between 2.55% and 19.23% (t-test). Furthermore, the genes selected using GSEA and t-test do not form subnetworks of substantial size.
Thus it is more probable that the subnetworks selected by our technique can provide the researcher with more descriptive information on the portions of the pathway that are actually associated with the disease. Keywords: pathway analysis, microarray

    Evolution and genetics of antiviral immunity in Drosophila

    Virus-host interactions determine virus transmissibility and virulence, and underlie coevolution that shapes interesting biological phenomena such as the genetic architecture of host resistance and host range. Characterization of the virus factors that exert selective pressure on the host, and the host genes which underlie resistance and adaptation against viruses will help to define the mechanistic pathways embroiled in host-virus coevolution. In this thesis, I describe the viral causes and host consequences of host-virus coevolution. These include genomic signatures consistent with antagonistic coevolution in antiviral RNA interference pathway genes such as high rates of positive selection and polymorphism, loci that underlie genetic variation in resistance to virus infection, and apparent conflict between NF-κB signalling and DNA virus infection. The RNA interference (RNAi) pathway is the most general innate immune pathway in insects, underlined by the observation that many viruses encode suppressors of RNAi (VSRs). The relationship between RNAi and VSRs has garnered attention as a plausible battleground for host-virus antagonistic coevolution, and genomic patterns in Drosophila support this hypothesis. However, genomic patterns in the N-terminal domain of the key RNAi effector gene, Argonaute-2, have not been described. In Chapter 2, I sequence the Argonaute-2 N-terminal domain using PacBio long-read sequencing technology to describe variation within and across Drosophila species, and test whether this variation is associated with resistance to Drosophila C Virus. The RNAi pathway evolves adaptively in Drosophila, but this has not been formally extended across invertebrate species. In Chapter 3, I quantify rates of adaptive protein evolution and describe evidence for selective sweeps in RNAi pathway genes using population genomic data from 8 insect and nematode species. 
These analyses indicate that RNAi genes involved in suppression of transposable elements and defence against viruses evolve rapidly across invertebrates, and I identify genes with signatures of elevated adaptation in multiple insect species. Host genes that underlie host-virus interactions have been described in RNA virus infection of Drosophila; however, substantially less attention has focussed on the host response to DNA viruses, primarily because no DNA viruses have been isolated from Drosophila. In Chapter 4, I describe the isolation of Kallithea virus, a Drosophila dsDNA nudivirus, and characterise the host response to infection and genetic variation in resistance. I find that Kallithea virus infection causes early male-specific lethality, a cessation of oogenesis, and induction of undescribed virus-responsive genes. Further, I describe genetic variation in resistance and tolerance to Kallithea virus infection, and identify a potential causal variant for virus-induced mortality in Cip4. Insect viruses commonly encode viral suppressors of RNAi; however, there are a multitude of antiviral immune mechanisms besides RNAi which may select for viral-encoded inhibitors. In Chapter 5, I describe the requirement for RNAi and NF-κB in immunity against Kallithea virus, and map gp83 as a virus-encoded inhibitor of NF-κB signalling. I find that gp83 inhibits Toll signalling at the level of, or downstream of, NF-κB transcription factors, and that this immunosuppressive function is conserved in other nudiviruses.