
    Gene Expression : From Microarrays to Functional Genomics

    The era of large sequencing projects has opened unprecedented possibilities for investigating more complex aspects of living organisms. Among the high-throughput technologies based on genomic sequences, DNA microarrays are widely used for many purposes, including measuring the relative quantity of messenger RNAs. However, the reliability of microarrays has been strongly doubted, because robust methods for analyzing the complex microarray output were developed only after the technology had already spread through the community. One objective of this study was to increase the performance of microarrays, as measured by the successful validation of the results with independent techniques. To this end, emphasis has been given to the possibility of selecting candidate genes with remarkable biological significance within a specific experimental design. In line with literature evidence, the re-annotation of the probes and model-based normalization algorithms were found to be beneficial when analyzing Affymetrix GeneChip data. Typically, the analysis of microarrays aims at selecting genes whose expression differs significantly between conditions and then grouping them into functional categories, enabling a biological interpretation of the results. Another approach investigates the global differences in the expression of functionally related groups of genes. Here, this technique has been effective in discovering patterns related to temporal changes during infection of human cells. Another aspect explored in this thesis is the possibility of combining independent gene expression data to create a catalog of genes that are selectively expressed in healthy human tissues. Not all the genes present in human cells are active; some involved in basic activities (named housekeeping genes) are expressed ubiquitously. Other genes (named tissue-selective genes) provide more specific functions and are expressed preferentially in certain cell types or tissues. Defining the tissue-selective genes is also important because these genes can cause disease with a phenotype in the tissues where they are expressed. The hypothesis that gene expression could be used as a measure of the relatedness of tissues has also been proved. Microarray experiments provide long lists of candidate genes that are often difficult to interpret and prioritize. Extending the power of microarray results is possible by inferring the relationships between genes under certain conditions. Gene transcription is constantly regulated by the coordinated binding of proteins, named transcription factors, to specific portions of a gene's promoter sequence. In this study, the analysis of promoters from groups of candidate genes has been used for predicting gene networks and highlighting modules of transcription factors that play a central role in the regulation of their transcription. Specific modules have been found regulating the expression of genes selectively expressed in the hippocampus, an area of the brain with a central role in major depressive disorder. Similarly, gene networks derived from microarray results have elucidated aspects of the development of the mesencephalon, another region of the brain, which is involved in Parkinson's disease.

    Preprocessing differential methylation hybridization microarray data

    Background: DNA methylation plays a very important role in the silencing of tumor suppressor genes in various tumor types. To gain a genome-wide understanding of how changes in methylation affect tumor growth, the differential methylation hybridization (DMH) protocol has been developed and large amounts of DMH microarray data have been generated. However, it is still unclear how to preprocess this type of microarray data and how the background correction and normalization methods used for two-color gene expression arrays perform on methylation microarray data. In this paper, we report a set of internal control probes whose log ratios (M) are theoretically equal to zero under this DMH protocol. With the aid of this set of control probes, we propose two LOESS (or LOWESS, locally weighted scatter-plot smoothing) normalization methods that are novel and specific to DMH microarray data. Together with two other options (global LOESS and no normalization), this gives four normalization methods to compare. In addition, we compare five different background correction methods.
    Results: We study 20 different preprocessing methods, which are the combinations of five background correction methods and four normalization methods. To compare these 20 methods, we evaluate their performance in identifying known methylated and unmethylated housekeeping genes based on two statistics. Comparison details are illustrated using breast cancer cell line and ovarian cancer patient methylation microarray data. Our comparison results show that the different background correction methods perform similarly; however, the four normalization methods perform very differently. In particular, all three LOESS normalization methods perform better than no normalization.
    Conclusions: It is necessary to do within-array normalization, and the two LOESS normalization methods based on specific DMH internal control probes produce more stable and relatively better results than the global LOESS normalization method.
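
The idea of normalizing against control probes whose log ratio M should be zero can be illustrated with a minimal sketch. The abstract does not specify how its two control-probe LOESS variants are built, so the code below is only an assumed, simplified version: it fits a local linear smoother (tricube weights) of M against average intensity A on the control probes alone and subtracts the fitted curve from every probe. The function names, the `frac` parameter, and the example values are hypothetical.

```python
def tricube(u):
    """Tricube kernel weight for a scaled distance u (zero outside [0, 1))."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_at(x0, xs, ys, frac=0.5):
    """Locally weighted linear fit evaluated at x0 (simplified LOESS, no robustness step)."""
    n = len(xs)
    k = min(n, max(3, int(round(frac * n))))  # number of neighbours in the window
    # Bandwidth: distance to the k-th nearest point, widened slightly so that
    # all k neighbours receive a positive weight.
    h = sorted(abs(x - x0) for x in xs)[k - 1] * 1.0001 + 1e-12
    w = [tricube(abs(x - x0) / h) for x in xs]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    denom = sw * swxx - swx * swx
    if abs(denom) < 1e-12:  # degenerate window: fall back to a weighted mean
        return swy / sw
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return intercept + slope * x0


def normalize_with_controls(a_all, m_all, a_ctrl, m_ctrl, frac=0.5):
    """Subtract the control-probe trend (expected M = 0) from every probe's M value."""
    return [m - loess_at(a, a_ctrl, m_ctrl, frac) for a, m in zip(a_all, m_all)]


if __name__ == "__main__":
    # Hypothetical control probes showing an intensity-dependent dye bias.
    a_ctrl = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0]
    m_ctrl = [0.5 + 0.1 * a for a in a_ctrl]
    print(normalize_with_controls([8.5, 12.0], [2.35, -0.3], a_ctrl, m_ctrl))
```

A global LOESS would fit the same smoother on all probes instead of the controls; the sketch only differs in which points define the trend.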

    Repeated Small Perturbation Approach Reveals Transcriptomic Steady States

    The study of the dynamics of biological systems requires elucidation of the transitions between steady states. A “small perturbation” approach can provide important information on the “steady state” of a biological system. In our experiments, small perturbations were generated by applying a series of repeated small doses of ultraviolet radiation to a human keratinocyte cell line, HaCaT. The biological response was assessed by monitoring the gene expression profiles using cDNA microarrays. Repeated small doses (10 J/m²) of ultraviolet B (UVB) exposure modulated the expression profiles of two groups of genes in opposite directions. The genes that were up-regulated have functions mainly associated with anti-proliferation/anti-mitogenesis/apoptosis, and the genes that were down-regulated were mainly related to proliferation/mitogenesis/anti-apoptosis. For both groups of genes, repetition of the small doses of UVB caused an immediate response followed by relaxation between successive small perturbations. This cyclic pattern was suppressed when large doses (233 or 582.5 J/m²) of UVB were applied. Our method and results contribute to a foundation for computational systems biology, which implicitly uses the concept of steady state.

    Spot Detection and Image Segmentation in DNA Microarray Data

    Following the invention of microarrays in 1994, the development and applications of this technology have grown exponentially. The numerous applications of microarray technology include clinical diagnosis and treatment, drug design and discovery, tumour detection, and environmental health research. One of the key issues in experimental approaches utilising microarrays is extracting quantitative information from the spots, which represent genes in a given experiment. The initial stages of this process are important because they influence later steps in the analysis. Identifying the spots and separating the background from the foreground is a fundamental problem in DNA microarray data analysis. In this review, we present an overview of state-of-the-art methods for microarray image segmentation. We discuss the foundations of the circle-shaped approach, adaptive shape segmentation, histogram-based methods and the recently introduced clustering-based techniques. We analytically show that clustering-based techniques are equivalent to the one-dimensional, standard k-means clustering algorithm that utilises the Euclidean distance.
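
As an illustration of the one-dimensional k-means view of spot segmentation mentioned in this abstract, the following minimal sketch clusters pixel intensities into background and foreground. It is not taken from the review: the function name, the initialisation (centres spread evenly between the minimum and maximum intensity), and the sample pixel values are assumptions made for the example. In one dimension the Euclidean distance reduces to an absolute difference.

```python
def kmeans_1d(values, k=2, max_iter=100):
    """Standard k-means on scalar values; returns (centres, labels)."""
    if k < 2:
        raise ValueError("k must be at least 2")
    lo, hi = min(values), max(values)
    # Assumed initialisation: centres evenly spaced over the intensity range.
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def nearest(v):
        # Index of the closest centre (1-D Euclidean distance = absolute difference).
        return min(range(k), key=lambda j: abs(v - centres[j]))

    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[nearest(v)].append(v)
        # Update step: each centre moves to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        updated = [sum(c) / len(c) if c else centres[j] for j, c in enumerate(clusters)]
        if updated == centres:
            break
        centres = updated

    labels = [nearest(v) for v in values]
    return centres, labels


if __name__ == "__main__":
    # Hypothetical pixel intensities from a window around one spot.
    pixels = [10, 12, 11, 200, 210, 205, 9, 198]
    centres, labels = kmeans_1d(pixels)
    # With this initialisation, label 0 is the dimmer (background) cluster.
    print(centres, labels)
```

With two clusters the result is effectively an intensity threshold halfway between the two centres, which is why such methods ignore the spatial position of the pixels.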

    Genome-wide gene expression surveys and a transcriptome map in chicken

    The chicken (Gallus gallus) is an important model organism in genetics, developmental biology, immunology, evolutionary research, and agricultural science. The completion of the draft chicken genome sequence provided new possibilities to study genomic changes during evolution by comparing the chicken genome to that of other species. The development of long oligonucleotide microarrays based on the genome sequence made it possible to survey genome-wide gene expression in chicken. This thesis describes two gene expression surveys across a range of healthy chicken tissues in both adult and embryonic stages. Specifically, we focus on the mechanisms of regulation of gene transcription and their evolution in the vertebrate genome. Chapter 1 provides a brief history of the chicken as a model organism in biological and genomics research. In particular, a brief overview of expression profiling experiments is presented, followed by an introduction to gene transcription regulation in general. Finally, the aim and outline of this thesis are presented. An important aim of this thesis is to generate surveys of genome-wide gene expression data in chicken using microarrays. In chapter 2, we introduce microarray data normalization, including background correction, within-array normalization and between-array normalization. Based on these results, an analysis approach is recommended for the two-color microarray data produced in the experiments described in this thesis. We also briefly explain the relevant methodology for identifying differentially expressed genes and for translating the resulting gene lists into biological knowledge. Finally, specific issues related to updating microarray probe annotation in farm animals are discussed. For the analysis of the microarray data in this thesis, re-annotation of the probes on the chicken 20K oligoarray was done using the oligoRAP analysis pipeline. 
The vast amount of data generated from a single transcriptomics study makes it impossible to extract meaningful biological knowledge by manually going through individual genes from a list of hundreds or thousands of differentially expressed genes. In chapter 3, we present a practical approach using a collection of R/Bioconductor packages to extract biological knowledge from a microarray experiment in farm animals. Furthermore, a locally adaptive statistical procedure (LAP) analysis approach is used to identify differentially expressed chromosomal regions in a microarray experiment. Chapter 4 presents a genome-wide gene expression survey across eight different tissues (brain, bursa of Fabricius, kidney, liver, lung, small intestine, spleen, and thymus from 10-week-old chickens) in adult birds using a chicken 20K microarray. To a certain extent, most genes show some tissue-specific pattern of expression. Housekeeping and tissue-specific genes are identified based on gene expression patterns across the eight different tissues. The results show that housekeeping genes are more compact, i.e. they are smaller, with shorter coding sequences, shorter introns, and shorter intergenic regions. This observed compactness of housekeeping genes may be a result of selection for economy of transcription during evolution. Furthermore, a comparative analysis of gene expression among mouse, chicken, and frog showed that the expression patterns of orthologous genes are conserved during evolution between mammals, birds, and amphibians. The chicken embryo has been a very popular model for developmental biology. To study the overall gene expression pattern in whole chicken embryos at different developmental stages and/or embryonic tissues, a genome-wide gene expression survey across different developmental and embryonic stages was performed (chapter 5). 
The study included four different developmental stages (HH stage 3, 10, 15, 22) and eight different embryonic tissues (brain, bursa of Fabricius, heart, kidney, liver, lung, small intestine, and spleen from HH stage 36). We were able to identify several embryonic stage- and tissue-specific genes in our analysis. Genomic features of genes widely expressed under these 12 conditions suggest that widely expressed genes are more compact than tissue-specific genes, confirming the findings described in chapter 4. The analysis of the differentially expressed genes during the different developmental stages of the whole embryo indicates a gradual change in gene expression during embryo development. A comparison of the gene expression profiles between the same organs of adults and embryos reveals both striking similarities and differences. The overall goal of this thesis was to improve our understanding of the mechanisms of transcriptional regulation in the chicken. In chapter 6, a transcriptome map for all chicken chromosomes is presented based on the expression data described in chapter 4. The results reveal the presence of two distinct types of chromosomal regions, characterized by clusters of highly or lowly expressed genes respectively. Furthermore, these regions show a high correlation with a number of genome characteristics, such as gene density, gene length, intron length, and GC content. A comparative analysis between the chicken and human transcriptome maps suggests that the regions with clusters of highly expressed genes are relatively conserved between the two genomes. Our results revealed the presence of a higher-order organization of the chicken genome that affects gene expression, confirming similar observations in other species. Finally, in chapter 7 I summarize the main findings and discuss some of the limitations of the analyses described in this thesis. 
I also discuss the merits and shortcomings of studying gene expression using either microarrays or next-generation sequencing technology and propose directions for future research. The rapid developments in new-generation sequencing technology will facilitate better coverage and depth of the chicken genome. This will provide a better genome assembly and an improved genome annotation. The sequence-based approaches for studying gene expression will reduce noise levels compared to hybridization-based approaches. Overall, next-generation sequencing is already providing greatly enhanced tools to further improve our understanding of the chicken transcriptome and its regulation.

    Identifying differentially methylated genes using mixed effect and generalized least square models

    Background: DNA methylation plays an important role in the process of tumorigenesis. Identifying differentially methylated genes or CpG islands (CGIs) associated with genes between two tumor subtypes is thus an important biological question. The methylation status of all CGIs in the whole genome can be assayed with differential methylation hybridization (DMH) microarrays. However, patient samples or cell lines are heterogeneous, so their methylation pattern may be very different. In addition, neighboring probes at each CGI are correlated. How these factors affect the analysis of DMH data is unknown.
    Results: We propose a new method for identifying differentially methylated (DM) genes by identifying the associated DM CGI(s). At each CGI, we implement four different mixed effect and generalized least square models to identify DM genes between two groups. We compare the four models with a simple least square regression model to study the impact of incorporating random effects and correlations.
    Conclusions: We demonstrate that the inclusion (or exclusion) of random effects and the choice of correlation structures can significantly affect the results of the data analysis. We also assess the false discovery rate of different models using CGIs associated with housekeeping genes.

    Intra- and inter-individual genetic differences in gene expression

    Genetic variation is known to influence the amount of mRNA produced by a gene. Given that molecular machines control the mRNA levels of multiple genes, we expect that genetic variation in the components of these machines would influence multiple genes in a similar fashion. In this study we show that this assumption is correct by using the correlation of mRNA levels measured independently in the brain, kidney or liver of multiple, genetically typed mouse strains to detect shared genetic influences. These correlating groups of genes (CGG) have collective properties that account for 40-90% of the variability of their constituent genes and in some cases, but not all, contain genes encoding functionally related proteins. Critically, we show that the genetic influences are essentially tissue specific, and consequently the same genetic variations in one animal may up-regulate a CGG in one tissue but down-regulate the same CGG in a second tissue. We further show similarly paradoxical behaviour of CGGs within the same tissues of different individuals. The implication of this study is that this class of genetic variation can result in complex inter- and intra-individual and tissue differences, and that this will create substantial challenges for the investigation of phenotypic outcomes, particularly in humans, where multiple tissues are not readily available.

    Circadian Clocks Are Resounding in Peripheral Tissues

    Circadian rhythms are prevalent in most organisms. Even the smallest disturbances in the orchestration of circadian gene expression patterns among different tissues can result in functional asynchrony at the organism level and may contribute to a wide range of physiologic disorders. It has been reported that as many as 5%–10% of transcribed genes in peripheral tissues follow a circadian expression pattern. We have conducted a comprehensive study of circadian gene expression on a large dataset representing three different peripheral tissues. The data were produced in a large-scale microarray experiment covering replicate daily cycles in murine white and brown adipose tissues as well as in liver. We have applied three alternative algorithmic approaches to identify circadian oscillation in time series expression profiles. Analyses of our own data indicate that the expression of at least 7% to 21% of active genes in mouse liver and in white and brown adipose tissues follows a daily oscillatory pattern. Indeed, analysis of data from other laboratories suggests that the percentage of genes with an oscillatory pattern may approach 50% in the liver. For the rest of the genes, oscillation appears to be obscured by stochastic noise. Our phase classification and computer simulation studies based on multiple datasets indicate no detectable boundary between oscillating and non-oscillating fractions of genes. We conclude that greater attention should be given to the potential influence of circadian mechanisms on any biological pathway related to metabolism and obesity.
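
The abstract does not name its three algorithms, so the following is only an assumed illustration of one common way to test a time series for a daily rhythm: a single-component cosinor fit, which estimates the mean level, amplitude, and peak time of a 24-hour cosine. The closed-form estimates used here are valid only under the stated assumption that samples are evenly spaced and cover a whole number of periods; the function name and the synthetic profile are hypothetical.

```python
import math


def cosinor_fit(times_h, values, period_h=24.0):
    """Fit y = mesor + amplitude * cos(w * (t - peak_time)) for a fixed period.

    Assumes evenly spaced samples spanning a whole number of periods, so that
    the cosine and sine regressors are orthogonal and the least-squares
    solution reduces to simple averages.
    """
    n = len(values)
    w = 2.0 * math.pi / period_h
    mesor = sum(values) / n
    a = 2.0 / n * sum(v * math.cos(w * t) for t, v in zip(times_h, values))
    b = 2.0 / n * sum(v * math.sin(w * t) for t, v in zip(times_h, values))
    amplitude = math.hypot(a, b)
    # Time of the fitted maximum, in hours within one period.
    peak_time = (math.atan2(b, a) / w) % period_h
    return mesor, amplitude, peak_time


if __name__ == "__main__":
    # Synthetic expression profile: sampled every 4 h over two days, peaking at 6 h.
    times = list(range(0, 48, 4))
    profile = [5.0 + 2.0 * math.cos(2.0 * math.pi * (t - 6.0) / 24.0) for t in times]
    print(cosinor_fit(times, profile))
```

In practice the fitted amplitude would be compared against the residual noise, for example with an F-test or a permutation test, before a gene is called oscillating.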