2,477 research outputs found

    High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses

    Background: Microarray experiments are becoming increasingly common in biomedical research, as is their deposition in publicly accessible repositories such as Gene Expression Omnibus (GEO). This has led to a surge of interest in using these microarray data for meta-analyses, whether to increase sample size for a more powerful analysis of a specific disease (e.g. lung cancer) or to re-examine experiments for purposes other than those of the original study that generated them. For the average biomedical researcher, there are a number of practical barriers to conducting such meta-analyses, such as manually aggregating, filtering and formatting the data. Methods that automatically process large repositories of microarray data into a standardized, directly comparable format will enable easier and more reliable access to microarray data for meta-analyses.
    Methods: We present a straightforward method, simple but robust against potential outliers, for automatic quality control and pre-processing of tens of thousands of single-channel microarray data files. GEO GDS files are quality checked by comparing parametric distributions and then quantile normalized to enable direct comparison of expression levels in subsequent meta-analyses.
    Results: 13,000 human 1-color experiments were processed to create a single gene expression matrix from which subsets can be extracted to conduct meta-analyses. Interestingly, we found that in a global meta-analysis of gene-gene co-expression patterns across all 13,000 experiments to predict gene function, normalization yielded minimal improvement over using the raw data.
    Conclusions: Normalization of microarray data appears to be of minimal importance for analyses based on co-expression patterns when the sample size is on the order of thousands of microarray datasets. Smaller subsets, however, are more prone to aberrations and artefacts, and effective means of automating normalization procedures not only empower meta-analytic approaches but also aid reproducibility by providing a standard way of approaching the problem.
    Data availability: A matrix containing normalized expression of 20,813 genes across 13,000 experiments is available for download at . Source code for GDS file pre-processing is available from the authors upon request.
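
The quantile normalization step named in this abstract can be illustrated with a minimal sketch. This is not the authors' code (which the abstract says is available only on request); the function name `quantile_normalize`, the toy values, and the tie-handling choice are assumptions made for illustration. Each sample is sorted, the values at each rank are averaged across samples, and every sample then receives those rank means in its original gene order.

```python
from statistics import mean


def quantile_normalize(columns):
    """Quantile-normalize samples given as equal-length lists (one per array).

    Hypothetical helper for illustration; ties are broken by gene order
    rather than averaged, which is a simplifying assumption.
    """
    n_genes = len(columns[0])
    if any(len(col) != n_genes for col in columns):
        raise ValueError("all samples must have the same number of genes")

    # Sort each sample, then average across samples at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [mean(col[i] for col in sorted_cols) for i in range(n_genes)]

    normalized = []
    for col in columns:
        # Gene indices ordered by increasing expression in this sample.
        order = sorted(range(n_genes), key=lambda g: col[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy data: three samples (arrays) measured over four genes.
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.0, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(samples)
```

After normalization every sample has the same set of values, so expression levels are directly comparable across arrays, while the ranking of genes within each sample is preserved.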

    Integrated genomics and proteomics define huntingtin CAG length-dependent networks in mice.

    To gain insight into how mutant huntingtin (mHtt) CAG repeat length modifies Huntington's disease (HD) pathogenesis, we profiled mRNA in over 600 brain and peripheral tissue samples from HD knock-in mice with increasing CAG repeat lengths. We found repeat length-dependent transcriptional signatures to be prominent in the striatum, less so in cortex, and minimal in the liver. Coexpression network analyses revealed 13 striatal and 5 cortical modules that correlated highly with CAG length and age, and that were preserved in HD models and sometimes in patients. Top striatal modules implicated mHtt CAG length and age in graded impairment of the expression of identity genes for striatal medium spiny neurons and in dysregulation of cyclic AMP signaling, cell death and protocadherin genes. We used proteomics to confirm 790 genes and 5 striatal modules with CAG length-dependent dysregulation at the protein level, and validated 22 striatal module genes as modifiers of mHtt toxicities in vivo.

    Gene expression profiling in acute myeloid leukemia


    Gene Expression Commons: an open platform for absolute gene expression profiling.

    Gene expression profiling using microarrays has been limited to comparisons of gene expression between small numbers of samples within individual experiments. However, the unknown and variable sensitivities of each probeset have rendered the absolute expression of any given gene nearly impossible to estimate. We have overcome this limitation by using a very large number (>10,000) of varied microarray data as a common reference, so that statistical attributes of each probeset, such as the dynamic range and the threshold between low and high expression, can be reliably discovered through meta-analysis. This strategy is implemented in a web-based platform named "Gene Expression Commons" (https://gexc.stanford.edu/), which contains data from 39 distinct, highly purified mouse hematopoietic stem/progenitor/differentiated cell populations covering almost the entire hematopoietic system. Since the Gene Expression Commons is designed as an open platform, investigators can explore the expression level of any gene, search by expression patterns of interest, submit their own microarray data, and design their own working models representing biological relationships among samples.

    Meta-analysis of muscle transcriptome data using the MADMuscle database reveals biologically relevant gene patterns

    Background: DNA microarray technology has had a great impact on muscle research, and microarray gene expression data have been widely used to identify gene signatures characteristic of the studied conditions. With the rapid accumulation of muscle microarray data, it is of great interest to understand how to compare and combine data across multiple studies. Meta-analysis of transcriptome data is a valuable way to achieve this: it makes it possible to highlight gene signatures conserved across multiple independent studies. However, its use is complicated by the diversity of the available data: different microarray platforms, different gene nomenclatures, different species studied, etc.
    Description: We have developed a system dedicated to muscle transcriptome data. It comprises a collection of microarray data and a query tool. The latter allows the user to extract similar clusters of co-expressed genes from the database using an input gene list, so that common and relevant gene signatures can be searched more easily. The dedicated database consists of a large compendium of public data (more than 500 data sets) related to muscle (skeletal and heart). These studies cover seven animal species, from invertebrates (Drosophila melanogaster, Caenorhabditis elegans) to vertebrates (Homo sapiens, Mus musculus, Rattus norvegicus, Canis familiaris, Gallus gallus). After a renormalization step, clusters of co-expressed genes were identified in each dataset. The lists of co-expressed genes were annotated using a unified re-annotation procedure. These gene lists were then compared to find significant overlaps between studies.
    Conclusions: Applied to this large compendium of data sets, meta-analyses demonstrated that patterns conserved between species could be identified. Focusing on a specific pathology (Duchenne Muscular Dystrophy), we validated results across independent studies and revealed robust biomarkers and new pathways of interest. The meta-analyses performed with MADMuscle show the usefulness of this approach. Our method can be applied to all public transcriptome data.
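
The abstract does not say which statistic was used to call an overlap between gene lists "significant". A common choice, assumed here purely for illustration, is a one-sided hypergeometric test. The sketch below uses a hypothetical helper `overlap_pvalue` and placeholder gene labels, and computes the probability of seeing at least the observed overlap if the second list had been drawn at random from the gene universe.

```python
from math import comb


def overlap_pvalue(list_a, list_b, universe_size):
    """Return (overlap size, P(overlap >= observed)) under a hypergeometric null.

    Hypothetical helper: assumes both lists are drawn from one shared
    universe of `universe_size` genes with consistent identifiers.
    """
    a, b = set(list_a), set(list_b)
    k = len(a & b)
    total = comb(universe_size, len(b))
    upper = min(len(a), len(b))
    # Sum the upper tail: i shared genes out of len(b) drawn.
    tail = sum(
        comb(len(a), i) * comb(universe_size - len(a), len(b) - i)
        for i in range(k, upper + 1)
    )
    return k, tail / total


# Placeholder gene labels, not taken from MADMuscle.
cluster_study_1 = ["GENE1", "GENE2", "GENE3"]
cluster_study_2 = ["GENE1", "GENE2", "GENE3"]
k, p = overlap_pvalue(cluster_study_1, cluster_study_2, universe_size=10)
```

In practice the universe should be the set of genes measurable on both platforms after re-annotation, and p-values from many pairwise comparisons would need multiple-testing correction.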

    Genome-Wide Decoding of mRNP and miRNA Maps

    The limited number of primary transcripts in the genome has promoted interest in the possibility that much of the complexity in the regulation of gene expression may be determined by RNA regulation controlled by RNA-binding proteins (RNABPs) and/or microRNAs (miRNAs). However, applying biochemical methods to understand such interactions in living tissues is a major challenge. Here we developed a genome-wide means of mapping messenger ribonucleoprotein (mRNP) sites in vivo, by high-throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP). HITS-CLIP analysis of the neuron-specific splicing factor Nova provides genome-wide maps of Nova-RNA interactions in vivo and leads to the new finding that Nova may regulate the processing of some miRNAs. Furthermore, HITS-CLIP analysis is extended to the problem of identifying miRNA targets, for which prediction is a major challenge since miRNA activity requires base pairing through only 6-8 "seed" nucleotides. By crosslinking native Argonaute (Ago) protein-RNA complexes in mouse brain, Ago HITS-CLIP produced two simultaneous datasets (Ago-miRNA and Ago-mRNA binding sites) that were combined with bioinformatic analysis to identify miRNA-target mRNA interaction sites. We validated genome-wide interaction maps for miR-124, and generated additional maps for the 20 most abundant miRNAs present in P13 mouse brain. We also found that a relatively large number of Ago proteins bind in coding sequence, as well as in introns, suggesting unexplored functions for miRNAs. Not all Ago mRNA clusters correspond to known seed sequences, leading to the discovery of putative new rules for miRNA-mRNA interactions. HITS-CLIP provides a general platform to identify functional mRNP and miRNA binding sites in vivo and a solution to determining precise sequences for targeting clinically relevant sites of RNA regulation. In addition, overlaying mRNP maps with miRNA maps will be informative for understanding RNA regulation and its complexity.
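
To show why prediction from seed pairing alone is so permissive, here is a minimal sketch of naive seed-match scanning. It is not the HITS-CLIP analysis described above; the helper names and both RNA sequences are invented toy values. The idea is to take miRNA positions 2-8, reverse-complement them, and look for that short motif in an mRNA.

```python
# RNA complement table (Watson-Crick pairs only; G:U wobble is ignored).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=2, end=8):
    """Return the mRNA motif complementary to miRNA positions start..end (1-based)."""
    seed = mirna[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mrna, mirna, start=2, end=8):
    """Return 0-based start positions of every seed-complementary site in mrna."""
    site = seed_site(mirna, start, end)
    hits = []
    pos = mrna.find(site)
    while pos != -1:
        hits.append(pos)
        pos = mrna.find(site, pos + 1)  # step by one so overlapping sites are found
    return hits


# Toy sequences for illustration only; neither is a real miRNA or transcript.
toy_mirna = "UACGUACCGGAUUCAGGCUAA"
toy_mrna = "AAGGUACGUCCCGGUACGUAA"
site = seed_site(toy_mirna)
hits = find_seed_matches(toy_mrna, toy_mirna)
```

Because a 7-nucleotide motif occurs often by chance in long transcripts, a scan like this over-predicts targets, which is the gap that experimentally mapped Ago binding sites are meant to narrow.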

    Transcriptional Profiling of Organ-Specific Autoimmunity in Human

    Our understanding of the pathogenesis of organ-specific autoinflammation has been restricted by limited access to the target organs. Peripheral blood, however, as a preferred transportation route for immune cells, provides a window to assess the entire immune system throughout the body. Transcriptional profiling with RNA-stabilizing blood collection tubes reflects in vivo expression profiles at the time the blood is drawn, allowing detection of disease activity in different samples or within the same sample over time. The main objective of this Ph.D. study was to apply gene-expression microarrays to the characterization of peripheral blood transcriptional profiles in patients with autoimmune diseases. To achieve this goal, a custom cDNA microarray targeted at gene-expression profiling of the human immune system was designed and produced. Sample collection and preparation were then optimized to allow gene-expression profiling from whole-blood samples. To overcome challenges resulting from minute amounts of sample material, RNA amplification was successfully applied to study pregnancy-related immunosuppression in patients with multiple sclerosis (MS). Furthermore, similar sample preparation was applied to characterize longitudinal genome-wide expression profiles in children with type 1 diabetes (T1D) associated autoantibodies and eventually clinical T1D. Blood transcriptome analyses, using both the ImmunoChip cDNA microarray with targeted probe selection and the genome-wide Affymetrix U133 Plus 2.0 oligonucleotide array, enabled monitoring of autoimmune activity. Novel disease-related genes and general autoimmune signatures were identified. Notably, down-regulation of the HLA class Ib molecules in peripheral blood was associated with disease activity in both MS and T1D. Taken together, these studies demonstrate the potential of peripheral blood transcriptional profiling in biomedical research and diagnostics. Imbalances in peripheral blood transcriptional activity may reveal dynamic changes that are relevant for the disease but might be completely missed in conventional cross-sectional studies.
    Finnish abstract (translated): Gene expression in human tissue-specific autoimmune diseases. Limited access to target tissues has restricted research on tissue-specific autoimmune diseases. The immune system can, however, also be examined in the patient's blood, which serves as the main transport route for immune cells. Using collection tubes designed specifically to preserve RNA molecules, gene expression in the body at the time of sampling can be examined and immune system activity thereby monitored. The aim of this doctoral thesis was to use DNA microarrays to examine gene expression in patients' blood as immune system activity changes. For this purpose, a cDNA microarray containing key immune system genes was designed and produced, and it was used to examine the pregnancy-induced weakening of the immune response in MS patients. Blood sample collection and RNA isolation methods were optimized for the study, and because the amounts of RNA in the blood samples were small, the isolated RNA was amplified before analysis on DNA microarrays. The same sample-handling method was also used when collecting sample series from children in whom autoantibodies associated with type 1 diabetes had already been detected. Sample series from children who later developed type 1 diabetes were analyzed with a commercial genome-wide array. In addition to genes previously linked to autoimmunity, the studies yielded new findings with both the custom-designed and custom-made ImmunoChip cDNA microarray and the genome-wide Affymetrix U133 Plus 2.0 oligonucleotide array. Particularly notable was the silencing of HLA class Ib genes as disease activity increased in both MS and type 1 diabetes. The thesis studies showed that immune system activity can be monitored through genes expressed in patients' blood samples, and that examining genes expressed in blood cells can be used in biomedical research and diagnostics. In addition, following gene expression in consecutive samples from the same patient may reveal functional changes that could go entirely unnoticed in a conventional cross-sectional study.
