53 research outputs found

New methods for next generation sequencing based microRNA expression profiling

Author: 't Hoen Peter AC
Ariyurek Yavuz
Buermans Henk PJ
den Dunnen Johan T
van Ommen Gertjan
Publication venue: BioMed Central
Publication date: 01/12/2010

Background: MicroRNAs are small non-coding RNA transcripts that regulate post-transcriptional gene expression. The millions of short sequence reads generated by next generation sequencing technologies make this technique particularly suitable for profiling known and novel microRNAs. A modification to the small-RNA expression kit (SREK, Ambion) library preparation method for the SOLiD sequencing platform is described to generate microRNA sequencing libraries that are compatible with the Illumina Genome Analyzer. Results: High quality sequencing libraries can successfully be prepared from as little as 100 ng of small-RNA-enriched RNA. An easy-to-use Perl-based analysis pipeline called E-miR was developed to handle the sequencing data in several automated steps including data format conversion, 3' adapter removal, genome alignment and annotation to non-coding RNA transcripts. The sample preparation and E-miR pipeline were used to identify 37 cardiac-enriched microRNAs in stage 16 chicken embryos. Isomir expression profiles between the heart and embryo were highly correlated for all miRNAs, suggesting that tissue- or cell-specific miRNA modifications do not occur. Conclusions: Our alternative sample preparation method can successfully be applied to generate high quality miRNA sequencing libraries for the Illumina Genome Analyzer.
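The 3' adapter removal step mentioned in this abstract can be illustrated with a minimal sketch. This is not the E-miR code (which is Perl-based and not shown in this listing); the function name `trim_3prime_adapter`, the `min_overlap` parameter, and the adapter sequence are hypothetical and chosen only for illustration. The sketch uses exact matching; real pipelines usually tolerate mismatches.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 6) -> str:
    """Return the read with a 3' adapter (full or partial at the read end) removed.

    Assumption: exact matching only, no mismatch tolerance.
    """
    # Case 1: the complete adapter occurs in the read; keep everything before it.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends inside the adapter, so only an adapter prefix is present.
    # Try the longest possible prefix first, down to min_overlap bases.
    longest = min(len(adapter), len(read)) - 1
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    # No adapter found: return the read unchanged.
    return read


# Hypothetical placeholder sequences, for illustration only.
ADAPTER = "CTGTAGGCACCATCAAT"
INSERT = "TGAGGTAGTAGGTTGTATAGTT"

full = trim_3prime_adapter(INSERT + ADAPTER + "AAAA", ADAPTER)
partial = trim_3prime_adapter(INSERT + ADAPTER[:10], ADAPTER)
untouched = trim_3prime_adapter("ACGTACGT", ADAPTER)
```

Because a microRNA insert is shorter than a typical read, most reads run into the adapter, which is why trimming has to happen before genome alignment.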

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Leiden University Scholarly Publications

CORE_TF: a user-friendly interface to identify evolutionary conserved transcription factor binding sites in sets of co-regulated genes

Author: 't Hoen Peter AC
den Dunnen Johan T
Hestand Matthew S
van Galen Michiel
van Ommen Gert-Jan B
Villerius Michel P
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: The identification of transcription factor binding sites is difficult since they are only a small number of nucleotides in size, resulting in large numbers of false positives and false negatives in current approaches. Computational methods to reduce false positives are to look for over-representation of transcription factor binding sites in a set of similarly regulated promoters or to look for conservation in orthologous promoter alignments. Results: We have developed a novel tool, "CORE_TF" (Conserved and Over-REpresented Transcription Factor binding sites), that identifies common transcription factor binding sites in promoters of co-regulated genes. To improve upon existing binding site predictions, the tool searches for position weight matrices from the TRANSFAC® database that are over-represented in an experimental set compared to a random set of promoters and identifies cross-species conservation of the predicted transcription factor binding sites. The algorithm has been evaluated with expression and chromatin-immunoprecipitation on microarray data. We also implement and demonstrate the importance of matching the random set of promoters to the experimental promoters by GC content, which is a unique feature of our tool. Conclusion: The program CORE_TF is accessible in a user-friendly web interface at http://www.LGTC.nl/CORE_TF. It provides a table of over-represented transcription factor binding sites in the promoters of the user's input genes and a graphical view of evolutionarily conserved transcription factor binding sites. In our test data sets it successfully predicts target transcription factors and their binding sites.
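The idea of over-representation in an experimental promoter set relative to a random set can be sketched with a one-sided hypergeometric test. The abstract does not state which statistic CORE_TF uses, so the test chosen here is an assumption, and the function name `enrichment_p_value` and the counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(hits_exp: int, n_exp: int, hits_bg: int, n_bg: int) -> float:
    """One-sided hypergeometric p-value for over-representation.

    hits_exp / n_exp: promoters with a predicted site / all promoters, experimental set.
    hits_bg / n_bg:   the same counts for the random (background) set.
    Assumption: this is one common enrichment test, not necessarily CORE_TF's own.
    """
    total = n_exp + n_bg
    total_hits = hits_exp + hits_bg
    # Probability of seeing at least hits_exp site-containing promoters when
    # n_exp promoters are drawn at random from the pooled set.
    upper = min(n_exp, total_hits)
    favourable = sum(
        comb(total_hits, k) * comb(total - total_hits, n_exp - k)
        for k in range(hits_exp, upper + 1)
    )
    return favourable / comb(total, n_exp)


# Hypothetical counts: 8 of 10 experimental promoters vs 2 of 10 random promoters.
p = enrichment_p_value(8, 10, 2, 10)
```

The result only means something if the random promoters resemble the experimental ones in base composition, which is the reason the abstract gives for matching the two sets by GC content.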

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease

Author: 't Hoen Peter AC
den Dunnen Johan T
Jelier Rob
Kors Jan A
Mons Barend
Sterrenburg Ellen
van Ommen Gert-Jan B
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Comparative analysis of expression microarray studies is difficult due to the large influence of technical factors on experimental outcome. Still, the identified differentially expressed genes may hint at the same biological processes. However, manually curated assignment of genes to biological processes, such as pursued by the Gene Ontology (GO) consortium, is incomplete and limited. We hypothesised that automatic association of genes with biological processes through thesaurus-controlled mining of Medline abstracts would be more effective. Therefore, we developed a novel algorithm (LAMA: Literature-Aided Meta-Analysis) to quantify the similarity between transcriptomics studies. We evaluated our algorithm on a large compendium of 102 microarray studies published in the field of muscle development and disease, and compared it to similarity measures based on gene overlap and over-representation of biological processes assigned by GO. Results: While the overlap in both genes and overrepresented GO-terms was poor, LAMA retrieved many more biologically meaningful links between studies, with substantially lower influence of technical factors. LAMA correctly grouped muscular dystrophy, regeneration and myositis studies, and linked patient and corresponding mouse model studies. LAMA also retrieves the connecting biological concepts. Among other new discoveries, we associated cullin proteins, a class of ubiquitinylation proteins, with genes down-regulated during muscle regeneration, whereas ubiquitinylation was previously reported to be activated during the inverse process: muscle atrophy. Conclusion: Our literature-based association analysis is capable of finding hidden common biological denominators in microarray studies, and circumvents the need for raw data analysis or curated gene annotation databases

Lirias

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

EUR Research Repository

Erasmus University Digital Repository

Determining the quality and complexity of next-generation sequencing data without a reference genome

Author: Irina Pulyakhina
Jeroen FJ Laros
Johan T den Dunnen
Ken Kraaijeveld
Lusine Khachatryan
Martijn Vermaat
Michiel van Galen
Peter AC ’t Hoen
Peter de Knijff
Seyed Yahya Anvar
Yavuz Ariyurek
Publication venue: Springer Nature
Publication date: 01/01/2014
Field of study

Genomics, epigenetics, population genetics and bioinformatics

VU Research Portal

Crossref

Springer - Publisher Connector

PubMed Central

Leiden University Scholarly Publications

Mutant huntingtin activates Nrf2-responsive genes and impairs dopamine synthesis in a PC12 model of Huntington's disease

Author: 't Hoen Peter AC
den Dunnen Johan T
Dorsman Josephine C
Pepers Barry A
van Ommen GertJan B
van Roon-Mom Willeke MC
Verwijmeren Carola ACM
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Huntington's disease is a progressive autosomal dominant neurodegenerative disorder that is caused by a CAG repeat expansion in the HD or Huntington's disease gene. Although microarray studies on patient and animal tissue provide valuable information, the primary effect of mutant huntingtin will inevitably be masked by secondary processes in advanced stages of the disease. Thus, cell models are instrumental to study early, direct effects of mutant huntingtin. mRNA changes were studied in an inducible PC12 model of Huntington's disease, before and after aggregates became visible, to identify groups of genes that could play a role in the early pathology of Huntington's disease. Results: Before aggregation, up-regulation of gene expression predominated, while after aggregates became visible, down-regulation and up-regulation occurred to the same extent. After aggregates became visible there was a down-regulation of dopamine biosynthesis genes accompanied by a decrease in dopamine levels in culture, indicating the utility of this model to identify functionally relevant pathways. Furthermore, genes of the anti-oxidant Nrf2-ARE pathway were up-regulated, possibly as a protective mechanism. In parallel, we discovered alterations in genes which may result in increased oxidative stress and damage. Conclusion: Up-regulation of gene expression may be more important in HD pathology than previously appreciated. In addition, given the pathogenic impact of oxidative stress and neuroinflammation, the Nrf2-ARE signaling pathway constitutes a new attractive therapeutic target for HD.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Can subtle changes in gene expression be consistently detected with different microarray platforms?

Author: 't Hoen Peter AC
Ariyurek Yavuz
Boer Judith M
de Hollander Mattias
de Menezes Renée X
den Dunnen Johan T
Kuiper Rowan
Pedotti Paola
Schenk Geert J
van Ommen Gertjan JB
Vossen Rolf HAM
Vreugdenhil Erno
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: The comparability of gene expression data generated with different microarray platforms is still a matter of concern. Here we address the performance and the overlap in the detection of differentially expressed genes for five different microarray platforms in a challenging biological context where differences in gene expression are few and subtle. Results: Gene expression profiles in the hippocampus of five wild-type and five transgenic δC-doublecortin-like kinase mice were evaluated with five microarray platforms: Applied Biosystems, Affymetrix, Agilent, Illumina, and LGTC home-spotted arrays. Using a fixed false discovery rate of 10% we detected surprising differences between the number of differentially expressed genes per platform. Four genes were selected by ABI, 130 by Affymetrix, 3,051 by Agilent, 54 by Illumina, and 13 by LGTC. Two genes were found significantly differentially expressed by all platforms and the four genes identified by the ABI platform were found by at least three other platforms. Quantitative RT-PCR analysis confirmed 20 out of 28 of the genes detected by two or more platforms and 8 out of 15 of the genes detected by Agilent only. We observed better correlations between platforms when ranking the genes by significance level than when applying a fixed statistical cut-off. We demonstrate significant overlap in the affected gene sets identified by the different platforms, although biological processes were represented by only partially overlapping sets of genes. Aberrations in GABA-ergic signalling in the transgenic mice were consistently found by all platforms. Conclusion: The different microarray platforms give partially complementary views on the affected biological processes. Our data indicate that when analyzing samples with only subtle differences in gene expression the use of two different platforms might be more attractive than increasing the number of replicates. Commercial two-color platforms seem to have higher power for finding differentially expressed genes between groups with small differences in expression.
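Selecting genes at a fixed false discovery rate of 10% can be sketched as follows. The abstract does not say which FDR procedure was applied; the Benjamini-Hochberg step-up procedure is assumed here because it is the most common choice, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.10):
    """Return the indices of tests declared significant at the given FDR.

    Assumption: Benjamini-Hochberg step-up procedure; the study's actual
    method is not stated in the abstract.
    """
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    return sorted(order[:cutoff_rank])


# Hypothetical per-gene p-values from one platform.
selected = benjamini_hochberg([0.001, 0.04, 0.03, 0.5, 0.9], fdr=0.10)
```

A fixed cut-off of this kind turns small shifts in p-values into large shifts in the number of selected genes, which is consistent with the abstract's observation that ranking genes by significance gave better agreement between platforms.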

Crossref

AIR Universita degli studi di Milano

Springer - Publisher Connector

PubMed Central

Erasmus University Digital Repository

The identification of informative genes from multiple datasets with increasing complexity

Author: AH Fielding
Allan Tucker
BC Haynes
C Zhang
D Grossman
D Heckerman
D Madigan
DM Chickering
DR Rhodes
E Segal
G Schwarz
H Ma
J Bockhorst
J Pearl
J Su
JB Tobler
JM Peña
KK Tomczak
KP Murphy
M Miron
M Stone
N Friedman
Peter AC 't Hoen
R Jelier
R Kohavi
R Mac Nally
RA Irizarry
S Iezzi
S Yahya Anvar
SS Shen-Orr
TI Lee
T Van den Bulcke
W Lam
WL Buntine
X Xu
Y Cao
Y Lai
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Background: In microarray data analysis, factors such as data quality, biological variation, and the increasingly multi-layered nature of more complex biological systems complicate the modelling of regulatory networks that can represent and capture the interactions among genes. We believe that the use of multiple datasets derived from related biological systems leads to more robust models. Therefore, we developed a novel framework for modelling regulatory networks that involves training and evaluation on independent datasets. Our approach includes the following steps: (1) ordering the datasets based on their level of noise and informativeness; (2) selection of a Bayesian classifier with an appropriate level of complexity by evaluation of predictive performance on independent data sets; (3) comparing the different gene selections and the influence of increasing the model complexity; (4) functional analysis of the informative genes. Results: In this paper, we identify the most appropriate model complexity using cross-validation and independent test set validation for predicting gene expression in three published datasets related to myogenesis and muscle differentiation. Furthermore, we demonstrate that models trained on simpler datasets can be used to identify interactions among genes and select the most informative. We also show that these models can explain the myogenesis-related genes (genes of interest) significantly better than others (P < 0.004) since the improvement in their rankings is much more pronounced. Finally, after further evaluating our results on synthetic datasets, we show that our approach outperforms a concordance method by Lai et al. in identifying informative genes from multiple datasets with increasing complexity whilst additionally modelling the interaction between genes. Conclusions: We show that Bayesian networks derived from simpler controlled systems have better performance than those trained on datasets from more complex biological systems. Further, we show that genes that are highly predictive and consistent across independent datasets, drawn from the pool of differentially expressed genes, are more likely to be fundamentally involved in the biological process under study. We conclude that networks trained on simpler controlled systems, such as in vitro experiments, can be used to model and capture interactions among genes in more complex datasets, such as in vivo experiments, where these interactions would otherwise be concealed by a multitude of other ongoing events.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Leiden University Scholarly Publications

Brunel University Research Archive

Transcriptome and genome sequencing uncovers functional variation in humans

Author: Almlöf Jonas
Amstislavskiy Vyacheslav
Antonarakis Stylianos E
Barann Matthias
Beltran Sergi
Bertier Gabrielle
Brazma Alvis
Buermans Henk PJ
Carracedo Ángel
Dermitzakis Emmanouil T
Donnelly Peter
Esser Daniela
Estivill Xavier
Ferreira Pedro G
Flicek Paul
Friedländer Marc R
Giger Thomas
Gonzàlez-Porta Mar
Greger Liliana
Griebel Thasso
Guigó Roderic
Gut Ivo G
Gut Marta
Häsler Robert
Kahlem Katja
Karlberg Olof
Kilpinen Helena
Kurbatova Natalja
Lappalainen Tuuli
Lehrach Hans
Lek Monkol
Lizano Esther
MacArthur Daniel G
McCarthy Mark I
Meitinger Thomas
Monlong Jean
Montgomery Stephen B
Ongen Halit
Padioleau Ismael
Pirinen Matti
Pulyakhina Irina
Ribeca Paolo
Rivas Manuel A
Rosenstiel Philip
Sammeth Michael
Schreiber Stefan
Schwarzmayr Thomas
Stegle Oliver
Strom Tim M
Sudbrak Ralf
Sultan Marc
Syvänen Ann-Christine
Tikhonov Andrew
van Iterson Maarten
van Ommen Gert-Jan
Wieland Thomas
't Hoen Peter AC
Publication venue: Springer Science and Business Media LLC
Publication date: 11/04/2014

Summary: Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project, the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

Harvard University - DASH

Calpain 3 Is a Rapid-Action, Unidirectional Proteolytic Switch Central to Muscle Remodeling

Calpain 3 (CAPN3) is a cysteine protease that when mutated causes Limb Girdle Muscular Dystrophy 2A. It is thereby the only described Calpain family member that genetically causes a disease. Due to its inherent instability little is known of its substrates or its mechanism of activity and pathogenicity. In this investigation we define a primary sequence motif underlying CAPN3 substrate cleavage. This motif can transform non-related proteins into substrates, and identifies >300 new putative CAPN3 targets. Bioinformatic analyses of these targets demonstrate a critical role in muscle cytoskeletal remodeling and identify novel CAPN3 functions. Among the new CAPN3 substrates are three E3 SUMO ligases of the Protein Inhibitor of Activated Stats (PIAS) family. CAPN3 can cleave PIAS proteins and negatively regulates PIAS3 sumoylase activity. Consequently, SUMO2 is deregulated in patient muscle tissue. Our study thus uncovers unexpected crosstalk between CAPN3 proteolysis and protein sumoylation, with strong implications for muscle remodeling

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central