
7 research outputs found

OligoRAP – an Oligo Re-Annotation Pipeline to improve annotation and estimate target specificity

Author: Breit Timo M
Groenen Martien AM
Leunissen Jack AM
Neerincx Pieter BT
Nie Haisheng
Rauwerda Han
Publication venue: BioMed Central
Publication date: 01/01/2009

Background - High throughput gene expression studies using oligonucleotide microarrays depend on the specificity of each oligonucleotide (oligo or probe) for its target gene. However, target specific probes can only be designed when a reference genome of the species at hand has been completely sequenced, when this genome has been completely annotated and when the genetic variation of the sampled individuals is completely known. Unfortunately there is not a single species for which such a complete data set is available. Therefore, it is important that probe annotation can be updated frequently for optimal interpretation of microarray experiments.
Results - In this paper we present OligoRAP, a pipeline to automatically update the annotation of oligo libraries and estimate oligo target specificity. OligoRAP uses a reference genome assembly with Ensembl and Entrez Gene annotation supplemented with a set of unmapped transcripts derived from RefSeq and UniGene to handle assembly gaps. OligoRAP produces alignments of each oligo with the reference assembly as well as with unmapped transcripts. These alignments are re-mapped to the annotation sources, which results in a concise, as complete as possible and up-to-date annotation of the oligo library. The building blocks of this pipeline are BioMoby web services creating a highly modular and distributed system with a robust, remote programmatic interface. OligoRAP was used to update the annotation for a subset of 791 oligos from the ARK-Genomics 20 K chicken array, which were selected as starting material for the oligo annotation session of the EADGENE/SABRE Post-analysis workshop. Based on the updated annotation about one third of these oligos are problematic with regard to target specificity. In addition, the accession numbers or ids the oligos were originally designed for no longer exist in the updated annotation for almost half of the oligos.
Conclusion - As microarrays are designed on incomplete data, it is important to update probe annotation and check target specificity regularly. OligoRAP provides both and, due to its design based on BioMoby web services, it can easily be embedded as an oligo annotation engine in customised applications for microarray data analysis. The dramatic difference in updated annotation and target specificity for the ARK-Genomics 20 K chicken array as compared to the original data emphasises the need for regular updates.

Crossref

Springer - Publisher Connector

PubMed Central

Edinburgh Research Explorer

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Microarray data mining using Bioconductor packages

Author: Bicciato Silvio
Ferrari Francesco
Groenen Martien AM
Leunissen Jack AM
Neerincx Pieter BT
Nie Haisheng
Poel Jan van der
Publication venue: BioMed Central
Publication date: 01/01/2009

This article is available from

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Wageningen University & Research Publications

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Comparison of three microarray probe annotation pipelines: differences in strategies and their effect on downstream analysis

Author: Casel Pierrot
Groenen Martien AM
Klopp Christophe
Leunissen Jack AM
Neerincx Pieter BT
Nie Haisheng
Prickett Dennis
Watson Michael
Publication venue: BioMed Central
Publication date: 01/01/2009

Background - Reliable annotation linking oligonucleotide probes to target genes is essential for functional biological analysis of microarray experiments. We used the IMAD, OligoRAP and sigReannot pipelines to update the annotation for the ARK-Genomics Chicken 20 K array as part of a joint EADGENE/SABRE workshop. In this manuscript we compare their annotation strategies and results. Furthermore, we analyse the effect of differences in updated annotation on functional analysis for an experiment involving Eimeria infected chickens and finally we propose guidelines for optimal annotation strategies.
Results - IMAD, OligoRAP and sigReannot update both annotation and estimated target specificity. The three pipelines can assign oligos to target specificity categories, although with varying degrees of resolution. Target specificity is judged based on the amount and type of oligo versus target-gene alignments (hits), which are determined by filter thresholds that users can adjust based on their experimental conditions. Linking oligos to annotation, on the other hand, is based on rigid rules, which differ between pipelines. For 52.7% of the oligos from a subset selected for in-depth comparison all pipelines linked to one or more Ensembl genes with consensus on 44.0%. In 31.0% of the cases none of the pipelines could assign an Ensembl gene to an oligo and for the remaining 16.3% the coverage differed between pipelines. Differences in updated annotation were mainly due to different thresholds for hybridisation potential filtering of oligo versus target-gene alignments and different policies for expanding annotation using indirect links. The differences in updated annotation packages had a significant effect on GO term enrichment analysis with consensus on only 67.2% of the enriched terms.
Conclusion - In addition to flexible thresholds to determine target specificity, annotation tools should provide metadata describing the relationships between oligos and the annotation assigned to them. These relationships can then be used to judge the varying degrees of reliability allowing users to fine-tune the balance between reliability and coverage. This is important as it can have a significant effect on functional microarray analysis as exemplified by the lack of consensus on almost one third of the terms found with GO term enrichment analysis based on updated IMAD, OligoRAP or sigReannot annotation.

Crossref

Springer

Springer - Publisher Connector

PubMed Central

Edinburgh Research Explorer

Wageningen University & Research Publications

Using R in Taverna: RShell v1.2

Author: Breit Timo M
Leunissen Jack AM
Neerincx Pieter BT
Nijholt Anton
Rauwerda Han
Vet Paul E van der
Wassink Ingo
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: R is the statistical language commonly used by many life scientists in (omics) data analysis. At the same time, these complex analyses benefit from a workflow approach, such as used by the open source workflow management system Taverna. However, Taverna had limited support for R, because it supported just a few data types and only a single output. Also, there was no support for graphical output and persistent sessions. Altogether this made using R in Taverna impractical.
Findings: We have developed an R plugin for Taverna: RShell, which provides R functionality within workflows designed in Taverna. In order to fully support the R language, our RShell plugin directly uses the R interpreter. The RShell plugin consists of a Taverna processor for R scripts and an RShell Session Manager that communicates with the R server. We made the RShell processor highly configurable allowing the user to define multiple inputs and outputs. Also, various data types are supported, such as strings, numeric data and images. To limit data transport between multiple RShell processors, the RShell plugin also supports persistent sessions. Here, we will describe the architecture of RShell and the new features that are introduced in version 1.2, i.e.: i) Support for R up to and including R version 2.9; ii) Support for persistent sessions to limit data transfer; iii) Support for vector graphics output through PDF; iv) Syntax highlighting of the R code; v) Improved usability through fewer port types. Our new RShell processor is backwards compatible with workflows that use older versions of the RShell processor. We demonstrate the value of the RShell processor by a use-case workflow that maps oligonucleotide probes designed with DNA sequence information from Vega onto the Ensembl genome assembly.
Conclusion: Our RShell plugin enables Taverna users to employ R scripts within their workflows in a highly configurable way.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Twente Research Information

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Methods for interpreting lists of affected genes obtained in a DNA microarray experiment

Author: A Alexa
A Bonnet
A Jiménez-Marín
A Skarman
Agnès Bonnet
Arun Kommadath
Axel Skarman
B Zhang
Bart Buitenhuis
Christèle Robert-Granié
Cristina Arce
D Prickett
Dennis Prickett
DJ de Koning
dW Huang
F Jaffrezic
Francesco Ferrari
GK Smyth
Gwenola Tosser-Klopp
H Nie
Haisheng Nie
Henrik Hornshøj
I Hulsegge
Ina Hulsegge
Jack AM Leunissen
Jakob Hedegaard
Jan van der Poel
JJ Goeman
Johanna MJ Rebel
Juan J Garrido
KD Dahlquist
KH Pan
Laurence Liaubet
Lene N Conley
Li Jiang
M Ashburner
M Kanehisa
M Watson
Magali SanCristobal
Mari A Smits
Martien AM Groenen
María Ramirez-Boo
Melania Collado-Romero
Michael Watson
N Salomonis
P Casel
P Sorensen
PBT Neerincx
Peter Sørensen
Pieter BT Neerincx
Q Liu
Q Zheng
S Falcon
S Song
Sandrine Lagarrigue
Silvio Bicciato
SW Doniger
Ángeles Jiménez-Marín
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

BACKGROUND: The aim of this paper was to describe and compare the methods used and the results obtained by the participants in a joint EADGENE (European Animal Disease Genomic Network of Excellence) and SABRE (Cutting Edge Genomics for Sustainable Animal Breeding) workshop focusing on post analysis of microarray data. The participating groups were provided with identical lists of microarray probes, including test statistics for three different contrasts, and the normalised log-ratios for each array, to be used as the starting point for interpreting the affected probes. The data originated from a microarray experiment conducted to study the host reactions in broilers occurring shortly after a secondary challenge with either a homologous or heterologous species of Eimeria.
RESULTS: Several conceptually different analytical approaches, using both commercial and publicly available software, were applied by the participating groups. The following tools were used: Ingenuity Pathway Analysis, MAPPFinder, LIMMA, GOstats, GOEAST, GOTM, Globaltest, TopGO, ArrayUnlock, Pathway Studio, GIST and AnnotationDbi. The main focus of the approaches was to utilise the relation between probes/genes and their gene ontology and pathways to interpret the affected probes/genes. The lack of a well-annotated chicken genome did, however, limit the possibilities to fully explore the tools. The main results from these analyses showed that the biological interpretation is highly dependent on the statistical method used but that some common biological conclusions could be reached.
CONCLUSION: It is highly recommended to test different analytical methods on the same data set and compare the results to obtain a reliable biological interpretation of the affected genes in a DNA microarray experiment.

Repositorio Institucional de la Universidad de Córdoba

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

HAL Descartes

Edinburgh Research Explorer

ProdInra

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Hal-Diderot

Recommended from our members

A framework for the detection of de novo mutations in family-based sequencing data

Author: Abdellaoui Abdel
Amin Najaf
Banks Eric
Beekman Marian B
Boomsma Dorret I
Bot Jan
Bovenberg Jasper A
Brandsma Margreet
Byelas Heorhiy
Cao Hongzhi
Cao Sujie
Chen Ruoyan
Cox David R
Cretu-Stancu Mircea
Daly Mark J
de Bakker Paul IW
de Craen Anton JM
de Knijff Peter
Deelen Patrick
den Dunnen Johan T
DePristo Mark A
Dijkstra Martijn
Du Yuanping
Elbers Clara C
Estrada Karol
Francesco Palamara Pier
Francioli Laurent C
Fromer Menachem
Garimella Kiran V
Guryev Victor
Handsaker Robert E
Hehir-Kwa Jayne Y
Hofman Albert
Hormozdiari Fereydoun
Hottenga Jouke Jan
Isaacs Aaron
Kanterakis Alexandros
Karssen Lennart C
Kattenberg Mathijs
Kayser Manfred
Kloosterman Wigard P
Koval Vyacheslav
Lameijer Eric-Wubbo
Laros Jeroen FJ
Li Mingkun
Li Ning
Li Qibin
Li Yingrui
Marschall Tobias
McCarroll Steven A
Medina-Gomez Carolina
Mei Hailiang
Menelaou Androniki
Moed Matthijs H
Neale Benjamin M
Neerincx Pieter BT
Nijman Isaäc J
Oostra Ben
Pe'er Itsik
Pitts Steven J
Platteel Mathieu
Polak Paz
Potluri Shobha
Pulit Sara L
Renkens Ivo
Rivadeneira Fernando
Samocha Kaitlin E
Schönhuth Alexander
Slagboom P Eline
Sohail Mashaal
Stoneking Mark
Suchiman H Eka D
Sundar Purnima
Sunyaev Shamil R
Swertz Morris A
Uitterlinden André G
van den Berg Leonard H
van der Velde K Joeri
van Dijk Freerk
van Duijn Cornelia M
van Enckevort David
van Leeuwen Elisabeth M
van Ommen Gertjan B
van Oven Mannis
van Schaik Barbera DC
van Setten Jessica
Veldink Jan H
Vermaat Martijn
Vuzman Dana
Wang Jun
Wijmenga Cisca
Willemsen Gonneke
Ye Kai
Publication venue: Springer Science and Business Media LLC
Publication date: 29/03/2017

Germline mutation detection from human DNA sequence data is challenging due to the rarity of such events relative to the intrinsic error rates of sequencing technologies and the uneven coverage across the genome. We developed PhaseByTransmission (PBT) to identify de novo single nucleotide variants and short insertions and deletions (indels) from sequence data collected in parent-offspring trios. We compute the joint probability of the data given the genotype likelihoods in the individual family members, the known familial relationships and a prior probability for the mutation rate. Candidate de novo mutations (DNMs) are reported along with their posterior probability, providing a systematic way to prioritize them for validation. Our tool is integrated in the Genome Analysis Toolkit and can be used together with the ReadBackedPhasing module to infer the parental origin of DNMs based on phase-informative reads. Using simulated data, we show that PBT outperforms existing tools, especially in low coverage data and on the X chromosome. We further show that PBT displays high validation rates on empirical parent-offspring sequencing data for whole-exome data from 104 trios and X-chromosome data from 249 parent-offspring families. Finally, we demonstrate an association between father's age at conception and the number of DNMs in female offspring's X chromosome, consistent with previous literature reports.
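
The abstract's idea of combining per-individual genotype likelihoods, the trio relationship and a mutation-rate prior into a posterior probability can be illustrated with a minimal sketch. This is not the PhaseByTransmission code from the Genome Analysis Toolkit. It assumes a single biallelic autosomal site, a uniform prior over parental genotypes, and a simplified mutation model in which the mutation rate `mu` is split evenly over the child genotypes that Mendelian inheritance cannot produce. All function names are hypothetical.

```python
from itertools import product

GENOTYPES = (0, 1, 2)  # number of alternate alleles carried (assumed biallelic site)


def mendelian_prob(child, mother, father):
    """P(child genotype | parental genotypes) with no mutation."""
    pm = mother / 2.0  # chance the mother transmits the alternate allele
    pf = father / 2.0  # chance the father transmits the alternate allele
    probs = {
        0: (1 - pm) * (1 - pf),
        1: pm * (1 - pf) + (1 - pm) * pf,
        2: pm * pf,
    }
    return probs[child]


def transmission_prob(child, mother, father, mu):
    """Mendelian transmission with a simplified mutation term (an assumption).

    Genotypes that Mendelian inheritance cannot produce share probability mu;
    the Mendelian-compatible genotypes share the remaining 1 - mu.
    """
    impossible = [g for g in GENOTYPES if mendelian_prob(g, mother, father) == 0.0]
    mend = mendelian_prob(child, mother, father)
    if not impossible:
        return mend
    if child in impossible:
        return mu / len(impossible)
    return (1.0 - mu) * mend


def de_novo_posterior(lik_mother, lik_father, lik_child, mu=1e-8,
                      prior=(1 / 3, 1 / 3, 1 / 3)):
    """Posterior probability that the child's genotype violates Mendelian inheritance.

    Each lik_* is a tuple of P(reads | genotype) for genotypes 0, 1 and 2.
    """
    total = 0.0
    de_novo = 0.0
    for m, f, c in product(GENOTYPES, repeat=3):
        joint = (prior[m] * prior[f]
                 * lik_mother[m] * lik_father[f] * lik_child[c]
                 * transmission_prob(c, m, f, mu))
        total += joint
        if mendelian_prob(c, m, f) == 0.0:
            de_novo += joint
    return de_novo / total


if __name__ == "__main__":
    # Both parents look homozygous reference, the child looks heterozygous.
    parent = (1.0, 1e-12, 1e-12)
    child = (1e-12, 1.0, 1e-12)
    print(de_novo_posterior(parent, parent, child))
```

Candidates could then be ranked by this posterior for validation, as the abstract describes. A real caller would also need allele-frequency-based priors, multi-allelic sites and sex-chromosome handling, none of which are modelled here.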

Harvard University - DASH