25 research outputs found

Hybridization biases of microarray expression data - A model-based analysis of RNA quality and sequence effects

Author: Fasold Mario
Publication date: 06/11/2013

Modern high-throughput technologies like DNA microarrays are powerful tools that are widely used in biomedical research. They target a variety of genomics applications ranging from gene expression profiling and DNA genotyping to gene regulation studies. However, the recent discovery of false positives among prominent research findings indicates a lack of awareness or understanding of the non-biological factors negatively affecting the accuracy of data produced using these technologies. The aim of this thesis is to study the origins, effects and potential correction methods for selected methodical biases in microarray data. The two-species Langmuir model serves as the basal physicochemical model of microarray hybridization describing the fluorescence signal response of oligonucleotide probes. The so-called hook method makes it possible to estimate essential model parameters and to compute summary parameters characterizing a particular microarray sample. We show that this method can be applied successfully to various types of microarrays which share the same basic mechanism of multiplexed nucleic acid hybridization. Using appropriate modifications of the model we study RNA quality and sequence effects using publicly available data from Affymetrix GeneChip expression arrays. Varying amounts of hybridized RNA result in systematic changes of raw intensity signals and appropriate indicator variables computed from these. Varying RNA quality strongly affects intensity signals of probes which are located at the 3' end of transcripts. We develop new methods that help assess the RNA quality of a particular microarray sample. A new metric for determining RNA quality, the degradation index, is proposed which improves on previous RNA quality metrics. Furthermore, we present a method for the correction of the 3' intensity bias. These functionalities have been implemented in the freely available program package AffyRNADegradation.
We show that microarray probe signals are affected by sequence effects which are studied systematically using positional-dependent nearest-neighbor models. Analysis of the resulting sensitivity profiles reveals that specific sequence patterns such as runs of guanines at the solution end of the probes have a strong impact on the probe signals. The sequence effects differ between chip and target types, probe types and hybridization modes. Theoretical and practical solutions for the correction of the introduced sequence bias are provided. Assessment of RNA quality and sequence biases in a representative ensemble of over 8000 available microarray samples reveals that RNA quality issues are prevalent: about 10% of the samples have critically low RNA quality. Sequence effects exhibit considerable variation within the investigated samples but have limited impact on the most common patterns in the expression space. In contrast, variations in RNA quality and quantity have a significant impact on the obtained expression measurements. These hybridization biases should be considered and controlled in every microarray experiment to ensure reliable results. Application of rigorous quality control and signal correction methods is strongly advised to avoid erroneous findings. Also, incremental refinement of physicochemical models is a promising way to improve signal calibration, paralleled with the opportunity to better understand the fundamental processes in microarray hybridization.

Qucosa - Publikationsserver der Universität Leipzig

G-stack modulated probe intensities on expression arrays - sequence corrections and signal calibration

Author: Binder Hans
Fasold Mario
Stadler Peter F
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The brightness of the probe spots on expression microarrays is intended to measure the abundance of specific mRNA targets. Probes with runs of at least three guanines (G) in their sequence show abnormally high intensities which reflect probe effects rather than target concentrations. This G-bias requires correction prior to downstream expression analysis. Results: Longer runs of three or more consecutive G along the probe sequence and in particular triple degenerated G at its solution end ((GGG)1-effect) are associated with exceptionally large probe intensities on GeneChip expression arrays. This intensity bias is related to non-specific hybridization and affects both perfect match and mismatch probes. The (GGG)1-effect tends to increase gradually for microarrays of later GeneChip generations. It was found for DNA/RNA as well as for DNA/DNA probe/target-hybridization chemistries. Amplification of sample RNA using T7-primers is associated with strong positive amplitudes of the G-bias whereas alternative amplification protocols using random primers give rise to much smaller and partly even negative amplitudes. We applied positional dependent sensitivity models to analyze the specifics of probe intensities in the context of all possible short sequence motifs of one to four adjacent nucleotides along the 25meric probe sequence. Most of the longer motifs are adequately described using a nearest-neighbor (NN) model. In contrast, runs of degenerated guanines require explicit consideration of next nearest neighbors (GGG terms). Preprocessing methods such as vsn, RMA, dChip, MAS5 and gcRMA only insufficiently remove the G-bias from data. Conclusions: Positional and motif dependent sensitivity models account for sequence effects of oligonucleotide probe intensities. We propose a positional dependent NN+GGG hybrid model to correct the intensity bias associated with probes containing poly-G motifs. It is implemented as a single-chip based calibration algorithm for GeneChips which can be applied in a pre-correction step prior to standard preprocessing.
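
As a rough illustration of the poly-G motifs this abstract refers to, the following Python sketch locates runs of three or more consecutive G in a probe sequence and checks for a GGG triple at position 1. It is not the authors' calibration algorithm: the function names, the example probe, and the reading of position 1 in the sequence string as the solution end (suggested only by the "(GGG)1" notation) are assumptions made for this example.

```python
import re

def find_g_runs(probe, min_run=3):
    """Return (start, length) for each run of at least `min_run` consecutive G.

    `start` is a 1-based position along the probe sequence.
    """
    pattern = re.compile("G{%d,}" % min_run)
    return [(m.start() + 1, m.end() - m.start())
            for m in pattern.finditer(probe.upper())]

def has_leading_ggg(probe):
    """True if the probe starts with GGG.

    Assumption: position 1 of the string is taken to be the solution end.
    """
    return probe.upper().startswith("GGG")

# Hypothetical 25-mer used only for illustration.
probe = "GGGATCCTGAAGGGGTCATTGACCA"
runs = find_g_runs(probe)
leading = has_leading_ggg(probe)
```

Probes flagged this way would be the candidates for a pre-correction step; the correction itself depends on the fitted model terms described in the paper and is not reproduced here.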

Crossref

Directory of Open Access Journals

Fraunhofer-ePrints

PubMed Central

Permanent Hosting, Archiving and Indexing of Digital Resources and Assets

DARIO: a ncRNA detection and analysis tool for next-generation sequencing experiments

Author: Blake
Bohnert
Breiman
Chan
Chang
Cole
Crosby
David Langenberger
Dohm
Friedländer
Griffiths-Jones
Habegger
Hackenberg
Hall
Hans Binder
Hansen
Harris
Haussecker
Hertel
Langenberger
Lanza
Lee
Lestrade
Lin
Linsen
Mario Fasold
Peter F. Stadler
R Development Core Team
Ronen
Ruby
Schattner
Shi
Smedley
Steve Hoffmann
Taft
Umbach
Washietl
Wickham
Yang
Publication venue: Oxford University Press
Publication date: 01/01/2011

Small non-coding RNAs (ncRNAs) such as microRNAs, snoRNAs and tRNAs are a diverse collection of molecules with several important biological functions. Current methods for high-throughput sequencing for the first time offer the opportunity to investigate the entire ncRNAome in an essentially unbiased way. However, there is a substantial need for methods that allow a convenient analysis of these overwhelmingly large data sets. Here, we present DARIO, a free web service that allows to study short read data from small RNA-seq experiments. It provides a wide range of analysis features, including quality control, read normalization, ncRNA quantification and prediction of putative ncRNA candidates. The DARIO web site can be accessed at http://dario.bioinf.uni-leipzig.de/

Permanent Hosting, Archiving and Indexing of Digital Resources and Assets

[Avian cytogenetics goes functional] Third report on chicken genes and chromosomes 2015

Author: Aken Bronwen L.
Antin Parker B.
Archibald Alan L.
Ashwell Chris
Blackshear Perry J.
Boschiero Clarissa
Brown C. Titus
Burgess Shane C.
Burt David W.
Cheng Hans H.
Chow William
Coble Derrick J.
Cooksey Amanda
Crooijmans Richard P.M.A.
Damas Joana
Davis Richard V.N.
De Koning Dirk-Jan
Delany Mary E.
Derrien Thomas
Desta Takele T.
Dunn Ian C.
Dunn Matthew
Ellegren Hans
Eory Lel
Erb Ionas
Farre Marta
Fasold Mario
Fleming Damarius
Flicek Paul
Fowler Katie E.
Fresard Laure
Froman David P.
Garceau Valerie
Gardner Paul P.
Gheyas Almas A.
Griffin Darren K.
Groenen Martien A.M.
Haaf Thomas
Hanotte Olivier
Hart Alan
Hasler Julien
Hedges S. Blair
Hertel Jana
Howe Kerstin
Hubbard Allen
Hume David A.
Kaiser Pete
Kedra Darek
Kemp Stephen
Klopp Christophe
Kniel Kalmia E.
Kuo Richard
Lagarrigue Sandrine
Lamont Susan J.
Larkin Denis M.
Lawal Raman A.
Markland Sarah M.
McCarthy Fiona
McCormack Heather A.
McPherson Marla C.
Motegi Akira
Muljo Stefan A.
Munsterberg Andrea
Nag Rishi
Nanda Indrajit
Neuberger Michael
Nitsche Anne
Notredame Cedric
Noyes Harry
O'Connor Rebecca
O'Hare Elizabeth A.
Oler Andrew J.
Ommeh Sheila C.
Pais Helio
Persia Michael
Pitel Frederique
Preeyanon Likit
Prieto Barja Pablo
Pritchett Elizabeth M.
Rhoads Douglas D.
Robinson Charmaine M.
Romanov Michael N.
Rothschild Max
Roux Pierre-Francois
Schmid Michael
Schmidt Carl J.
Schneider Alisa-Sophia
Schwartz Matt
Searle Steve M.
Skinner Michael A.
Smith Craig A.
Smith Jacqueline
Stadler Peter F.
Steeves Tammy E.
Steinlein Claus
Sun Liang
Takata Minoru
Ulitsky Igor
Wang Qing
Wang Ying
Warren Wesley C.
Wood Jonathan M.D.
Wragg David
Zhou Huaijun
Publication venue: S. Karger AG
Publication date: 01/01/2015

High-density gridded libraries of large-insert clones using bacterial artificial chromosome (BAC) and other vectors are essential tools for genetic and genomic research in chicken and other avian species... Taken together, these studies demonstrate that applications of large-insert clones and BAC libraries derived from birds are, and will continue to be, effective tools to aid high-throughput and state-of-the-art genomic efforts and the important biological insight that arises from them.

Crossref

Canterbury Research and Theses Environment

Open Research Online (The Open University)

Edinburgh Research Explorer

Wageningen University & Research Publications

Kent Academic Repository

University of Queensland eSpace

The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons

Author: A Amores
A Force
A Grimson
A Kozomara
A Louis
A Sato
A Siepel
A Stamatakis
A Stoltzfus
A Visel
A Woolfe
Aaron M Berlin
AJ Pagán
AJ Vilella
Alison P Lee
Allyse Ferrara
AM Reitzel
Andrew R Gehrke
Angel Amores
AR Gehrke
AR Quinlan
AS Hinrichs
Axel Meyer
B Langmead
B Ryll
B Venkatesh
BC Faircloth
BM Wheeler
Bronwen Aken
Byrappa Venkatesh
C Cañestro
C Holt
Chris T Amemiya
Cristian Cañestro
CT Amemiya
D Chalopin
D Duboule
D Lagman
Daniel Barrell
DJ Rennison
DL Rabosky
Domitille Chalopin
DR Scannell
Dustin Wcisel
E Berezikov
EE Patton
EJ Hagedorn
F Tajima
Federica Di Palma
Felix E G Beaudry
G Andrey
G Duester
G Kurosawa
Gary W Litman
H Dirscherl
H Li
H Mano
H Wang
Han Wang
HB Shaffer
I Braasch
I Harel
I Sasagawa
I Schneider
Igor Schneider
Ingo Braasch
J Bingulac-Popovic
J Hertel
J Postlethwait
J Toloza-Villalobos
J Zakany
JA Yoder
Jana Hertel
Jason Sydes
JC Roach
JD Hansen
Jean-Nicolas Volff
Jeffrey A Yoder
Jeramiah J Smith
Jeremy Johnson
Jeremy Pasquier
Jessica Alföldi
JF Mulley
JH Postlethwait
JJ Smith
JM Catchen
JM Dijkstra
John F Mulley
John H Letaw
John H Postlethwait
John S Taylor
JP Bogart
JP Cannon
JS Taylor
Julian Catchen
Julien Bobe
JY Sire
K Kawasaki
K Naruse
K Wolfe
KA Frazer
Kazuhiko Kawasaki
KD Crow
Kerstin Lindblad-Toh
Kyle J Martin
L Grande
LA Pennacchio
LC Sallan
Louise Williams
M Blanchette
M Brudno
M Krzywinski
M Muffato
M Nikaido
M Zhu
Marcia Lara
Mario Fasold
Mark Yandell
Masato Mikami
MF Flajnik
MG Grabherr
MH Schulz
Michael J Beam
Michael S Campbell
Mikio Ishiyama
N Chang
N Danilova
N Lartillot
N Takezaki
Neil H Shubin
Nil Ratan Saha
O Lee
P Dehal
P Flicek
Peter Batzel
Peter F Stadler
Peter W H Holland
Q Qu
Quenton Fontenot
R Lorenz
RC Albertson
RN Kettleborough
Ronda T Litman
S Anders
S Berlivet
S Fisher
S Gnerre
S Griffiths-Jones
S Hoegg
S Ohno
Shaohua Fan
SR Frankenberg
Steffi Kehr
Stephen M J Searle
T Desvignes
T Montavon
T Wicker
Tatsuya Ota
Tereza Manousaki
Tetsuya Nakamura
Thomas Desvignes
TJ Near
U Grimholt
V Ravi
Vydianathan Ravi
WL Long
WY Hwang
X He
Y Nakatani
YA Zhang
Yann Guiguen
YH Loh
Yi Sun
ZE Parra
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences

Cold Spring Harbor Laboratory Institutional Repository

HAL Descartes

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Hal-Diderot

HAL-ENS-LYON

KOPS - The Institutional Repository of the University of Konstanz

Crossref

Publikationer från Uppsala Universitet

PubMed Central

eScholarship - University of California

University of East Anglia digital repository

Diposit Digital de la Universitat de Barcelona

Bangor University Research Portal

Introducing evolutionary biologists to the analysis of big data: guidelines to organize extended bioinformatics training courses

Author: A Via
AC Greene
Alvaro Perdomo-Sabogal
Bert Overduin
CA Goldsmid
Christoph Bleidorn
Clara Isabel Bermudez Santana
Consortium GOBLET
D Ebert-May
D Udovic
David Langenberger
Deborah Triant
Giovanni Marco Dall’Olio
GT Doran
H Hinaux
Henrike Indrischek
J Handelsman
Jan Aerts
Jan Engelhardt
JB Losos
Johannes Engelken
Katja Liebal
Katja Nowick
KD Kendall
M Corpas
Mario Fasold
MD Brazas
MV Schneider
R Kwok
Rui Faria
Sofia Robb
Sonja Grath
SP Carroll
Sree Rohit Raj Kolora
Tiago Carvalho
TK Attwood
TR Meagher
V Marx
Vladimir Jovanovic
Walter Salzburger
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Research in evolutionary biology has been progressively influenced by big data such as massive genome and transcriptome sequencing data, scalar measurements of several phenotypes on tens to thousands of individuals, as well as worldwide environmental data collected at an increasingly detailed scale. The handling and analysis of such data require computational skills that usually exceed the abilities of most traditionally trained evolutionary biologists. Here we discuss the advantages, challenges and considerations for organizing and running bioinformatics training courses of 2–3 weeks in length to introduce evolutionary biologists to the computational analysis of big data. Extended courses have the advantage of offering trainees the opportunity to learn a more comprehensive set of complementary topics and skills and allowing for more time to practice newly acquired competences. Many organizational aspects are common to any course, such as the need to define precise learning objectives and to select appropriate and highly motivated instructors and trainees. However, other features assume particular importance in extended bioinformatics training courses. To successfully implement a learning-by-doing philosophy, sufficient and enthusiastic teaching assistants (TAs) are necessary to offer prompt help to trainees. Further, a good balance between theoretical background and practice time needs to be provided, and the schedule should include enough flexibility for extra review sessions or further discussions if desired. A final project enables trainees to apply their newly learned skills to real data or case studies of their interest. To promote a friendly atmosphere throughout the course and to build a close-knit community after the course, time should be allowed for some scientific discussions and social activities. In addition, to avoid exhausting trainees and TAs, some leisure time needs to be organized.
Finally, all organization should be done while keeping the budget within fair limits. In order to create a sustainable course that constantly improves and adapts to the trainees’ needs, gathering short- and long-term feedback after the end of the course is important. Based on our experience we have collected a set of recommendations to effectively organize and run extended bioinformatics training courses for evolutionary biologists, which we here want to share with the community. They offer a complementary way for the practical teaching of modern evolutionary biology and reaching out to the biological community.

Crossref

Directory of Open Access Journals

Edinburgh Research Explorer

Digital.CSIC

MPG.PuRe

Variation of RNA Quality and Quantity Are Major Sources of Batch Effects in Microarray Expression Data

Author: Mario Fasold
Publication venue: MDPI AG
Publication date: 16/12/2014

The great utility of microarrays for genome-scale expression analysis is challenged by the widespread presence of batch effects, which bias expression measurements in particular within large data sets. These unwanted technical artifacts can obscure biological variation and thus significantly reduce the reliability of the analysis results. It is largely unknown which are the predominant technical sources leading to batch effects. We here quantitatively assess the prevalence and impact of several known technical effects on microarray expression results. Particularly, we focus on important factors such as RNA degradation, RNA quantity, and sequence biases including multiple guanine effects. We find that the common variation of RNA quality and RNA quantity can not only yield low-quality expression results, but that both factors also correlate with batch effects and biological characteristics of the samples.

Multidisciplinary Digital Publishing Institute

Hybridization biases of microarray expression data - A model-based analysis of RNA quality and sequence effects

Author: Fasold Mario
Publication date: 06/11/2013

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Qucosa - Publikationsserver der Universität Leipzig