
9 research outputs found

Modeling SAGE tag formation and its effects on data interpretation within a Bayesian framework

Author: A Beyer
A Gelman
EH Hurowitz
HH Thygesen
Hong Qin
J Colinge
JS Morris
K Dolinski
KA Baggerly
L Cai
L David
L Zhang
M Harbers
MD Stern
Michael A Gilchrist
Russell Zaretzki
RZN Vencio
RZN Vencio
S Audic
SL Madden
T Beissbarth
VA Kuznetsov
VE Velculescu
VE Velculescu
VR Akmaev
Wolfram Research Inc
Publication venue: BioMed Central
Publication date: 01/01/2007

Abstract. Background: Serial Analysis of Gene Expression (SAGE) is a high-throughput method for inferring mRNA expression levels from experimentally generated sequence-based tags. Standard analyses of SAGE data, however, ignore the fact that the probability of generating an observable tag varies across genes and between experiments. As a consequence, these analyses result in biased estimators and posterior probability intervals for gene expression levels in the transcriptome. Results: Using the yeast Saccharomyces cerevisiae as an example, we introduce a new Bayesian method of data analysis which is based on a model of SAGE tag formation. Our approach incorporates the variation in the probability of tag formation into the interpretation of SAGE data and allows us to derive exact joint and approximate marginal posterior distributions for the mRNA frequency of genes detectable using SAGE. Our analysis of these distributions indicates that the frequency of a gene in the tag pool is influenced by its mRNA frequency, the cleavage efficiency of the anchoring enzyme (AE), and the number of informative and uninformative AE cleavage sites within its mRNA. Conclusion: With a mechanistic, model-based approach for SAGE data analysis, we find that inter-genic variation in SAGE tag formation is large. However, this variation can be estimated and, importantly, accounted for using the methods we develop here. As a result, SAGE-based estimates of mRNA frequencies can be adjusted to remove the bias introduced by the SAGE tag formation process.

University of Tennessee, Knoxville: Trace

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Bias correction and Bayesian analysis of aggregate counts in SAGE libraries

Author: A Gelman
Artin Armagan
C Romualdi
DV Lindley
E Pauws
H Jiang
H Matsumura
H Matsumura
HH Thygesen
J Lu
JS Morris
K Boon
KA Baggerly
KA Baggerly
MH Chen
Michael A Gilchrist
PAC 't Hoen
R Malig
Russell L Zaretzki
RZN Vencio
RZN Vencio
SF Arnold
VA Kuznetsov
VA Kuznetsov
VE Velculescu
VE Velculescu
William M Briggs
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Tag-based techniques, such as SAGE, are commonly used to sample the mRNA pool of an organism's transcriptome. Incomplete digestion during the tag formation process may allow for multiple tags to be generated from a given mRNA transcript. The probability of forming a tag varies with its relative location. As a result, the observed tag counts represent a biased sample of the actual transcript pool. In SAGE this bias can be avoided by ignoring all but the 3'-most tag, but doing so discards a large fraction of the observed data. Taking this bias into account should allow more of the available data to be used, leading to increased statistical power. Results: Three new hierarchical models, which directly embed a model for the variation in tag formation probability, are proposed and their associated Bayesian inference algorithms are developed. These models may be applied to libraries at both the tag and aggregate level. Simulation experiments and analysis of real data are used to contrast the accuracy of the various methods. The consequences of tag formation bias are discussed in the context of testing differential expression. A description is given as to how these algorithms can be applied in that context. Conclusions: Several Bayesian inference algorithms that account for tag formation effects are compared with the DPB algorithm, providing clear evidence of superior performance. The accuracy of inferences when using a particular non-informative prior is found to depend on the expression level of a given gene. The multivariate nature of the approach easily allows both univariate and joint tests of differential expression. Calculations demonstrate the potential for false positive and negative findings due to variation in tag formation probabilities across samples when testing for differential expression.

University of Tennessee, Knoxville: Trace

Crossref

Directory of Open Access Journals

PubMed Central

DukeSpace

MediPlEx - a tool to combine in silico & experimental gene expression profiles of the model legume Medicago truncatula

Author: A Frenzel
A Schüssler
A Wulf
Alexander Goesmann
C Town
D Barker
DJ Stekel
EP Journet
G Sherlock
GED Oldroyd
H Javot
H Küster
H Parkinson
Helge Küster
J Aitchison
J Aitchison
J Chen
J Doll
J Liu
J Quackenbush
K Henckel
K Okubo
Kolja Henckel
Leonhard J Stutz
M Parniske
MJ Harrison
MN Bainbridge
N Hohnjec
N Hohnjec
N Mulder
ND Young
NJ Brewin
R Development Core Team
R Edgar
R Thompson
RP Wise
RZN Vencio
S Altschul
S Brenner
S Smith
SB Cannon
SR Eddy
T Bekel
U Grunwald
VA Benedito
VE Velculescu
Publication venue: BioMed Central
Publication date: 01/01/2010

Henckel K, Küster H, Stutz L, Goesmann A. MediPlEx - a tool to combine in silico and experimental gene expression profiles of the model legume Medicago truncatula. BMC Research Notes. 2010;3(1):262. BACKGROUND: Expressed Sequence Tags (ESTs) are in general used to gain a first insight into gene activities from a species of interest. Subsequently, and typically based on a combination of EST and genome sequences, microarray-based expression analyses are performed for a variety of conditions. In some cases, a multitude of EST and microarray experiments are conducted for one species, covering different tissues, cell states, and cell types. Under these circumstances, the challenge arises to combine results derived from the different expression profiling strategies, with the goal to uncover novel information on the basis of the integrated datasets. FINDINGS: Using our new application, MediPlEx (MEDIcago truncatula multiPLe EXpression analysis), expression data from EST experiments, oligonucleotide microarrays and Affymetrix GeneChips can be combined and analyzed, leading to a novel approach to integrated transcriptome analysis. We have validated our tool via the identification of a set of well-characterized AM-specific and AM-induced marker genes, identified by MediPlEx on the basis of in silico and experimental gene expression profiles from roots colonized with AM fungi. CONCLUSIONS: MediPlEx offers an integrated analysis pipeline for different sets of expression data generated for the model legume Medicago truncatula. As expected, in silico and experimental gene expression data that cover the same biological condition correlate well. The collection of differentially expressed genes identified via MediPlEx provides a starting point for functional studies in plant mutants. MediPlEx can freely be used at http://www.cebitec.uni-bielefeld.de/mediplex

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Publications at Bielefeld University

Expression Profile of Signal Transduction Components in a Sugarcane Population Segregating for Sugar Content

Crossref

An expanded evaluation of protein function prediction methods shows an improvement in accuracy

Author: Almeida-e-Silva DC
Altenhoff A
Babbitt PC
Bankapur AR
Bargsten JW
Ben-Hur A
Benso A
Bhat P
Bonneau R
Brenner SE
Bryson K
Cao RZ
Casadio R
Cejuela JM
Chapman S
Chen CT
Cheng JL
Cibrian-Uhalte E
Clark WT
Cozzetto D
D'Andrea D
Das S
Dawson NL
del Pozo A
Denny P
Dessimoz C
Di Carlo S
Dogan T
Dukka BKC
ElShal S
Falda M
Fang H
Feng S
Fernandez JM
Ferrari C
Fontana P
Foulger RE
Friedberg I
Funk CS
Gabaldon T
Gemovic B
Gillis J
Ginter F
Giollo M
Glisic S
Goldberg T
Gong QT
Gough J
Greene CS
Hakala K
Hamp T
Hieta R
Holm L
Hsu WL
Huntley RP
Jiang YX
Jones DT
Kaewphan S
Kahanda I
Kansakar L
Khan IK
Kihara D
Koo DCE
Koskinen P
Lavezzo E
Lee D
Lees JG
Legge D
Lepore R
Li B
Lin A
Linial M
Lovering RC
Magrane M
Maietta P
Marcet-Houben M
Martelli PL
Martin MJ
Mehryary F
Melidoni AN
Mesiti M
Minneci F
Mooney SD
Moreau Y
Mutowo-Meullenet P
Nepusz T
Ning W
O'Donovan C
Oates M
Ofer D
Orengo CA
Oron TR
Paccanaro A
Pavlidis P
Penfold-Brown D
Perovic V
Pichler K
Piovesan D
Politano G
Profiti G
Radivojac P
Rappoport N
Re M
Rehman HU
Richter L
Robinson PN
Romero AE
Rost B
Sahraeian SME
Salakoski T
Salamov A
Sasidharan R
Savino A
Sedeno-Cortes AE
Sharan M
Shasha D
Shypitsyna A
Sillitoe I
Skunca N
Smithers B
Stern A
Sternberg MJE
Supek F
Tian WD
Toppo S
Toronen P
Tosatto SCE
Tramontano A
Tranchevent LC
Tress ML
Valencia A
Valentini G
van Dijk ADJ
Veljkovic N
Veljkovic V
Vencio RZN
Verspoor KM
Vogel J
Vucetic S
Wang Z
Wass MN
Yang HX
Youngs N
Zakeri P
Zhang S
Zhong Z
Zhou YP
Publication venue: Springer Science and Business Media LLC
Publication date: 28/10/2022

Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. Results: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. Conclusions: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.

UTUPub

Whole-genome expression profiling of Xylella fastidiosa in response to growth on glucose

Author: Alves LMC
Campanharo J. C.
Ciapina L. P.
Da Silva A. M.
da Silva ACR
Lemos EGM
Moreira L. M.
Pashalidis S.
Vencio RZN
Zaini P. A.
Publication venue: Mary Ann Liebert, Inc.

Xylella fastidiosa is the etiologic agent of diseases in a wide range of economically important crops, including citrus variegated chlorosis, a major threat to the Brazilian citrus industry. The genomes of several strains of this phytopathogen have been completely sequenced, enabling large-scale functional studies. In this work we used whole-genome DNA microarrays to investigate the transcription profile of X. fastidiosa grown in defined media with different glucose concentrations. Our analysis revealed that while transcripts related to fastidian gum production were unaffected, colicin-V-like and fimbria precursors were induced in high glucose medium. Based on these results, we suggest a model for a colicin-defense mechanism in X. fastidiosa.

Transcription profiling of signal transduction-related genes in sugarcane tissues

Author: Da Silva AM
Di Mauro SMZ
Felix JD
Menossi M
Oliveira KC
Papini-Terzi FS
Pereira CAD
Rocha CD
Rocha FR
Simoes ACQ
Souza GM
Ulian EC
Vencio RZN
Vicentini R
Publication venue: England

A collection of 237,954 sugarcane ESTs was examined in search of signal transduction genes. Over 3,500 components involved in several aspects of signal transduction, transcription, development, cell cycle, stress responses and pathogen interaction were compiled into the Sugarcane Signal Transduction (SUCAST) Catalogue. Sequence comparisons and protein domain analysis revealed 477 receptors, 510 protein kinases, 107 protein phosphatases, 75 small GTPases, 17 G-proteins, 114 calcium and inositol metabolism proteins, and over 600 transcription factors. The elements were distributed into 29 main categories subdivided into 409 sub-categories. Genes with no matches in the public databases and of unknown function were also catalogued. A cDNA microarray was constructed to profile individual variation of plants cultivated in the field and transcript abundance in six plant organs (flowers, roots, leaves, lateral buds, and 1st and 4th internodes). From 1280 distinct elements analyzed, 217 (17%) presented differential expression in two biological samples of at least one of the tissues tested. A total of 153 genes (12%) presented highly similar expression levels in all tissues. A virtual profile matrix was constructed and the expression profiles were validated by real-time PCR. The expression data presented can aid in assigning function for the sugarcane genes and be useful for promoter characterization of this and other economically important grasses.

Repositorio da Producao Cientifica e Intelectual da Unicamp

Role of σ54 in the regulation of genes involved in type I and type IV pili biogenesis in Xylella fastidiosa

Author: AA Souza de
AH Purcell
AJ Simpson
CA Ball
Cecília M. Abe
D Bhaya
D Parker
DJ Studholme
DJ Studholme
EA Lang
H Barrios
H Feil
H Gil
H Rakotoarivonina
H Towbin
IL Grigorova
J Pizarro-Cerdá
JF da Silva Neto
JF da Silva Neto
JM Wells
José F. da Silva Neto
JS Mattick
KJ Livak
KL Newman
KS Ishimoto
KS Ishimoto
L Craig
L La Fuente De
L Reitzer
LC Souza
LM Moreira
M Buck
M Wolfgang
MA Sluys Van
Marilis V. Marques
MB Smolka
MJ Davis
MJ Merrick
MJ Merrick
MMSM Wosten
MR Guilhabert
MT Villar
P Gaurivaud
PA Totten
PB Monteiro
RA Alm
RR Burgess
RZN Vencio
S Campoy
S Graupner
SS Wu
Suely L. Gomes
T Koide
T Koide
Tie Koide
TM Gruber
Y Kang
Y Li
Y Meng
YH Yang
Publication venue: Springer Science and Business Media LLC

Crossref

An expanded evaluation of protein function prediction methods shows an improvement in accuracy

Author: Almeida-e-Silva DC
Altenhoff A
Babbitt PC
Bankapur AR
Bargsten JW
Ben-Hur A
Benso A
Bhat P
Bonneau R
Brenner SE
Bryson K
Cao R
Casadio R
Cejuela JM
Chapman S
Chen CT
Cheng J
Cibrian-Uhalte E
Clark WT
Cozzetto D
D'Andrea D
Das S
Dawson NL
del Pozo A
Denny P
Dessimoz C
Di Carlo S
Dijk ADJ
Dogan T
Dukka BKC
Elshal Sarah
Falda M
Fang H
Feng S
Fernandez JM
Ferrari C
Fontana P
Foulger RE
Friedberg I
Funk CS
Gabaldon T
Gemovic B
Gillis J
Ginter F
Giollo M
Glisic S
Goldberg T
Gong Q
Gough J
Greene CS
Hakala K
Hamp T
Hieta R
Holm L
Hsu WL
Huntley RP
Jiang Y
Jones DT
Kaewphan S
Kahanda I
Kansakar L
Khan IK
Kihara D
Koo E
Koskinen P
Lavezzo E
Lee D
Lees JG
Legge D
Lepore R
Li B
Lin A
Linial M
Lovering RC
Magrane M
Maietta P
Marcet-Houben M
Martelli PL
Martin MJ
Mehryary F
Melidoni AN
Mesiti M
Minneci F
Mooney SD
Moreau Yves
Mutowo-Meullenet P
Nepusz T
Ning W
O'Donovan C
Oates M
Ofer D
Orengo CA
Oron TR
Paccanaro A
Pavlidis P
Penfold-Brown D
Perovic V
Pichler K
Piovesan D
Politano G
Profiti G
Radivojac P
Rappoport N
Re M
Richter L
Robinson PN
Romero AE
Rost B
Sahraeian SME
Salakoski T
Salamov A
Sasidharan R
Savino A
Sedeno-Cortes AE
Sharan M
Shasha D
Shypitsyna A
Sillitoe I
Skunca N
Smithers B
Stern A
Sternberg MJE
Supek F
Tian W
Toppo S
Toronen P
Tosatto S
Tramontano A
Tranchevent Léon-Charles
Tress ML
Ur Rehman H
Valencia A
Valentini G
Veljkovic N
Veljkovic V
Vencio RZN
Verspoor KM
Vogel J
Vucetic S
Wang Z
Wass MN
Yang H
Youngs N
Zakeri Pooya
Zhang S
Zhong Z
Zhou Y
Publication venue: BioMed Central
Publication date: 01/09/2016

BACKGROUND: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. RESULTS: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. CONCLUSIONS: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent. Submitted to Genome Biology. Status: published.

Lirias