arXiv.org e-Print Archive

CERN Document Server

Simcluster: clustering enumeration gene expression data on the simplex space

Author: Carlos A de B Pereira
E Dougherty
G Stolovitzky
H Thygesen
Helena Brentani
I Braslavsky
Ilya Shmulevich
J Aitchison
J Aitchison
K Okubo
L Cai
L Hood
Leonardo Varuzza
M Bainbridge
M Brun
M de Hoon
M Gilchrist
M Margulies
M Schena
N Bolshakova
R Loganantharaj
R Page
R Vencio
R Vencio
RF Service
Ricardo ZN Vêncio
S Audic
S Brenner
S Datta
S Fodor
T Seo
V Velculescu
Publication date: 01/01/2007

Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. Like other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, on which hybridization-based microarray data are often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space.

Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster.

Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
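To illustrate why distances on the simplex differ from Euclidean ones, here is a minimal Python sketch of the centered log-ratio (clr) transform and the Aitchison distance from compositional data analysis. It is not Simcluster's code: the function names and the optional `pseudocount` parameter for zero counts are assumptions made for this example.

```python
import math


def closure(counts, pseudocount=0.0):
    """Rescale counts to proportions that sum to 1 (a point on the simplex).

    `pseudocount` is a hypothetical convenience for zero counts; it is an
    assumption of this sketch, not a description of how Simcluster treats zeros.
    """
    shifted = [c + pseudocount for c in counts]
    total = sum(shifted)
    return [c / total for c in shifted]


def clr(composition):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(x) for x in composition]  # raises ValueError on zeros
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def aitchison_distance(x, y, pseudocount=0.0):
    """Euclidean distance between the clr-transformed compositions."""
    cx = clr(closure(x, pseudocount))
    cy = clr(closure(y, pseudocount))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))


# Two libraries with the same relative abundances but different sequencing
# depth are at distance (numerically) zero, unlike under raw Euclidean distance.
shallow = [10, 20, 70]
deep = [100, 200, 700]
print(aitchison_distance(shallow, deep))
```

A matrix of such pairwise distances could then be passed to any standard hierarchical or partitioning clustering routine.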

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Repositório da Produção USP (Univ. de São Paulo)

Nature Precedings

ProbFAST: Probabilistic Functional Analysis System Tool

Author: A Ahmed
A van Kampen
B Graveley
C Johnstone
C Jones
C Romualdi
C Romualdi
C Suzuki
D Murray
E Ojima
F Rojo
F Sigoillot
Greice A Molfetta
H Li
Israel T Silva
J Lu
J Pylouster
J Rae
J Wixon
K Baggerly
K Komatsu
M Ashburner
M Howe
M Kashani-Sabet
M Schena
P Dy
P Phadke
R Vêncio
R Vêncio
R Vêncio
Ricardo ZN Vêncio
S Brenner
S Lee
T Barrett
T Fawcett
Thiago YK Oliveira
V Velculescu
V Velculescu
W Meehan
Wilson A Silva
X Cui
X Jiang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The post-genomic era has brought new challenges regarding the understanding of the organization and function of the human genome. Many of these challenges are centered on the meaning of differential gene regulation under distinct biological conditions and can be addressed by analyzing the Multiple Differential Expression (MDE) of genes associated with normal and abnormal biological processes. Currently, MDE analyses are limited to the usual methods of differential expression, initially designed for paired analysis. Results: We proposed a web platform named ProbFAST for MDE analysis, which uses Bayesian inference to identify key genes that are intuitively prioritized by means of probabilities. A simulation study revealed that our method performs better than other approaches, and when applied to public expression data, we demonstrated its flexibility to obtain relevant genes biologically associated with normal and abnormal biological processes. Conclusions: ProbFAST is a freely accessible web-based application that enables MDE analysis on a global scale. It offers an efficient methodological approach for MDE analysis of a set of genes that are turned on and off related to functional information during the evolution of a tumor or tissue differentiation. The ProbFAST server can be accessed at http://gdm.fmrp.usp.br/probfast.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Repositório da Produção USP (Univ. de São Paulo)

Basic properties and information theory of Audic-Claverie statistic for analyzing cDNA arrays

Author: C Lin
C Medina
D Stekel
G Cervigni
H Kim
J Borecký
J Miles
J Ruijter
L Varuzza
M Metta
N Ge
Peter Tiňo
R Evans
R Morin
S Audic
S Bortoluzzi
S Brenner
V Velculescu
Y Zhao
Publication venue: Springer Science and Business Media LLC

Maastricht University Research Portal

Circulating tumor DNA guided adjuvant chemotherapy in stage II colon cancer (MEDOCC-CrEATE): study protocol for a trial within a cohort study

Author: Bosch L J W
Coupé V M H
Elias S
Fijneman R J A
Koopman M
Laclé M M
Meijer G A
Phallen J
Rubio Alarcón C
Sausen M
Schraa S J
Simmons J
van den Broek D
van der Kruijssen D E W
van Grevenstein W M U
van Rooijen K L
Velculescu V E
Verkooijen H M
Vink G R
Publication venue: Springer Science and Business Media LLC
Publication date: 20/08/2020

BACKGROUND: Accurate detection of patients with minimal residual disease (MRD) after surgery for stage II colon cancer (CC) remains an urgent unmet clinical need to improve selection of patients who might benefit from adjuvant chemotherapy (ACT). Presence of circulating tumor DNA (ctDNA) is indicative of MRD and has high predictive value for recurrent disease. The MEDOCC-CrEATE trial investigates how many stage II CC patients with detectable ctDNA after surgery will accept ACT and whether ACT reduces the risk of recurrence in these patients. METHODS/DESIGN: MEDOCC-CrEATE follows the 'trial within cohorts' (TwiCs) design. Patients with colorectal cancer (CRC) are included in the Prospective Dutch ColoRectal Cancer cohort (PLCRC) and give informed consent for collection of clinical data, tissue and blood samples, and consent for future randomization. MEDOCC-CrEATE is a subcohort within PLCRC consisting of 1320 stage II CC patients without indication for ACT according to current guidelines, who are randomized 1:1 into an experimental and a control arm. In the experimental arm, post-surgery blood samples and tissue are analyzed for tissue-informed detection of plasma ctDNA, using the PGDx elio™ platform. Patients with detectable ctDNA will be offered ACT consisting of 8 cycles of capecitabine plus oxaliplatin, while patients without detectable ctDNA and patients in the control group will receive standard follow-up according to the guideline. The primary endpoint is the proportion of patients receiving ACT when ctDNA is detectable after resection. The main secondary outcome is the 2-year recurrence rate (RR); further secondary outcomes include 5-year RR, disease-free survival, overall survival, time to recurrence, quality of life and cost-effectiveness. Data will be analyzed by intention to treat.
DISCUSSION: The MEDOCC-CrEATE trial will provide insight into the willingness of stage II CC patients to be treated with ACT guided by ctDNA biomarker testing and whether ACT will prevent recurrences in a high-risk population. Use of the TwiCs design provides the opportunity to randomize patients before ctDNA measurement, avoiding ethical dilemmas of ctDNA status disclosure in the control group. TRIAL REGISTRATION: Netherlands Trial Register: NL6281/NTR6455. Registered 18 May 2017, https://www.trialregister.nl/trial/6281

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Microarrays for global expression constructed with a low redundancy set of 27,500 sequenced cDNAs representing an array of developmental stages and physiological conditions of the soybean plant

Author: A Marshall
B Boeckmann
B Ewing
B Ewing
C-S Wang
CH Wu
DL Wheeler
DR McCarty
E Shoop
F Thibaud-Nissen
G Stacey
GZ Zabala
JC Hong
JL DeRisi
KAT Silverstein
KAT Silverstein
MD Schena
MF Bonaldo
P Hedge
R Shoemaker
SF Altschul
T Maniatis
TL Maguire
V Walbot
VE Velculescu
X Huang
Publication venue: BioMed Central
Publication date: 01/01/2004

BACKGROUND: Microarrays are an important tool with which to examine coordinated gene expression. Soybean (Glycine max) is one of the most economically valuable crop species in the world food supply. In order to accelerate both gene discovery and hypothesis-driven research in soybean, global expression resources needed to be developed. The applications of microarrays for determining patterns of expression in different tissues or during conditional treatments by dual labeling of the mRNAs are unlimited. In addition, discovery of the molecular basis of traits through examination of naturally occurring variation in hundreds of mutant lines could be enhanced by the construction and use of soybean cDNA microarrays. RESULTS: We report the construction and analysis of a low redundancy 'unigene' set of 27,513 clones that represent a variety of soybean cDNA libraries made from a wide array of source tissue and organ systems, developmental stages, and stress or pathogen-challenged plants. The set was assembled from the 5' sequence data of the cDNA clones using cluster analysis programs. The selected clones were then physically reracked and sequenced at the 3' end. In order to increase gene discovery from immature cotyledon libraries that contain abundant mRNAs representing storage protein gene families, we utilized a high density filter normalization approach to preferentially select more weakly expressed cDNAs. All 27,513 cDNA inserts were amplified by polymerase chain reaction. The amplified products, along with some repetitively spotted control or 'choice' clones, were used to produce three 9,728-element microarrays that have been used to examine tissue specific gene expression and global expression in mutant isolines. CONCLUSIONS: Global expression studies will be greatly aided by the availability of the sequence-validated and low redundancy cDNA sets described in this report.
These cDNAs and ESTs represent a wide array of developmental stages and physiological conditions of the soybean plant. We also demonstrate that the quality of the data from the soybean cDNA microarrays is sufficiently reliable to examine isogenic lines that differ with respect to a mutant phenotype and thereby to define a small list of candidate genes potentially encoding or modulated by the mutant phenotype.

Springer - Publisher Connector

OpenKnowledge@NAU

Public Library of Science (PLOS)

eScholarship@McGill

Unifying Gene Expression Measures from Multiple Platforms Using Factor Analysis

Author: B Efron
D Bartholomew
Elizabeth Purdom
F Collins
G Smyth
G Smyth
Illumina
J Bullard
J Marioni
K Mardia
L Shi
MD Robinson
Network TCGA
P 't Hoen
P Warnat
Paul T. Spellman
R Scharpf
R Verhaak
R Verhaak
RA Irizarry
Roel G. W. Verhaak
S Hochreiter
S Monti
S Saha
Stein Aerts
Terence P. Speed
V Velculescu
Xin Victoria Wang
Y Benjamini
Publication venue: Public Library of Science
Publication date: 01/01/2011

In the Cancer Genome Atlas (TCGA) project, gene expression of the same set of samples is measured multiple times on different microarray platforms. There are two main advantages to combining these measurements. First, we have the opportunity to obtain a more precise and accurate estimate of expression levels than using the individual platforms alone. Second, the combined measure simplifies downstream analysis by eliminating the need to work with three sets of expression measures and to consolidate results from the three platforms.

CiteSeerX

University of Melbourne Institutional Repository

The 3-Base Periodicity and Codon Usage of Coding Sequences Are Correlated with Gene Expression at the Level of Transcription Elongation

Author: A Marin
A Saunders
AA Tsonis
AI Nesvizhskii
B Futcher
B Irwin
C Yin
C Yin
D Zenklusen
DE Knuth
Edoardo Trotta
EN Trifonov
FC Holstege
FE Frenkel
G Cannarozzi
G Gutierrez
G Kudla
Grzegorz Kudla
H Saeki
J Dekker
J Grigull
JB Plotkin
JC Shepherd
JE Pérez-Ortín
K Juneau
KA Dittmar
M Bulmer
M Kertesz
N Stoletzki
NT Ingolia
OI Kulaeva
P Lu
PM Sharp
PM Sharp
R Hershberg
R Simic
S Boycheva
S Ghaemmaghami
S Tiwari
SG Andersson
ST Eskesen
T Ikemura
T Ikemura
V Epshtein
V Pelechano
V Pelechano
VE Velculescu
Y Arava
Y Ponty
Y Wang
Y Zhao
YY Waldman
Publication venue: Public Library of Science
Publication date: 01/01/2011

Background: Gene transcription is regulated by DNA transcriptional regulatory elements, promoters and enhancers that are located outside the coding regions. Here, we examine the characteristic 3-base periodicity of the coding sequences and analyse its correlation with the genome-wide transcriptional profile of yeast. Principal Findings: The analysis of coding sequences by a new class of indices proposed here identified two different sources of 3-base periodicity: the codon frequency and the codon sequence. In exponentially growing yeast cells, the codon-frequency component of periodicity accounts for 71.9 % of the variability of the cellular mRNA by a strong association with the density of elongating mRNA polymerase II complexes. The mRNA abundance explains most of the correlation between the codon-frequency component of periodicity and protein levels. Furthermore, pyrimidine-ending codons of the four-fold degenerate small amino acids alanine, glycine and valine are associated with genes with double the transcription rate of those associated with purine-ending codons. Conclusions: We demonstrate that the 3-base periodicity of coding sequences is higher than expected by the codon usage frequency (CUF) and that its components, associated with codon bias and amino acid composition, are correlated with gene expression, principally at the level of transcription elongation. This indicates a role of codon sequences in maximising the transcription efficiency in exponentially growing yeast cells. Moreover, the results contrast with the common Darwinia
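As a point of reference for what "3-base periodicity" measures, the following Python sketch computes a generic spectral quantity: the Fourier power at frequency 1/3 of the four base-indicator sequences. This is a common textbook-style measure and not the new class of indices proposed in the paper, which the abstract does not specify. The function name and the normalization by the squared sequence length are assumptions of this example.

```python
import cmath


def period3_power(seq):
    """Fourier power at frequency 1/3, summed over the A, C, G, T indicators.

    Assumption of this sketch: the result is divided by len(seq)**2, so a
    perfectly repeating codon built from three distinct bases scores 1/3.
    Characters other than A, C, G, T contribute nothing to the sums but still
    count toward the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    total = 0.0
    for base in "ACGT":
        # Sum the phase factor at every position where this base occurs.
        component = sum(
            cmath.exp(-2j * cmath.pi * k / 3)
            for k, b in enumerate(seq)
            if b == base
        )
        total += abs(component) ** 2
    return total / n ** 2


# A strict codon repeat is maximally periodic; a homopolymer has no
# period-3 signal because the phase factors cancel over each full cycle.
print(period3_power("ATG" * 10))
print(period3_power("A" * 30))
```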

CiteSeerX

Public Library of Science (PLOS)

Identification of two novel CT antigens and their capacity to elicit antibody response in hepatocellular carcinoma patients

Author: AF Kirkin
AJ Zendman
AL Menke
AM Bamberger
AO Güre
AO Güre
C Heyting
C Olesen
D Brett
D Jager
D Lopez
E De Plaen
GD Schuler
H Chen
H Sugiyama
H Towbin
H-Y Wu
K Okuda
O Tureci
P Giorgio
SJ Aylwin
T Baba
T Maehama
TD Schmittgen
U Sahin
V Martelange
VE Velculescu
W-F Chen
X-A Yang
X-P Qian
X-W Pang
X-Y Dong
Y Oka
Y Wang
Y Wu
Y-R Su
YT Chen
YT Chen
Publication venue: Nature Publishing Group
Publication date: 01/01/2003

FATE and TPTE genes were originally reported to be specifically expressed in the adult testis. We searched the Unigene and serial analysis of gene expression (SAGE) databases, which implied that these two gene transcripts might also be expressed in tumours. Herein, we demonstrated that FATE and TPTE mRNA transcripts were expressed in different histological types of tumours and normal testis. Both are cancer-testis (CT) antigens and were renamed FATE/BJ-HCC-2 and TPTE/BJ-HCC-5, respectively. At the nucleotide sequence level, the FATE/BJ-HCC-2 cDNA was identical to that of FATE, whereas TPTE/BJ-HCC-5 was found to have two isoforms in both cancers and testis: one was identical in cDNA sequence to TPTE, encoding a protein of 551 amino acids, and the other variant lacked an exon of 54 bp, encoding a protein of 533 amino acids. The mRNA expression was analysed by RT-PCR and real-time PCR. FATE/BJ-HCC-2 mRNA was detected in 66% (41 out of 62) of hepatocellular carcinoma (HCC) samples and 21% (three out of 14) of colon cancer samples, whereas the TPTE/BJ-HCC-5 mRNA was detected in 39% (24 out of 62) and 36% (five out of 14) of HCC and non-small cell lung cancer samples, respectively. The recombinant proteins were prepared and the reactivity of allogenic sera to these two antigens was screened. The frequency of antibody response against FATE/BJ-HCC-2 and TPTE/BJ-HCC-5 proteins was 7.3% (three out of 41) and 25.0% (six out of 24), respectively, in HCC patients bearing the respective gene transcripts. Therefore, FATE/BJ-HCC-2 and TPTE/BJ-HCC-5 are novel CT antigens capable of eliciting antibody response in cancer patients.
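The detection and antibody-response frequencies quoted in this abstract can be recomputed from the counts it gives. The short Python sketch below does that arithmetic; the helper name `percent` and the labels in the table are illustrative assumptions, while the counts are taken from the abstract.

```python
def percent(hits, total, digits=0):
    """Return hits/total as a percentage rounded to `digits` decimal places."""
    return round(100.0 * hits / total, digits)


# (label, hits, total, decimal places used in the abstract)
reported = [
    ("FATE/BJ-HCC-2 mRNA in HCC", 41, 62, 0),
    ("FATE/BJ-HCC-2 mRNA in colon cancer", 3, 14, 0),
    ("TPTE/BJ-HCC-5 mRNA in HCC", 24, 62, 0),
    ("TPTE/BJ-HCC-5 mRNA in lung cancer", 5, 14, 0),
    ("Antibody response to FATE/BJ-HCC-2", 3, 41, 1),
    ("Antibody response to TPTE/BJ-HCC-5", 6, 24, 1),
]

for label, hits, total, digits in reported:
    print(f"{label}: {hits}/{total} = {percent(hits, total, digits)}%")
```

The recomputed values agree with the percentages stated in the abstract (66%, 21%, 39%, 36%, 7.3% and 25.0%).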