Zipf's Law in Gene Expression
Using data from gene expression databases on various organisms and tissues,
including yeast, nematodes, human normal and cancer tissues, and embryonic stem
cells, we found that the abundances of expressed genes exhibit a power-law
distribution with an exponent close to -1, i.e., they obey Zipf's law.
Furthermore, by simulations of a simple model with an intra-cellular reaction
network, we found that Zipf's law of chemical abundance is a universal feature
of cells where such a network optimizes the efficiency and faithfulness of
self-reproduction. These findings provide novel insights into the nature of the
organization of reaction dynamics in living cells. Comment: revtex, 11 pages, 3 figures, submitted to Phys. Rev. Let
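As a rough illustration of the rank-abundance form of Zipf's law described in this abstract, the sketch below fits the slope of log(abundance) against log(rank) by ordinary least squares. The function name `rank_abundance_slope` and the synthetic data are assumptions made for illustration; they are not taken from the paper, which may have estimated the exponent differently.

```python
import math

def rank_abundance_slope(abundances):
    """Least-squares slope of log(abundance) against log(rank).

    Hypothetical helper for illustration; not the authors' procedure.
    """
    # Sort in descending order so that rank 1 is the most abundant gene.
    values = sorted((a for a in abundances if a > 0), reverse=True)
    if len(values) < 2:
        raise ValueError("need at least two positive abundances")
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic abundances that follow Zipf's law exactly: abundance = C / rank.
synthetic = [1000.0 / rank for rank in range(1, 501)]
slope = rank_abundance_slope(synthetic)
print(f"fitted exponent: {slope:.3f}")
```

On real expression data the fit would normally be restricted to the range of ranks where the log-log plot is approximately linear, since the lowest-abundance genes are dominated by counting noise.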
Simcluster: clustering enumeration gene expression data on the simplex space
Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. As with other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, on which hybridization-based microarray data is often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques since they ignore certain fundamental properties of this space.

Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster.

Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
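To make the simplex constraint concrete, here is a minimal sketch of one standard tool from compositional data analysis: the centered log-ratio (clr) transform and the Aitchison distance built on it. This is not Simcluster's code, and the abstract does not say which distance or zero-handling Simcluster uses; the function names, the example tag counts, and the `pseudocount` parameter are all assumptions for illustration.

```python
import math

def closure(counts, pseudocount=0.5):
    """Rescale counts to proportions that sum to 1 (a point on the simplex).

    The pseudocount is an assumed, simplistic way to avoid log(0).
    """
    shifted = [c + pseudocount for c in counts]
    total = sum(shifted)
    return [s / total for s in shifted]

def clr(composition):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(counts_a, counts_b, pseudocount=0.5):
    """Euclidean distance between the clr-transformed compositions."""
    za = clr(closure(counts_a, pseudocount))
    zb = clr(closure(counts_b, pseudocount))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(za, zb)))

# Hypothetical tag counts for three genes in three libraries.
library_a = [120, 30, 850]
library_b = [1200, 300, 8500]   # same proportions as library_a, 10x the depth
library_c = [500, 400, 100]

print(aitchison_distance(library_a, library_b, pseudocount=0.0))
print(aitchison_distance(library_a, library_c))
```

With no pseudocount, the distance between `library_a` and `library_b` is zero up to floating-point error, because only the proportions matter. A plain Euclidean distance on the raw counts would instead report the two libraries as far apart, which is the kind of mismatch the abstract warns about.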
ProbFAST: Probabilistic Functional Analysis System Tool
BACKGROUND: The post-genomic era has brought new challenges regarding the understanding of the organization and function of the human genome. Many of these challenges are centered on the meaning of differential gene regulation under distinct biological conditions and can be addressed by analyzing the Multiple Differential Expression (MDE) of genes associated with normal and abnormal biological processes. Currently, MDE analyses are limited to the usual methods of differential expression initially designed for paired analysis. RESULTS: We proposed a web platform named ProbFAST for MDE analysis, which uses Bayesian inference to identify key genes that are intuitively prioritized by means of probabilities. A simulation study revealed that our method performs better than other approaches, and when applied to public expression data, we demonstrated its flexibility in obtaining relevant genes biologically associated with normal and abnormal biological processes. CONCLUSIONS: ProbFAST is a freely accessible web-based application that enables MDE analysis on a global scale. It offers an efficient methodological approach for MDE analysis of a set of genes that are turned on and off, related to functional information, during the evolution of a tumor or tissue differentiation. The ProbFAST server can be accessed at http://gdm.fmrp.usp.br/probfast.
Circulating tumor DNA guided adjuvant chemotherapy in stage II colon cancer (MEDOCC-CrEATE): study protocol for a trial within a cohort study
BACKGROUND: Accurate detection of patients with minimal residual disease (MRD) after surgery for stage II colon cancer (CC) remains an urgent unmet clinical need to improve selection of patients who might benefit from adjuvant chemotherapy (ACT). Presence of circulating tumor DNA (ctDNA) is indicative of MRD and has high predictive value for recurrent disease. The MEDOCC-CrEATE trial investigates how many stage II CC patients with detectable ctDNA after surgery will accept ACT and whether ACT reduces the risk of recurrence in these patients. METHODS/DESIGN: MEDOCC-CrEATE follows the 'trial within cohorts' (TwiCs) design. Patients with colorectal cancer (CRC) are included in the Prospective Dutch ColoRectal Cancer cohort (PLCRC) and give informed consent for collection of clinical data, tissue and blood samples, and consent for future randomization. MEDOCC-CrEATE is a subcohort within PLCRC consisting of 1320 stage II CC patients without indication for ACT according to current guidelines, who are randomized 1:1 into an experimental and a control arm. In the experimental arm, post-surgery blood samples and tissue are analyzed for tissue-informed detection of plasma ctDNA, using the PGDx elio™ platform. Patients with detectable ctDNA will be offered ACT consisting of 8 cycles of capecitabine plus oxaliplatin, while patients without detectable ctDNA and patients in the control group will receive standard follow-up according to the guideline. The primary endpoint is the proportion of patients receiving ACT when ctDNA is detectable after resection. The main secondary outcome is the 2-year recurrence rate (RR); other secondary outcomes include 5-year RR, disease-free survival, overall survival, time to recurrence, quality of life and cost-effectiveness. Data will be analyzed by intention to treat.
DISCUSSION: The MEDOCC-CrEATE trial will provide insight into the willingness of stage II CC patients to be treated with ACT guided by ctDNA biomarker testing and whether ACT will prevent recurrences in a high-risk population. Use of the TwiCs design provides the opportunity to randomize patients before ctDNA measurement, avoiding ethical dilemmas of ctDNA status disclosure in the control group. TRIAL REGISTRATION: Netherlands Trial Register: NL6281/NTR6455. Registered 18 May 2017, https://www.trialregister.nl/trial/6281
Microarrays for global expression constructed with a low redundancy set of 27,500 sequenced cDNAs representing an array of developmental stages and physiological conditions of the soybean plant
BACKGROUND: Microarrays are an important tool with which to examine coordinated gene expression. Soybean (Glycine max) is one of the most economically valuable crop species in the world food supply. In order to accelerate both gene discovery as well as hypothesis-driven research in soybean, global expression resources needed to be developed. The applications of microarray for determining patterns of expression in different tissues or during conditional treatments by dual labeling of the mRNAs are unlimited. In addition, discovery of the molecular basis of traits through examination of naturally occurring variation in hundreds of mutant lines could be enhanced by the construction and use of soybean cDNA microarrays. RESULTS: We report the construction and analysis of a low redundancy 'unigene' set of 27,513 clones that represent a variety of soybean cDNA libraries made from a wide array of source tissue and organ systems, developmental stages, and stress or pathogen-challenged plants. The set was assembled from the 5' sequence data of the cDNA clones using cluster analysis programs. The selected clones were then physically reracked and sequenced at the 3' end. In order to increase gene discovery from immature cotyledon libraries that contain abundant mRNAs representing storage protein gene families, we utilized a high density filter normalization approach to preferentially select more weakly expressed cDNAs. All 27,513 cDNA inserts were amplified by polymerase chain reaction. The amplified products, along with some repetitively spotted control or 'choice' clones, were used to produce three 9,728-element microarrays that have been used to examine tissue specific gene expression and global expression in mutant isolines. CONCLUSIONS: Global expression studies will be greatly aided by the availability of the sequence-validated and low redundancy cDNA sets described in this report. 
These cDNAs and ESTs represent a wide array of developmental stages and physiological conditions of the soybean plant. We also demonstrate that the quality of the data from the soybean cDNA microarrays is sufficiently reliable to examine isogenic lines that differ with respect to a mutant phenotype and thereby to define a small list of candidate genes potentially encoding or modulated by the mutant phenotype.
Unifying Gene Expression Measures from Multiple Platforms Using Factor Analysis
In the Cancer Genome Atlas (TCGA) project, gene expression of the same set of samples is measured multiple times on different microarray platforms. There are two main advantages to combining these measurements. First, we have the opportunity to obtain a more precise and accurate estimate of expression levels than using the individual platforms alone. Second, the combined measure simplifies downstream analysis by eliminating the need to work with three sets of expression measures and to consolidate results from the three platforms.
The 3-Base Periodicity and Codon Usage of Coding Sequences Are Correlated with Gene Expression at the Level of Transcription Elongation
Background: Gene transcription is regulated by DNA transcriptional regulatory elements, promoters and enhancers that are located outside the coding regions. Here, we examine the characteristic 3-base periodicity of the coding sequences and analyse its correlation with the genome-wide transcriptional profile of yeast. Principal Findings: The analysis of coding sequences by a new class of indices proposed here identified two different sources of 3-base periodicity: the codon frequency and the codon sequence. In exponentially growing yeast cells, the codon-frequency component of periodicity accounts for 71.9 % of the variability of the cellular mRNA by a strong association with the density of elongating mRNA polymerase II complexes. The mRNA abundance explains most of the correlation between the codon-frequency component of periodicity and protein levels. Furthermore, pyrimidine-ending codons of the four-fold degenerate small amino acids alanine, glycine and valine are associated with genes with double the transcription rate of those associated with purine-ending codons. Conclusions: We demonstrate that the 3-base periodicity of coding sequences is higher than expected by the codon usage frequency (CUF) and that its components, associated with codon bias and amino acid composition, are correlated with gene expression, principally at the level of transcription elongation. This indicates a role of codon sequences in maximising the transcription efficiency in exponentially growing yeast cells. Moreover, the results contrast with the common Darwinia
Identification of two novel CT antigens and their capacity to elicit antibody response in hepatocellular carcinoma patients
FATE and TPTE genes were originally reported to be specifically expressed in the adult testis. We searched the Unigene and serial analysis of gene expression (SAGE) databases, and the results implied that these two gene transcripts might also be expressed in tumours. Herein, we demonstrated that FATE and TPTE mRNA transcripts were expressed in different histological types of tumours and normal testis. Both are cancer-testis (CT) antigens and were renamed FATE/BJ-HCC-2 and TPTE/BJ-HCC-5, respectively. At the nucleotide sequence level, the FATE/BJ-HCC-2 cDNA was identical to that of FATE, whereas TPTE/BJ-HCC-5 was found to have two isoforms in both cancers and testis: one was identical in cDNA sequence to TPTE, encoding a protein of 551 amino acids, and the other variant lacked an exon of 54 bp, encoding a protein of 533 amino acids. The mRNA expression was analysed by RT-PCR and real-time PCR. FATE/BJ-HCC-2 mRNA was detected in 66% (41 out of 62) of hepatocellular carcinoma (HCC) samples and 21% (three out of 14) of colon cancer samples, whereas the TPTE/BJ-HCC-5 mRNA was detected in 39% (24 out of 62) and 36% (five out of 14) of HCC and non-small lung cancer samples, respectively. The recombinant proteins were prepared and the reactivity of allogenic sera to these two antigens was screened. The frequency of antibody response against FATE/BJ-HCC-2 and TPTE/BJ-HCC-5 proteins was 7.3% (three out of 41) and 25.0% (six out of 24), respectively, in HCC patients bearing the respective gene transcripts. Therefore, FATE/BJ-HCC-2 and TPTE/BJ-HCC-5 are novel CT antigens capable of eliciting antibody response in cancer patients.