424 research outputs found
Editorial: THE RICH DIVERSITY OF GENOMICS—A REPORT ON THE ‘COMPARATIVE AND FUNCTIONAL GENOMICS (BITS) WORKSHOP’, HINXTON, UK, 27–30 OCTOBER 2005
Identification of Cancer Related Genes Using a Comprehensive Map of Human Gene Expression
Rapid accumulation and availability of gene expression datasets in public repositories have enabled large-scale meta-analyses of combined data. The richness of cross-experiment data has provided new biological insights, including identification of new cancer genes. In this study, we compiled a human gene expression dataset from ∼40,000 publicly available Affymetrix HG-U133Plus2 arrays. After strict quality control and data normalisation, the data was quantified in an expression matrix of ∼20,000 genes and ∼28,000 samples. To enable different ways of sample grouping, existing annotations were subjected to systematic ontology-assisted categorisation and manual curation. Groups like normal tissues, neoplastic tissues, cell lines, homoeotic cells and incompletely differentiated cells were created. Unsupervised analysis of the data confirmed a global structure of expression consistent with earlier analyses, with more detail revealed by the increased resolution. A suitable mixed-effects linear model was used to further investigate gene expression in solid tissue tumours, and to compare these with the respective healthy solid tissues. The analysis identified 1,285 genes with systematic expression change in cancer. The list is significantly enriched with known cancer genes from large, public, peer-reviewed databases, whereas the remaining ones are proposed as new cancer gene candidates. The compiled dataset is publicly available in the ArrayExpress Archive. It contains the most diverse collection of biological samples, making it the largest systematically annotated gene expression dataset of its kind in the public domain.
clustComp, a Bioconductor package for the comparison of clustering results
clustComp is an open source Bioconductor package that implements different techniques for the comparison of two gene expression clustering results. These include flat versus flat and hierarchical versus flat comparisons. The visualization of the similarities is provided by means of a bipartite graph, whose layout is heuristically optimized. Its flexibility allows a suitable visualization for both small and large datasets. This work was supported by the Ramón Areces Foundation.
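The flat versus flat comparison described above can be illustrated with a small sketch. This is not clustComp's own API (clustComp is an R package); it is a minimal Python illustration, assuming that each edge of the bipartite graph connects one cluster from each result and is weighted by the number of genes the two clusters share. The function name `contingency_edges` and the toy labels are hypothetical.

```python
from collections import Counter


def contingency_edges(labels_a, labels_b):
    """Count how many items each pair of clusters (one per result) shares.

    Each key (a, b) can be read as an edge of a bipartite graph between
    cluster a of the first result and cluster b of the second result;
    the value is the edge weight (number of shared items).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    return Counter(zip(labels_a, labels_b))


# Toy example: six genes clustered in two different ways (made-up labels).
labels_a = ["A1", "A1", "A2", "A2", "A2", "A3"]
labels_b = ["B1", "B1", "B1", "B2", "B2", "B2"]

edges = contingency_edges(labels_a, labels_b)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

Pairs that never co-occur are simply absent from the counter, so only clusters with at least one shared gene would be connected in a drawing of the graph.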
Predicting Gene Regulatory Elements in Silico on a Genomic Scale
We performed a systematic analysis of gene upstream regions in the yeast genome for occurrences of regular expression-type patterns with the goal of identifying potential regulatory elements. To achieve this goal, we have developed a new sequence pattern discovery algorithm that searches exhaustively for a priori unknown regular expression-type patterns that are over-represented in a given set of sequences. We applied the algorithm in two cases: (1) discovery of patterns in the complete set of >6000 sequences taken upstream of the putative yeast genes and (2) discovery of patterns in the regions upstream of the genes with similar expression profiles. In the first case, we looked for patterns that occur more frequently in the gene upstream regions than in the genome overall. In the second case, we first clustered the upstream regions of all the genes by similarity of their expression profiles on the basis of publicly available gene expression data and then looked for sequence patterns that are over-represented in each cluster. In both cases we considered each pattern that occurred in at least some minimum number of sequences, and rated them on the basis of their over-representation. Among the highest rating patterns, most have matches to substrings in known yeast transcription factor-binding sites. Moreover, several of them are known to be relevant to the expression of the genes from the respective clusters. Experiments on simulated data show that the majority of the discovered patterns are not expected to occur by chance.
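The idea of rating a pattern by its over-representation in a cluster can be sketched as follows. The abstract does not state which statistic the authors used, so the binomial tail probability below is an assumption chosen for illustration, as are the function names, the example pattern and the toy sequences. The sketch scores a single given pattern; it does not perform the exhaustive pattern search described above.

```python
import re
from math import comb


def count_matches(pattern, sequences):
    """Number of sequences containing at least one match of the pattern."""
    rx = re.compile(pattern)
    return sum(1 for seq in sequences if rx.search(seq))


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def over_representation(pattern, cluster, background):
    """Return (matches in cluster, tail probability under background rate).

    Assumption: the background match frequency is used as the per-sequence
    match probability; a small tail probability suggests over-representation.
    """
    k = count_matches(pattern, cluster)
    p = count_matches(pattern, background) / len(background)
    return k, binomial_tail(k, len(cluster), p)


# Toy data (made up): four "upstream regions" in a cluster, plus eight
# further background sequences that do not contain the pattern.
cluster = ["TTACGCGTAA", "GGACGAGTCC", "ACGCGTTTGA", "TTTTGGGGCC"]
background = cluster + [
    "AAAATTTTCC", "GGGGCCCCAA", "TTGACCATGA", "CCATGGTTAA",
    "ATATATATAT", "GCGCGCGCGC", "TTTTAAAACC", "CAGTCAGTCA",
]

k, p_value = over_representation("ACG[CA]GT", cluster, background)
print(k, p_value)
```

Counting sequences rather than individual occurrences mirrors the abstract's requirement that a pattern occur in some minimum number of sequences.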
Genomic clustering and co-regulation of transcriptional networks in the pathogenic fungus Fusarium graminearum.
BACKGROUND: Genes for the production of a broad range of fungal secondary metabolites are frequently colinear. The prevalence of such gene clusters was systematically examined across the genome of the cereal pathogen Fusarium graminearum. The topological structure of transcriptional networks was also examined to investigate control mechanisms for mycotoxin biosynthesis and other processes. RESULTS: The genes associated with transcriptional processes were identified, and the genomic location of transcription-associated proteins (TAPs) analyzed in conjunction with the locations of genes exhibiting similar expression patterns. Highly conserved TAPs reside in regions of chromosomes with very low or no recombination, contrasting with putative regulator genes. Co-expression group profiles were used to define positionally clustered genes and a number of members of these clusters encode proteins participating in secondary metabolism. Gene expression profiles suggest there is an abundance of condition-specific transcriptional regulation. Analysis of the promoter regions of co-expressed genes showed enrichment for conserved DNA-sequence motifs. Potential global transcription factors recognising these motifs contain distinct sets of DNA-binding domains (DBDs) from those present in local regulators. CONCLUSIONS: Proteins associated with basal transcriptional functions are encoded by genes enriched in regions of the genome with low recombination. Systematic searches revealed dispersed and compact clusters of co-expressed genes, often containing a transcription factor, and typically containing genes involved in biosynthetic pathways. Transcriptional networks exhibit a layered structure in which the position in the hierarchy of a regulator is closely linked to the DBD structural class.
Transcriptome analysis of human tissues and cell lines reveals one dominant transcript per gene
Subtype-specific micro-RNA expression signatures in breast cancer progression.
Robust markers of invasiveness may help reduce the overtreatment of in situ carcinomas. Breast cancer is a heterogeneous disease and biological mechanisms for carcinogenesis vary between subtypes. Stratification by subtype is therefore necessary to identify relevant and robust signatures of invasive disease. We have identified microRNA (miRNA) alterations during breast cancer progression in two separate datasets and used stratification and external validation to strengthen the findings. We analyzed two separate datasets (METABRIC and AHUS) consisting of a total of 186 normal breast tissue samples, 18 ductal carcinoma in situ (DCIS) and 1,338 invasive breast carcinomas. Validation in a separate dataset and stratification by molecular subtypes based on immunohistochemistry, PAM50 and integrated cluster classifications were performed. We propose subtype-specific miRNA signatures of invasive carcinoma and a validated signature of DCIS. Alterations in the invasive signatures include downregulation of miR-139-5p in aggressive subtypes and upregulation of miR-29c-5p in the luminal subtypes. No miRNAs were differentially expressed in the transition from DCIS to invasive carcinomas on the whole, indicating the need for subtype stratification. A total of 27 miRNAs were included in our proposed DCIS signature. Significant alterations of expression included upregulation of miR-21-5p and the miR-200 family and downregulation of let-7 family members in DCIS samples. The signatures proposed here can form the basis for studies exploring DCIS samples with increased invasive potential and serum biomarkers for in situ and invasive breast cancer. This work was performed as part of the EurocanPlatform which has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement No. 260791. Portions of this research (Venn diagram creator) were supported by the W.R. Wiley Environmental Molecular Science Laboratory, a national scientific user facility sponsored by the U.S. Department of Energy's Office of Biological and Environmental Research and located at PNNL. PNNL is operated by Battelle Memorial Institute for the U.S. Department of Energy under contract DE-AC05-76RL01830. This is the author accepted manuscript. The final version is available from Wiley via http://dx.doi.org/10.1002/ijc.3014
