Clustering-based approaches to SAGE data mining
Serial analysis of gene expression (SAGE) is one of the most powerful tools for global gene expression profiling. It has led to several biological discoveries and biomedical applications, such as the prediction of new gene functions and the identification of biomarkers in human cancer research. Clustering techniques have become fundamental approaches in these applications. This paper reviews relevant clustering techniques specifically designed for this type of data. It emphasises current limitations and opportunities in this area for supporting biologically meaningful data mining and visualisation.
A human glomerular SAGE transcriptome database
Background: To facilitate the identification of gene products important in regulating renal glomerular structure and function, we have produced an annotated transcriptome database for normal human glomeruli using the SAGE approach. Description: The database contains 22,907 unique SAGE tag sequences, with a total tag count of 48,905. For each SAGE tag, the ratio of its frequency in glomeruli relative to that in 115 non-glomerular tissues or cells, a measure of transcript enrichment in glomeruli, was calculated. A total of 133 SAGE tags representing well-characterized transcripts were enriched 10-fold or more in glomeruli compared to other tissues. Comparison of data from this study with a previous human glomerular Sau3A-anchored SAGE library reveals that 47 of the highly enriched transcripts are common to both libraries. Among these are the SAGE tags representing many podocyte-predominant transcripts like WT-1, podocin and synaptopodin. Enrichment of podocyte transcript tags in the SAGE library indicates that other SAGE tags observed at much higher frequencies in this glomerular library than in non-glomerular SAGE libraries are likely to be glomerulus-predominant. A higher level of mRNA expression for 19 transcripts represented by glomerulus-enriched SAGE tags was verified by RT-PCR comparing glomeruli to lung, liver and spleen. Conclusions: The database can be retrieved from, or interrogated online at, http://cgap.nci.nih.gov/SAGE. The annotated database is also provided as an additional file with gene identification for 9,022 tags, and matches to the human genome or transcript homologs in other species for 1,433 tags. It should be a useful tool for in silico mining of glomerular gene expression.
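The enrichment measure described in this abstract (a tag's frequency in glomeruli relative to its frequency in non-glomerular libraries) can be illustrated with a minimal sketch. The abstract does not say how the 115 reference libraries were pooled or how tags absent from the reference were handled, so the pooling by summed counts, the `pseudocount` parameter, and the function names `enrichment_ratio` and `enriched_tags` are all assumptions made for illustration, not the authors' method.

```python
def enrichment_ratio(tag_count, library_total, ref_count, ref_total, pseudocount=0.5):
    """Relative frequency of a tag in the target library divided by its
    relative frequency in a pooled reference.

    The pseudocount (an assumption, not from the abstract) avoids division
    by zero when a tag is absent from the reference pool.
    """
    if library_total <= 0 or ref_total <= 0:
        raise ValueError("library totals must be positive")
    target_freq = tag_count / library_total
    ref_freq = (ref_count + pseudocount) / ref_total
    if ref_freq == 0:
        return float("inf")
    return target_freq / ref_freq


def enriched_tags(target_counts, ref_counts, threshold=10.0, pseudocount=0.5):
    """Return {tag: ratio} for tags whose enrichment ratio meets the threshold.

    target_counts and ref_counts map tag sequences to raw counts; the
    reference is assumed to be already pooled across libraries.
    """
    target_total = sum(target_counts.values())
    ref_total = sum(ref_counts.values())
    result = {}
    for tag, count in target_counts.items():
        ratio = enrichment_ratio(
            count, target_total, ref_counts.get(tag, 0), ref_total, pseudocount
        )
        if ratio >= threshold:
            result[tag] = ratio
    return result


# Toy counts (hypothetical tag names and values, not taken from the database).
glomerular = {"TAG_A": 40, "TAG_B": 60}
reference = {"TAG_A": 10, "TAG_B": 990}
hits = enriched_tags(glomerular, reference)
```

With these toy counts only `TAG_A` passes the 10-fold threshold, because it makes up a far larger share of the target library than of the reference pool. Normalising by library totals matters here because SAGE libraries differ widely in sequencing depth.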
Massive-Scale RNA-Seq Analysis of Non Ribosomal Transcriptome in Human Trisomy 21
Hybridization- and tag-based technologies have been successfully used in Down
syndrome to identify genes involved in various aspects of the pathogenesis.
However, these technologies have several limitations and, to
date, information about rare but relevant RNA species, such as long and
small non-coding RNAs, is completely missing. Indeed, no published work
has yet described the whole transcriptional landscape of Down syndrome.
Although recent advances in high-throughput RNA sequencing have revealed the
complexity of transcriptomes, most such studies rely on polyA enrichment protocols,
which detect only a small fraction of total RNA content. In contrast,
massive-scale RNA sequencing on rRNA-depleted samples allows the survey of the
complete set of coding and non-coding RNA species, now emerging as novel
contributors to pathogenic mechanisms. Hence, in this work we analysed for the
first time the complete transcriptome of human trisomic endothelial progenitor
cells at an unprecedented level of resolution and sensitivity by RNA-sequencing.
Our analysis allowed us to detect differential expression of even lowly expressed
genes crucial for pathogenesis, to reveal novel regions of active
transcription outside already annotated loci, and to investigate a
plethora of non-polyadenylated long and short non-coding RNAs. Novel
splice isoforms for a large subset of crucial genes, and novel extended
untranslated regions for known genes—possibly novel miRNA targets or
regulatory sites for gene transcription—were also identified in this
study. Coupling the rRNA depletion of samples, followed by high-throughput
RNA-sequencing, to the easy availability of these cells renders this approach
very feasible for transcriptome studies, offering the possibility of
investigating in depth the blood-related pathological features of Down syndrome, as
well as other genetic disorders.
Identification of nephropathy candidate genes by comparing sclerosis-prone and sclerosis-resistant mouse strain kidney transcriptomes
Modification of nitrogen remobilisation, grain fill and leaf senescence in maize (Zea mays L.) by transposon insertional mutagenesis in a protease gene
A maize (Zea mays) senescence-associated legumain gene, See2β, was characterized at the physiological and molecular levels to determine its role in senescence and resource allocation. A reverse-genetics screen of a maize Mutator (Mu) population identified a Mu insertion in See2β. Maize plants homozygous for the insertion were produced. These See2 mutant and sibling wild-type plants were grown under high or low quantities of nitrogen (N). The early development of both genotypes was similar; however, tassel tip and collar emergence occurred earlier in the mutant. Senescence of the mutant leaves followed a similar pattern to that of wild-type leaves, but at later sampling points mutant plants contained more chlorophyll than wild-type plants and showed a small extension in photosynthetic activity. Total plant weight was higher in the wild-type than in the mutant, and there was a genotype × N interaction. Mutant plants under low N maintained cob weight, in contrast to wild-type plants under the same treatment. It is concluded, on the basis of transposon mutagenesis, that See2β has an important role in N-use and resource allocation under N-limited conditions, and a minor but significant function in the later stages of senescence.