h-Profile plots for the discovery and exploration of patterns in gene expression data with an application to time course data
Background: An ever increasing number of techniques are being used to find genes with similar profiles from microarray studies. Visualization of gene expression profiles can aid this process, potentially contributing to the identification of co-regulated genes and gene function as well as network development. Results: We introduce the h-Profile plot to display gene expression profiles. Thumbnail versions of plots of gene expression profiles are plotted at coordinates such that profiles of similar shape are located in the same sector, with decreasing variance towards the origin. Negatively correlated profiles can easily be identified. A new method for selecting genes with fixed periodicity, but different phase and amplitude is described and used to demonstrate the use of the plots on cell cycle data. Conclusion: Visualization tools for gene expression data are important and h-Profile plots provide a timely contribution to the field. They allow the simultaneous visualization of many gene expression profiles and can be used for the identification of genes with similar or reversed profiles, the foundation step in many analyses.
Theoretical measures of relative performance of classifiers for high dimensional data with small sample sizes
We suggest a technique, related to the concept of 'detection boundary' that was developed by Ingster and by Donoho and Jin, for comparing the theoretical performance of classifiers constructed from small training samples of very large vectors. The resulting 'classification boundaries' are obtained for a variety of distance-based methods, including the support vector machine, distance-weighted discrimination and kth-nearest-neighbour classifiers, for thresholded forms of those methods, and for techniques based on Donoho and Jin's higher criticism approach to signal detection. Assessed in these terms, standard distance-based methods are shown to be capable only of detecting differences between populations when those differences can be estimated consistently. However, the thresholded forms of distance-based classifiers can do better, and in particular can correctly classify data even when differences between distributions are only detectable, not estimable. Other methods, including higher criticism classifiers, can on occasion perform better still, but they tend to be more limited in scope, requiring substantially more information about the marginal distributions. Moreover, as tail weight becomes heavier the classification boundaries of methods designed for particular distribution types can converge to, and achieve, the boundary for thresholded nearest neighbour approaches. For example, although higher criticism has a lower classification boundary, and in this sense performs better, in the case of normal data, the boundaries are identical for exponentially distributed data when both sample sizes equal 1
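To make the idea of a "thresholded" distance-based classifier concrete, here is a minimal sketch in Python. It is an illustration only, not the construction analysed in the paper: the function name `thresholded_centroid_classifier`, the threshold parameter `t`, and the choice of a nearest-centroid rule are all assumptions made for this example. Components whose class means differ by no more than `t` are discarded before distances are computed, so noise-only coordinates cannot swamp the few informative ones.

```python
import numpy as np

def thresholded_centroid_classifier(X0, X1, t):
    """Build a nearest-centroid rule that ignores weakly separated components.

    X0, X1: training samples (rows) from populations 0 and 1.
    t: hypothetical threshold on the absolute difference of class means.
    """
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    keep = np.abs(m1 - m0) > t  # components retained after thresholding

    def predict(x):
        x = np.atleast_2d(np.asarray(x, dtype=float))
        d0 = ((x[:, keep] - m0[keep]) ** 2).sum(axis=1)
        d1 = ((x[:, keep] - m1[keep]) ** 2).sum(axis=1)
        return (d1 < d0).astype(int)  # ties go to class 0

    return predict, keep

# Usage: only the first component separates the two classes.
X0 = np.zeros((3, 4))
X1 = np.zeros((3, 4))
X1[:, 0] = 5.0
predict, keep = thresholded_centroid_classifier(X0, X1, t=1.0)
label = predict([5.0, 100.0, 100.0, 100.0])  # large values in dropped components are ignored
```

Without the threshold, the three uninformative components would contribute equally to both distances here; with noisy training means they would instead add variance, which is the situation thresholding is meant to address.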
Simpler Evaluation of Predictions and Signature Stability for Gene Expression Data
Scientific advances are raising expectations that patient-tailored treatment will soon be available. The development of the resulting clinical approaches needs to be based on well-designed experimental and observational procedures that provide data to which proper biostatistical analyses are applied. Gene expression microarrays and related technologies are rapidly evolving and are providing extremely large gene expression profiles containing many thousands of measurements. Choosing a subset of these gene expression measurements to include in a gene expression signature is one of the many challenges to be met. The choice of signature depends on many factors, including the selection of patients in the training set, so the reliability and reproducibility of the resultant prognostic gene signature need to be evaluated in a way that is relevant to the clinical setting. A relatively straightforward approach is based on cross validation, with separate selection of genes at each iteration to avoid selection bias. Within this approach we developed two different methods, one based on forward selection, the other on genes that were statistically significant in all training blocks of data. We demonstrate our approach to gene signature evaluation with a well-known breast cancer data set.
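The point about selecting genes separately at each cross-validation iteration can be sketched as follows. This is a hedged illustration, not the authors' procedure: the t-like ranking score, the nearest-centroid classifier, and the names `select_top_genes` and `cv_with_infold_selection` are assumptions chosen to keep the example short. The essential feature is that gene selection sees only the training fold, and the intersection of the per-fold selections gives a rough view of signature stability.

```python
import numpy as np

def select_top_genes(X, y, k):
    """Rank genes by |difference in class means| / pooled within-class std."""
    X0, X1 = X[y == 0], X[y == 1]
    pooled = np.sqrt((X0.var(axis=0) + X1.var(axis=0)) / 2.0) + 1e-12
    score = np.abs(X1.mean(axis=0) - X0.mean(axis=0)) / pooled
    return np.argsort(score)[::-1][:k]

def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test sample to the class with the closer centroid."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_test - c0) ** 2).sum(axis=1)
    d1 = ((X_test - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

def cv_with_infold_selection(X, y, k=10, n_folds=5, seed=0):
    """Cross-validate with gene selection repeated inside every fold.

    Returns the cross-validated accuracy and the set of genes selected
    in all folds (a simple stability summary).
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    correct, selected = 0, []
    for test_idx in np.array_split(idx, n_folds):
        train_idx = np.setdiff1d(idx, test_idx)
        # Selection uses the training fold only; test samples stay unseen.
        genes = select_top_genes(X[train_idx], y[train_idx], k)
        selected.append(set(genes.tolist()))
        pred = nearest_centroid_predict(
            X[train_idx][:, genes], y[train_idx], X[test_idx][:, genes]
        )
        correct += int((pred == y[test_idx]).sum())
    return correct / len(y), set.intersection(*selected)

# Usage on synthetic data: 40 samples, 200 genes, 5 informative genes.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 200))
y = np.array([0, 1] * 20)
X[y == 1, :5] += 6.0
accuracy, stable_genes = cv_with_infold_selection(X, y, k=5)
```

Selecting genes once on the full data set and then cross-validating only the classifier would let the test samples influence the signature, which is the selection bias the abstract refers to.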
Epigenetic regulation of the honey bee transcriptome: unravelling the nature of methylated genes
Background: Epigenetic modification of DNA via methylation is one of the key inventions in eukaryotic evolution. It provides a source for the switching of gene activities, the maintenance of stable phenotypes and the integration of environmental and genomic signals. Although this process is widespread among eukaryotes, both the patterns of methylation and their relevant biological roles not only vary noticeably in different lineages, but often are poorly understood. In addition, the evolutionary origins of DNA methylation in multicellular organisms remain enigmatic. Here we used a new 'epigenetic' model, the social honey bee Apis mellifera, to gain insights into the significance of methylated genes.
Results: We combined microarray profiling of several tissues with genome-scale bioinformatics and bisulfite sequencing of selected genes to study the honey bee methylome. We find that around 35% of the annotated honey bee genes are expected to be methylated at the CpG dinucleotides by a highly conserved DNA methylation system. We show that one unifying feature of the methylated genes in this species is their broad pattern of expression and the associated 'housekeeping' roles. In contrast, genes involved in more stringently regulated spatial or temporal functions are predicted to be un-methylated.
Conclusion: Our data suggest that honey bees use CpG methylation of intragenic regions as an epigenetic mechanism to control the levels of activity of the genes that are broadly expressed and might be needed for conserved core biological processes in virtually every type of cell. We discuss the implications of our findings for genome-scale regulatory network structures and the evolution of the role(s) of DNA methylation in eukaryotes. Our findings are particularly important in the context of the emerging evidence that environmental factors can influence the epigenetic settings of some genes and lead to serious metabolic and behavioural disorders.
Tax compliance by the very wealthy: Red flags of risk
Executive summary: A study of 235 High Wealth Individuals (HWIs) and the entities they control was undertaken using 1997 and 1998 tax returns. From these data, and using a list of 207 candidate issues, five red flags for overall risk of aggressive tax planning by HWIs were identified. These red flags indicated recurrent risks that can be predicted using different kinds of analyses of overall high risk. ...
Impairment of organ-specific T cell negative selection by diabetes susceptibility genes: genomic analysis by mRNA profiling
BACKGROUND: T cells in the thymus undergo opposing positive and negative selection processes so that the only T cells entering circulation are those bearing a T cell receptor (TCR) with a low affinity for self. The mechanism differentiating negative from positive selection is poorly understood, despite the fact that inherited defects in negative selection underlie organ-specific autoimmune disease in AIRE-deficient people and the non-obese diabetic (NOD) mouse strain. RESULTS: Here we use homogeneous populations of T cells undergoing either positive or negative selection in vivo together with genome-wide transcription profiling on microarrays to identify the gene expression differences underlying negative selection to an Aire-dependent organ-specific antigen, including the upregulation of a genomic cluster in the cytogenetic band 2F. Analysis of defective negative selection in the autoimmune-prone NOD strain demonstrates a global impairment in the induction of the negative selection response gene set, but little difference in positive selection response genes. Combining expression differences with genetic linkage data, we identify differentially expressed candidate genes, including Bim, Bnip3, Smox, Pdrg1, Id1, Pdcd1, Ly6c, Pdia3, Trim30 and Trim12. CONCLUSION: The data provide a molecular map of the negative selection response in vivo and, by analysis of deviations from this pathway in the autoimmune susceptible NOD strain, suggest that susceptibility arises from small expression differences in genes acting at multiple points in the pathway between the TCR and cell death. This work was supported by grants from the NHMRC and the Juvenile Diabetes Research Foundation.
Vascular microarray profiling in two models of hypertension identifies caveolin-1, Rgs2 and Rgs5 as antihypertensive targets
BACKGROUND: Hypertension is a complex disease with many contributory genetic and environmental factors. We aimed to identify common targets for therapy by gene expression profiling of a resistance artery taken from animals representing two different models of hypertension. We studied gene expression and morphology of a saphenous artery branch in normotensive WKY rats, spontaneously hypertensive rats (SHR) and adrenocorticotropic hormone (ACTH)-induced hypertensive rats.
RESULTS: Differential remodeling of arteries occurred in SHR and ACTH-treated rats, involving changes in both smooth muscle and endothelium. Increased expression of smooth muscle cell growth promoters and decreased expression of growth suppressors confirmed smooth muscle cell proliferation in SHR but not in ACTH-treated rats. Differential gene expression between arteries from the two hypertensive models extended to the renin-angiotensin system, MAP kinase pathways, mitochondrial activity, lipid metabolism, extracellular matrix and calcium handling. In contrast, arteries from both hypertensive models exhibited significant increases in caveolin-1 expression and decreases in the regulators of G-protein signalling, Rgs2 and Rgs5. Increased protein expression of caveolin-1 and increased incidence of caveolae were found in both smooth muscle and endothelial cells of arteries from both hypertensive models.
CONCLUSION: We conclude that the majority of differences in gene expression found in the saphenous artery taken from rats with two different forms of hypertension reflect distinctive morphological and physiological alterations. However, the changes common to both models, in caveolin-1 expression and in G protein signalling through attenuation of Rgs2 and Rgs5, may contribute to hypertension through augmentation of vasoconstrictor pathways and provide potential targets for common drug development.
Adsorption models of hybridization and post-hybridisation behaviour on oligonucleotide microarrays
Analysis of data from an Affymetrix Latin Square spike-in experiment indicates that measured fluorescence intensities of features on an oligonucleotide microarray are related to spike-in RNA target concentrations via a hyperbolic response function, generally identified as a Langmuir adsorption isotherm. Furthermore, the asymptotic signal at high spike-in concentrations is almost invariably lower for a mismatch feature than for its partner perfect match feature. We survey a number of theoretical adsorption models of hybridization at the microarray surface and find that in general they are unable to explain the differing saturation responses of perfect and mismatch features. On the other hand, we find that a simple and consistent explanation can be found in a model in which equilibrium hybridization is followed by partial dissociation of duplexes during the post-hybridization washing phase. Comment: 26 pages, 6 figures, some rearrangement of sections and some additions. To appear in J.Phys.(condensed matter
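The hyperbolic (Langmuir) response described in this abstract can be written down in a few lines. The sketch below is an illustration under stated assumptions, not the paper's fitted model: the parameter names `a`, `K` and `background`, and in particular the multiplicative `wash_survival` factor standing in for partial duplex dissociation during washing, are hypothetical choices made for this example.

```python
def langmuir_intensity(conc, a, K, background=0.0, wash_survival=1.0):
    """Hyperbolic (Langmuir isotherm) response with an assumed washing factor.

    conc: spike-in target concentration (arbitrary units)
    a: signal amplitude at full equilibrium coverage
    K: concentration giving half of the saturation signal
    background: additive offset (assumption)
    wash_survival: assumed fraction of duplexes that survive washing, in (0, 1]
    """
    coverage = conc / (conc + K)  # equilibrium fraction of probes hybridized
    return background + a * wash_survival * coverage

# Usage: at conc == K the equilibrium coverage is one half.
half_signal = langmuir_intensity(2.0, a=100.0, K=2.0)  # 50.0

# A perfect match and a mismatch feature with the same amplitude but
# different assumed wash survival saturate at different levels.
pm_saturation = langmuir_intensity(1e9, a=100.0, K=2.0, wash_survival=0.9)
mm_saturation = langmuir_intensity(1e9, a=100.0, K=50.0, wash_survival=0.6)
```

In a pure equilibrium isotherm, changing `K` shifts the curve along the concentration axis but leaves the asymptote at `a`. A factor applied after equilibrium, such as the assumed `wash_survival`, is one simple way to obtain a lower saturation level for the mismatch feature.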