GenomeGraphs: integrated genomic data visualization with R.
Background: Biological studies involve a growing number of distinct high-throughput experiments to characterize samples of interest. There is a lack of methods to visualize these different genomic datasets in a versatile manner. In addition, genomic data analysis requires integrated visualization of experimental data along with constantly changing genomic annotation and statistical analyses. Results: We developed GenomeGraphs, as an add-on software package for the statistical programming environment R, to facilitate integrated visualization of genomic datasets. GenomeGraphs uses the biomaRt package to perform on-line annotation queries to Ensembl and translates these to gene/transcript structures in viewports of the grid graphics package. This allows genomic annotation to be plotted together with experimental data. GenomeGraphs can also be used to plot custom annotation tracks in combination with different experimental data types together in one plot using the same genomic coordinate system. Conclusion: GenomeGraphs is a flexible and extensible software package which can be used to visualize a multitude of genomic datasets within the statistical programming environment R.
Temporal patterns of gene expression via nonmetric multidimensional scaling analysis
Motivation: Microarray experiments result in large-scale data sets that
require extensive mining and refining to extract useful information. We have
been developing an efficient novel algorithm for nonmetric multidimensional
scaling (nMDS) analysis for very large data sets as a maximally unsupervised
data mining device. We wish to demonstrate its usefulness in the context of
bioinformatics. A further aim is to demonstrate that
intrinsically nonlinear methods are generally advantageous in data mining.
Results: The Pearson correlation distance measure is used to indicate the
dissimilarity of the gene activities in transcriptional response of cell
cycle-synchronized human fibroblasts to serum [Iyer et al., Science vol. 283,
p83 (1999)]. These dissimilarity data have been analyzed with our nMDS
algorithm to produce an almost circular arrangement of the genes. The temporal
expression patterns of the genes rotate along this circular arrangement. If an
appropriate preprocessing procedure is applied to the original data set,
linear methods such as principal component analysis (PCA) can achieve
reasonable results, but without data preprocessing linear methods such as PCA
cannot produce a useful picture. Furthermore, even with appropriate data
preprocessing, the outcomes of linear procedures are not as clear-cut as those
of nMDS without preprocessing.
Comment: 11 pages, 6 figures + online only 2 color figures, submitted to
Bioinformatics
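The dissimilarity measure named in the Results above can be illustrated with a minimal sketch. It assumes the Pearson correlation distance is defined as 1 - r (the abstract does not state the exact form), and it uses hypothetical sinusoidal profiles rather than the Iyer et al. data. The function names and the 12-point sampling are illustrative choices, not taken from the paper, and the nMDS step itself is not implemented here.

```python
import math


def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of x and y.

    Assumption: the abstract only names a "Pearson correlation distance";
    1 - r is one common definition, chosen here for illustration.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0.0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / norm


def synthetic_profile(phase, n_timepoints=12):
    """Hypothetical periodic expression profile sampled over one full cycle."""
    return [
        math.cos(2.0 * math.pi * t / n_timepoints - phase)
        for t in range(n_timepoints)
    ]


# Three synthetic "genes": in phase, a quarter cycle late, and half a cycle late.
phases = [0.0, math.pi / 2.0, math.pi]
profiles = [synthetic_profile(p) for p in phases]

# Pairwise dissimilarity matrix, the kind of input an nMDS routine would take.
distances = [[pearson_distance(a, b) for b in profiles] for a in profiles]

for row in distances:
    print(["%.3f" % d for d in row])
```

For these synthetic profiles the distance depends only on the phase difference: it is about 0 for identical phases, 1 for a quarter-cycle shift, and 2 for opposite phases. That is consistent with genes ordered by the timing of their expression falling along a closed loop, although the arrangement reported in the abstract comes from the authors' own nMDS algorithm applied to real data.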
A Vacation
We were going to Hawaii for a rest. The doctor had said we needed a short vacation, but that was his idea, not ours
