2,584 research outputs found
Physico-chemical foundations underpinning microarray and next-generation sequencing experiments
Hybridization of nucleic acids on solid surfaces is a key process involved in high-throughput technologies such as microarrays and, in some cases, next-generation sequencing (NGS). A physical understanding of the hybridization process helps to determine the accuracy of these technologies. The goal of a widespread research program is to develop reliable transformations between the raw signals reported by the technologies and individual molecular concentrations from an ensemble of nucleic acids. This research has inputs from many areas, from bioinformatics and biostatistics, to theoretical and experimental biochemistry and biophysics, to computer simulations. A group of leading researchers met in Ploen, Germany, in 2011 to discuss present knowledge and limitations of our physico-chemical understanding of high-throughput nucleic acid technologies. This meeting inspired us to write this summary, which provides an overview of state-of-the-art approaches, based on physico-chemical foundations, to modeling the nucleic acid hybridization process on solid surfaces. In addition, practical application of current knowledge is emphasized.
Probe set algorithms: is there a rational best bet?
Affymetrix microarrays have become a standard experimental platform for studies of mRNA expression profiling. Their success is due, in part, to the multiple oligonucleotide features (probes) against each transcript (probe set). This multiple testing allows for more robust background assessments and gene expression measures, and has permitted the development of many computational methods to translate image data into a single normalized "signal" for mRNA transcript abundance. Many probe set algorithms have now been developed, with a gradual movement away from chip-by-chip methods (MAS5) to project-based model-fitting methods (dCHIP, RMA, others). Data interpretation is often profoundly changed by choice of algorithm, with disoriented biologists questioning what the "accurate" interpretation of their experiment is. Here, we summarize the debate concerning probe set algorithms. We provide examples of how changes in mismatch weight, normalizations, and construction of expression ratios each dramatically change data interpretation. All interpretations can be considered computationally appropriate, but with varying biological credibility. We also illustrate the performance of two new hybrid algorithms (PLIER, GC-RMA) relative to more traditional algorithms (dCHIP, MAS5, Probe Profiler PCA, RMA) using an interactive power analysis tool. PLIER appears superior to other algorithms in avoiding false positives with poorly performing probe sets. Based on our interpretation of the literature, and examples presented here, we suggest that the variability in performance of probe set algorithms is more dependent upon assumptions regarding "background" than on calculations of "signal". We argue that "background" is an enormously complex variable that can only be vaguely quantified, and thus the "best" probe set algorithm will vary from project to project.
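The abstract's point that mismatch weight alone can change an expression ratio can be illustrated with a minimal sketch. It is not any of the named algorithms (MAS5, RMA, PLIER and the others all differ from it). The summary rule (mean of PM minus a weighted MM), the function names, and all intensity values below are hypothetical and chosen only for illustration.

```python
import numpy as np

def probe_set_signal(pm, mm, mm_weight):
    """Toy probe set summary: mean of (PM - mm_weight * MM), floored at 1.0."""
    adjusted = np.asarray(pm, dtype=float) - mm_weight * np.asarray(mm, dtype=float)
    return max(float(adjusted.mean()), 1.0)

# Hypothetical perfect-match (PM) and mismatch (MM) intensities for one probe set.
PM_CONTROL = [220, 260, 240, 280]
MM_CONTROL = [180, 200, 190, 210]
PM_TREATED = [400, 460, 420, 480]
MM_TREATED = [200, 220, 210, 230]

def expression_ratio(mm_weight):
    """Treated / control ratio for the toy probe set at a given mismatch weight."""
    treated = probe_set_signal(PM_TREATED, MM_TREATED, mm_weight)
    control = probe_set_signal(PM_CONTROL, MM_CONTROL, mm_weight)
    return treated / control

for weight in (0.0, 0.5, 1.0):
    print(f"mismatch weight {weight:.1f}: ratio {expression_ratio(weight):.2f}")
```

With these made-up numbers the same raw data yield a larger apparent fold change as more of the mismatch signal is subtracted, which is the kind of background-driven divergence the abstract describes.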
The influence of feature selection methods on accuracy, stability and interpretability of molecular signatures
Motivation: Biomarker discovery from high-dimensional data is a crucial
problem with enormous applications in biology and medicine. It is also
extremely challenging from a statistical viewpoint, but surprisingly few
studies have investigated the relative strengths and weaknesses of the plethora
of existing feature selection methods. Methods: We compare 32 feature selection
methods on 4 public gene expression datasets for breast cancer prognosis, in
terms of predictive performance, stability and functional interpretability of
the signatures they produce. Results: We observe that the feature selection
method has a significant influence on the accuracy, stability and
interpretability of signatures. Simple filter methods generally outperform more
complex embedded or wrapper methods, and ensemble feature selection has
generally no positive effect. Overall a simple Student's t-test seems to
provide the best results. Availability: Code and data are publicly available at
http://cbio.ensmp.fr/~ahaury/
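As a rough illustration of the kind of simple filter the abstract refers to, the sketch below ranks features by the absolute value of a pooled-variance (Student's) two-sample t statistic and keeps the top k. It is not the authors' code. The function name `t_test_filter`, the synthetic data, and the choice of k are assumptions made for this example.

```python
import numpy as np

def t_test_filter(X, y, k):
    """Return indices of the k features with the largest |t| and all t statistics.

    X: (samples, features) array; y: binary labels (0 or 1) per sample.
    Uses the pooled-variance (equal-variance) Student's t statistic.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    group_a, group_b = X[y == 0], X[y == 1]
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * group_a.var(axis=0, ddof=1)
        + (n_b - 1) * group_b.var(axis=0, ddof=1)
    ) / (n_a + n_b - 2)
    standard_error = np.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t_stats = (group_a.mean(axis=0) - group_b.mean(axis=0)) / standard_error
    top_k = np.argsort(-np.abs(t_stats))[:k]
    return top_k, t_stats

# Synthetic data: 40 samples, 100 features, signal planted in features 0, 1 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 100))
y = np.repeat([0, 1], 20)
X[y == 1, :3] += 3.0

selected, t_stats = t_test_filter(X, y, k=3)
print(sorted(selected.tolist()))
```

In a real study the filter would be fitted inside each cross-validation fold, so that feature selection never sees the test samples.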
Deferoxamine Preconditioning of Irradiated Tissue Improves Perfusion and Fat Graft Retention
Background: Radiation therapy is a mainstay in the treatment of many malignancies, but collateral damage to surrounding tissue, with resultant hypovascularity, fibrosis, and atrophy, can be difficult to reconstruct. Fat grafting has been shown to improve the quality of irradiated skin, but volume retention of the graft is significantly decreased. Deferoxamine is a U.S. Food and Drug Administration-approved iron-chelating medication for acute iron intoxication and chronic iron overload that has also been shown to increase angiogenesis. The present study evaluates the effects of deferoxamine treatment on irradiated skin and subsequent fat graft volume retention. Methods: Mice underwent irradiation to the scalp followed by treatment with deferoxamine or saline, and perfusion was analyzed using laser Doppler analysis. Human fat grafts were then placed beneath the scalp and retention was followed radiographically for up to 8 weeks. Finally, histologic evaluation of overlying skin was performed to evaluate the effects of deferoxamine preconditioning. Results: Treatment with deferoxamine resulted in significantly increased perfusion, as demonstrated by laser Doppler analysis and CD31 immunofluorescent staining (p < 0.05). Increased dermal thickness and collagen content secondary to irradiation, however, were not affected by deferoxamine (p > 0.05). Importantly, fat graft volume retention was significantly increased when the irradiated recipient site was preconditioned with deferoxamine (p < 0.05). Conclusions: The authors' results demonstrated increased perfusion with deferoxamine treatment, which was also associated with improved fat graft volume retention. Preconditioning with deferoxamine may thus enhance fat graft outcomes for soft-tissue reconstruction following radiation therapy.
PINTA: a web server for network-based gene prioritization from expression data
PINTA (available at http://www.esat.kuleuven.be/pinta/; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes based on the differential expression of their neighborhood in a genome-wide protein–protein interaction network. Our strategy is meant for biological and medical researchers aiming to identify novel disease genes using disease-specific expression data. PINTA supports both candidate gene prioritization (starting from a user-defined set of candidate genes) and genome-wide gene prioritization, and is available for five species (human, mouse, rat, worm and yeast). As input, PINTA requires only disease-specific expression data, and various platforms (e.g. Affymetrix) are supported. As output, PINTA computes a gene ranking and presents the results as a table that can easily be browsed and downloaded by the user.
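The idea of ranking candidates by the differential expression of their network neighborhood can be sketched in a few lines. This is a deliberately simplified illustration and not PINTA's actual scoring scheme, which the abstract does not specify. The scoring rule (mean absolute differential expression of direct neighbors), the function name `prioritize`, and the toy network and values are all assumptions.

```python
def prioritize(candidates, network, diff_expr):
    """Rank candidate genes by the mean |differential expression| of their direct neighbors.

    network: dict mapping gene -> iterable of interacting genes.
    diff_expr: dict mapping gene -> differential expression value (e.g. a log fold change).
    Candidates with no measured neighbors receive a score of 0.0.
    """
    scores = {}
    for gene in candidates:
        values = [abs(diff_expr[n]) for n in network.get(gene, ()) if n in diff_expr]
        scores[gene] = sum(values) / len(values) if values else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy interaction network and differential expression values.
network = {"G1": ["A", "B"], "G2": ["C"], "G3": []}
diff_expr = {"A": 2.0, "B": -1.0, "C": 0.2}

ranking = prioritize(["G1", "G2", "G3"], network, diff_expr)
for gene, score in ranking:
    print(gene, score)
```

Note that the candidate's own expression is never used here: a gene can rank highly purely because its interaction partners are perturbed.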
PILGRM: an interactive data-driven discovery platform for expert biologists
PILGRM (the platform for interactive learning by genomics results mining) puts advanced supervised analysis techniques applied to enormous gene expression compendia into the hands of bench biologists. This flexible system empowers its users to answer diverse biological questions that are often outside of the scope of common databases in a data-driven manner. This capability allows domain experts to quickly and easily generate hypotheses about biological processes, tissues or diseases of interest. Specifically PILGRM helps biologists generate these hypotheses by analyzing the expression levels of known relevant genes in large compendia of microarray data. Because PILGRM is data-driven, it complements a user’s knowledge and literature analysis with mining of diverse functional genomic data, thereby generating novel predictions that can drive experimental follow-up. This server is free, does not require registration and is available for use at http://pilgrm.princeton.edu
Genome-scale DNA methylation mapping of clinical samples at single-nucleotide resolution
August 1, 2010
Bisulfite sequencing measures absolute levels of DNA methylation at single-nucleotide resolution,
providing a robust platform for molecular diagnostics. Here, we optimize bisulfite sequencing for
genome-scale analysis of clinical samples. Specifically, we outline how restriction digestion
targets bisulfite sequencing to hotspots of epigenetic regulation; we show that 30 ng of DNA are
sufficient for genome-scale analysis; we demonstrate that our protocol works well on formalin-fixed,
paraffin-embedded (FFPE) samples; and we describe a statistical method for assessing
significance of altered DNA methylation patterns.
National Institutes of Health (U.S.) (Grant R01HG004401); National Institutes of Health (U.S.) (Grant U54HG03067); National Institutes of Health (U.S.) (Grant U01ES017155)
Proteome-based plasma biomarkers for Alzheimer's disease
Alzheimer's disease is a common and devastating disease for which there is no readily available biomarker to aid diagnosis or to monitor disease progression. Biomarkers have been sought in CSF but no previous study has used two-dimensional gel electrophoresis coupled with mass spectrometry to seek biomarkers in peripheral tissue. We performed a case-control study of plasma using this proteomics approach to identify proteins that differ in the disease state relative to aged controls. For discovery-phase proteomics analysis, 50 people with Alzheimer's dementia were recruited through secondary services and 50 normal elderly controls through primary care. For validation purposes a total of 511 subjects with Alzheimer's disease and other neurodegenerative diseases and normal elderly controls were examined. Image analysis of the protein distribution of the gels alone identified disease cases with 56% sensitivity and 80% specificity. Mass spectrometric analysis of the changes observed in two-dimensional electrophoresis identified a number of proteins previously implicated in the disease pathology, including complement factor H (CFH) precursor and α-2-macroglobulin (α-2M). Using semi-quantitative immunoblotting, the elevation of CFH and α-2M was shown to be specific for Alzheimer's disease and to correlate with disease severity, although alternative assays would be necessary to improve sensitivity and specificity. These findings suggest that blood may be a rich source for biomarkers of Alzheimer's disease and that CFH, together with other proteins such as α-2M, may be specific markers of this illness. © 2006 The Author(s).
A Web-based and Grid-enabled dChip version for the analysis of large sets of gene expression data
Background: Microarray techniques are one of the main methods used to investigate thousands of gene expression profiles in order to elucidate complex biological processes responsible for serious diseases, with great scientific impact and a wide application area. Several standalone applications have been developed to analyze microarray data. Two of the best-known free analysis software packages are the R-based Bioconductor and dChip. The part of the dChip software concerning the calculation and analysis of gene expression has been modified to permit its execution on both cluster environments (supercomputers) and Grid infrastructures (distributed computing). This work is not aimed at replacing existing tools, but it provides researchers with a method to analyze large datasets without any hardware or software constraints. Results: An application able to perform the computation and analysis of gene expression on large datasets has been developed using algorithms provided by dChip. Different tests have been carried out to validate the results and to compare the performance obtained on different infrastructures. Validation tests have been performed using a small dataset comparing HUVEC (Human Umbilical Vein Endothelial Cells) and fibroblasts, derived from the same donors, treated with IFN-α. In addition, performance tests have been run to compare performance across environments using a large dataset of about 1000 samples from breast cancer patients. Conclusion: A Grid-enabled software application for the analysis of large microarray datasets has been proposed. The dChip software has been ported to the Linux platform and modified, using appropriate parallelization strategies, to permit its execution on both cluster environments and Grid infrastructures. The added value provided by the use of Grid technologies is the possibility of exploiting both computational and data Grid infrastructures to analyze large sets of distributed data. The software has been validated, and performance on cluster and Grid environments has been compared, with good scalability results.