Differential expression analysis for sequence count data
Motivation: High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power.

Results: We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power.

Availability: A free open-source R software package, DESeq, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq
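The negative binomial error model named in the Results can be illustrated with a minimal sketch. It is not the DESeq implementation: it only shows how a (mean, variance) pair maps onto negative binomial parameters. In the method described above the variance would come from the fitted mean-variance curve; here it is supplied by hand, and the function names `nb_params`, `nb_logpmf` and `nb_pmf` are hypothetical.

```python
import math


def nb_params(mean, variance):
    """Convert (mean, variance) to negative binomial (size r, success prob p).

    Uses the 'failures before r successes' parameterisation, where
    mean = r(1-p)/p and variance = r(1-p)/p^2.
    """
    if mean <= 0 or variance <= mean:
        raise ValueError("negative binomial requires variance > mean > 0")
    p = mean / variance
    r = mean * mean / (variance - mean)
    return r, p


def nb_logpmf(k, mean, variance):
    """Log-probability of observing count k under the given mean and variance."""
    r, p = nb_params(mean, variance)
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log1p(-p))


def nb_pmf(k, mean, variance):
    """Probability of observing count k (exponentiated log-pmf)."""
    return math.exp(nb_logpmf(k, mean, variance))
```

Working on the log scale with `math.lgamma` keeps the computation stable for large counts, which matters across the wide dynamic range the abstract mentions.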
Investigating the utility of combining Phi 29 whole genome amplification and highly multiplexed single nucleotide polymorphism BeadArray (TM) genotyping
Background: Sustainable DNA resources and reliable high-throughput genotyping methods are required for large-scale, long-term genetic association studies. In the genetic dissection of common disease it is now recognised that thousands of samples and hundreds of thousands of markers, mostly single nucleotide polymorphisms (SNPs), will have to be analysed. To achieve these aims, the ability both to boost quantities of archived DNA and to genotype at low cost is highly desirable. We have investigated Phi29 polymerase Multiple Displacement Amplification (MDA)-generated DNA product (MDA product), in combination with highly multiplexed BeadArray(TM) genotyping technology. As part of a large-scale BeadArray genotyping experiment we made a direct comparison of genotyping data generated from MDA product with that from genomic DNA (gDNA) templates. Results: Eighty-six MDA product samples and the corresponding 86 gDNA samples were genotyped at 345 SNPs and a concordance rate of 98.8% was achieved. The BeadArray sample exclusion rate, blind to sample type, was 10.5% for MDA product compared to 5.8% for gDNA. Conclusions: We conclude that the BeadArray technology successfully produces high quality genotyping data from MDA product. The combination of these technologies improves the feasibility and efficiency of mapping common disease susceptibility genes despite limited stocks of gDNA samples.
Quality of treatment plans and accuracy of in vivo portal dosimetry in hybrid intensity-modulated radiation therapy and volumetric modulated arc therapy for prostate cancer.
Background and purpose: Delivering selected parts of volumetric modulated arc therapy (VMAT) plans using step-and-shoot intensity modulated radiotherapy (IMRT) beams has the potential to increase plan quality by allowing specific aperture positioning. This study investigates the quality of treatment plans and the accuracy of in vivo portal dosimetry in such a hybrid approach for the case of prostate radiotherapy. Material and methods: Conformal and limited-modulation VMAT plans were produced, together with five hybrid IMRT/VMAT plans, in which 0%, 25%, 50%, 75% or 100% of the segments were sequenced for IMRT, while the remainder were sequenced for VMAT. Integrated portal images were predicted for the plans. The plans were then delivered as a single hybrid beam using an Elekta Synergy accelerator with Agility head to a water-equivalent phantom and treatment time, isocentric dose and portal images were measured. Results: Increasing the IMRT percentage improves dose uniformity to the planning target volume (p<0.01 for 50% IMRT or more), substantially reduces the volume of rectum irradiated to 65Gy (p=0.02 for 25% IMRT) and increases the monitor units (p<0.001). Delivery time also increases substantially. All plans show accurate delivery of dose and reliable prediction of portal images. Conclusions: Hybrid IMRT/VMAT can be efficiently planned and delivered as a single beam sequence. Beyond 25% IMRT, the delivery time becomes unacceptably long, with increased risk of intrafraction motion, but 25% IMRT is an attractive compromise. Integrated portal images can be used to perform in vivo dosimetry for this technique.
SST dynamics at different scales: evaluating the oceanographic model resolution skill to represent SST processes in the Southern Ocean
In this study we demonstrate the many strengths of scale analysis: we use it to evaluate the Nucleus for European Modelling of the Ocean (NEMO) model skill in representing sea surface temperature (SST) in the Southern Ocean (SO) by comparing three model resolutions: 1/12°, 1/4° and 1°. We show that whilst 4‐5 times resolution scale is sufficient for each model resolution to reproduce the magnitude of satellite Earth Observation (EO) SST spatial variability to within ±10%, the representation of ∼ 100 km SST variability patterns is substantially (e.g. ∼50% at 750 km) improved by increasing model resolution from 1° to 1/12°. We also analysed the dominant scales of variability of the SST model input drivers (short‐wave radiation, air‐sea heat fluxes, wind stress components, wind stress curl, bathymetry) with the purpose of determining the optimal SST model input driver resolution. The SST magnitude of variability is shown to scale with two power law regimes separated by a scaling break at ∼200 km scale. The analysis of the spatial and temporal scales of dominant SST driver impact helps to interpret this scaling break as a separation between two different dynamical regimes: below ∼200 km, the (relatively) fast SST dynamics are governed by eddies, fronts, Ekman upwelling and air‐sea heat exchange, whilst above ∼200 km the SST variability is dominated by long‐term (seasonal and supra‐seasonal) modes and the SST geography.
Data analysis issues for allele-specific expression using Illumina's GoldenGate assay.
BACKGROUND: High-throughput measurement of allele-specific expression (ASE) is a relatively new and exciting application area for array-based technologies. In this paper, we explore several data sets which make use of Illumina's GoldenGate BeadArray technology to measure ASE. This platform exploits coding SNPs to obtain relative expression measurements for alleles at approximately 1500 positions in the genome. RESULTS: We analyze data from a mixture experiment where genomic DNA samples from pairs of individuals of known genotypes are pooled to create allelic imbalances at varying levels for the majority of SNPs on the array. We observe that GoldenGate has less sensitivity at detecting subtle allelic imbalances (around 1.3 fold) compared to extreme imbalances, and note the benefit of applying local background correction to the data. Analysis of data from a dye-swap control experiment allowed us to quantify dye-bias, which can be reduced considerably by careful normalization. The need to filter the data before carrying out further downstream analysis to remove non-responding probes, which show either weak, or non-specific signal for each allele, was also demonstrated. Throughout this paper, we find that a linear model analysis of the data from each SNP is a flexible modelling strategy that allows for testing of allelic imbalances in each sample when replicate hybridizations are available. CONCLUSIONS: Our analysis shows that local background correction carried out by Illumina's software, together with quantile normalization of the red and green channels within each array, provides optimal performance in terms of false positive rates. In addition, we strongly encourage intensity-based filtering to remove SNPs which only measure non-specific signal. 
We anticipate that a similar analysis strategy will prove useful when quantifying ASE on Illumina's higher density Infinium BeadChips.
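The quantile normalization of the red and green channels mentioned in the Conclusions can be sketched as follows. This is an illustrative version under stated assumptions, not the code used in the paper or in Illumina's software: both channels are assumed to hold one intensity per probe on a comparable scale, ties are broken by position rather than averaged, and the function name `quantile_normalize` is hypothetical.

```python
def quantile_normalize(red, green):
    """Force two equal-length channels to share one empirical distribution.

    The reference distribution is the mean of the two channels at each rank;
    every value is then replaced by the reference value at its own rank.
    """
    if len(red) != len(green):
        raise ValueError("channels must have the same number of probes")

    # Rank-wise average of the sorted channels.
    reference = [(r + g) / 2.0 for r, g in zip(sorted(red), sorted(green))]

    def apply(channel):
        # Indices of the channel ordered from lowest to highest intensity.
        order = sorted(range(len(channel)), key=channel.__getitem__)
        out = [0.0] * len(channel)
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        return out

    return apply(red), apply(green)


red_norm, green_norm = quantile_normalize([5.0, 2.0, 3.0], [4.0, 1.0, 9.0])
```

After normalization both channels contain the same set of values, so any remaining difference at a given SNP reflects rank within the channel rather than an overall dye effect.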
Lymphocyte subsets and the role of Th1/Th2 balance in stressed chronic pain patients
Background: The complex regional pain syndrome (CRPS) and fibromyalgia (FM) are chronic pain syndromes occurring in highly stressed individuals. Despite the known connection between the nervous system and immune cells, information on distribution of lymphocyte subsets under stress and pain conditions is limited. Methods: We performed a comparative study in 15 patients with CRPS type I, 22 patients with FM and 37 age- and sex-matched healthy controls and investigated the influence of pain and stress on lymphocyte number, subpopulations and the Th1/Th2 cytokine ratio in T lymphocytes. Results: Lymphocyte numbers did not differ between groups. Quantitative analyses of lymphocyte subpopulations showed a significant reduction of cytotoxic CD8+ lymphocytes in both CRPS (p < 0.01) and FM (p < 0.05) patients as compared with healthy controls. Additionally, CRPS patients were characterized by a lower percentage of IL-2-producing T cell subpopulations reflecting a diminished Th1 response in contrast to no changes in the Th2 cytokine profile. Conclusions: Future studies are warranted to answer whether such immunological changes play a pathogenetic role in CRPS and FM or merely reflect the consequences of a pain-induced neurohumoral stress response, and whether they contribute to immunosuppression in stressed chronic pain patients. Copyright (c) 2008 S. Karger AG, Basel
Epidemiology of methicillin-resistant Staphylococcus aureus (MRSA) in Sweden 2000–2003, increasing incidence and regional differences
BACKGROUND: The occurrence of methicillin-resistant Staphylococcus aureus (MRSA) has gradually become more frequent in most countries of the world. Sweden has remained one of few exceptions to the high occurrence of MRSA in many other countries. During the late 1990s, Sweden experienced a large health-care associated outbreak that was overcome through resolute efforts. Subsequently, MRSA was made a notifiable diagnosis in Sweden in 2000. METHODS: When MRSA became a notifiable disease in January 2000, the Swedish Institute for Infectious Disease Control (SMI) initiated active surveillance of MRSA. RESULTS: The number of reported MRSA cases in Sweden increased from 325 cases in 2000 to 544 in 2003, corresponding to an overall increase in incidence from 3.7 to 6.1 per 100,000 inhabitants. Twenty-five per cent of the cases were infected abroad. The domestic cases were predominantly found through cultures taken on clinical indication and the cases infected abroad through screening. There were considerable regional differences in MRSA incidence and age distribution of cases. CONCLUSION: The MRSA incidence in Sweden increased over the years 2000–2003. Sweden is now poised on the brink of the same development that was seen in the United Kingdom some ten years ago. A quarter of the cases were infected abroad, reflecting that international transmission is now increasingly important in a low-endemic setting. To remain in this favourable situation, stepped-up measures will be needed to identify imported cases, to control domestic outbreaks and to prevent transmission within the health-care sector.
Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset
Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multifactorial studies with complex experimental designs are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy, such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for bioinformatics.
Algebraic Comparison of Partial Lists in Bioinformatics
The outcome of a functional genomics pipeline is usually a partial list of
genomic features, ranked by their relevance in modelling biological phenotype
in terms of a classification or regression model. Due to resampling protocols
or just within a meta-analysis comparison, instead of one list it is often the
case that sets of alternative feature lists (possibly of different lengths) are
obtained. Here we introduce a method, based on the algebraic theory of
symmetric groups, for studying the variability between lists ("list stability")
in the case of lists of unequal length. We provide algorithms evaluating
stability for lists embedded in the full feature set or just limited to the
features occurring in the partial lists. The method is demonstrated first on
synthetic data in a gene filtering task and then for finding gene profiles on a
recent prostate cancer dataset.
Making Informed Choices about Microarray Data Analysis
This article describes the typical stages in the analysis of microarray data for non-specialist researchers in systems biology and medicine. Particular attention is paid to significant data analysis issues that are commonly encountered among practitioners, some of which need wider airing. The issues addressed include experimental design, quality assessment, normalization, and summarization of multiple-probe data. This article is based on the ISMB 2008 tutorial on microarray data analysis. An expanded version of the material in this article and the slides from the tutorial can be found at http://www.people.vcu.edu/~mreimers/OGMDA/index.html