Differential expression analysis for sequence count data
*Motivation:* High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power.

*Results:* We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power. 

*Availability:* A free open-source R software package, _DESeq_, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq
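The variance-mean link described in the abstract above can be illustrated with a minimal Python sketch. It assumes the common parameterisation Var = mu + alpha * mu^2 with a fixed dispersion alpha, which is a simplification: the abstract states that variance and mean are linked by local regression, and this sketch is not the DESeq implementation. The function name `nb_params` and the values of `mu` and `alpha` are hypothetical.

```python
import numpy as np

def nb_params(mu, alpha):
    """Convert (mean, dispersion) to NumPy's (n, p) parameterisation.

    Assumed model (not taken from the paper): Var = mu + alpha * mu**2.
    """
    n = 1.0 / alpha
    p = n / (n + mu)
    return n, p

mu, alpha = 100.0, 0.1                      # hypothetical mean and dispersion
n, p = nb_params(mu, alpha)

# Theoretical moments of the negative binomial in (n, p) form.
nb_mean = n * (1 - p) / p
nb_var = n * (1 - p) / p**2

# Simulated counts for one gene; a Poisson model would give variance == mean.
rng = np.random.default_rng(0)
counts = rng.negative_binomial(n, p, size=200_000)
print(nb_mean, nb_var, counts.mean(), counts.var())
```

The simulated variance is far larger than the mean, which is the overdispersion that a Poisson model cannot capture and that motivates the negative binomial error model.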
A Fast Algorithm for Robust Regression with Penalised Trimmed Squares
The presence of groups containing high leverage outliers makes linear
regression a difficult problem due to the masking effect. The available high
breakdown estimators based on Least Trimmed Squares often do not succeed in
detecting masked high leverage outliers in finite samples.
An alternative to the LTS estimator, called Penalised Trimmed Squares (PTS)
estimator, was introduced by the authors in \cite{ZiouAv:05,ZiAvPi:07} and it
appears to be less sensitive to the masking problem. This estimator is defined
by a Quadratic Mixed Integer Programming (QMIP) problem, where in the objective
function a penalty cost for each observation is included which serves as an
upper bound on the residual error for any feasible regression line. Since the
PTS does not require presetting the number of outliers to delete from the data
set, it has better efficiency than other estimators. However, due to
the high computational complexity of the resulting QMIP problem, computing exact
solutions for moderately large regression problems is infeasible.
In this paper we further establish the theoretical properties of the PTS
estimator, such as high breakdown and efficiency, and propose an approximate
algorithm called Fast-PTS to compute the PTS estimator for large data sets
efficiently. Extensive computational experiments on sets of benchmark instances
with varying degrees of outlier contamination indicate that the proposed
algorithm performs well in identifying groups of high leverage outliers in
reasonable computational time.
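The penalised objective described above can be sketched in Python as follows. For a fixed regression line, each observation contributes either its squared residual or its penalty cost, whichever is smaller, so the penalty acts as an upper bound on the residual error. The local search below is a simple alternating heuristic chosen for illustration; it is an assumption of this sketch and is NOT the Fast-PTS algorithm of the paper. The names `pts_objective` and `pts_local_search`, the scalar penalty, and the toy data are all hypothetical.

```python
import numpy as np

def pts_objective(X, y, beta, penalty):
    """Sum of squared residuals, each capped at its penalty cost."""
    r2 = (y - X @ beta) ** 2
    return float(np.sum(np.minimum(r2, penalty)))

def pts_local_search(X, y, penalty, max_iter=50):
    """Alternate between OLS on the kept points and re-selecting kept points.

    Illustrative heuristic only; not the authors' Fast-PTS algorithm.
    """
    keep = np.ones(len(y), dtype=bool)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(max_iter):
        r2 = (y - X @ beta) ** 2
        new_keep = r2 <= penalty            # delete a point if paying the penalty is cheaper
        if new_keep.sum() < X.shape[1] or np.array_equal(new_keep, keep):
            break
        keep = new_keep
        beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    return beta, keep

# Toy data: y = 1 + 2x with three gross outliers (hypothetical example).
x = np.arange(20.0)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
y[[5, 10, 15]] += 100.0

beta, keep = pts_local_search(X, y, penalty=30.0**2)
objective = pts_objective(X, y, beta, penalty=30.0**2)
print(beta, keep.sum(), objective)
```

Note that the number of deleted observations is not fixed in advance; it follows from the penalty, which is the property the abstract contrasts with LTS. A heuristic started from ordinary least squares, as here, can still be misled by masked high leverage outliers.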
Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset
Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multifactorial studies with complex experimental designs are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy, such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for bioinformatics.
Prognosis of the individual course of disease - steps in developing a decision support tool for Multiple Sclerosis
Background: Multiple sclerosis is a chronic disease of uncertain aetiology. Variations in its disease course make it difficult to impossible to accurately determine the prognosis of individual patients. The Sylvia Lawry Centre for Multiple Sclerosis Research (SLCMSR) developed an "online analytical processing (OLAP)" tool that takes advantage of extant clinical trials data and allows one to model the near term future course of this chronic disease for an individual patient.
Results: For a given patient the most similar patients of the SLCMSR database are intelligently selected by a model-based matching algorithm integrated into an OLAP-tool to enable real time, web-based statistical analyses. The underlying database (last update April 2005) contains 1,059 patients derived from 30 placebo arms of controlled clinical trials. Demographic information on the entire database and the portion selected for comparison are displayed. The result of the statistical comparison is provided as a display of the course of Expanded Disability Status Scale (EDSS) for individuals in the database with regions of probable progression over time, along with their mean relapse rate. Kaplan-Meier curves for time to sustained progression in the EDSS and time to requirement of constant assistance to walk (EDSS 6) are also displayed. The software-application OLAP anticipates the input MS patient's course on the basis of baseline values and the known course of disease for similar patients who have been followed in clinical trials.
Conclusion: This simulation could be useful for physicians, researchers and other professionals who counsel patients on therapeutic options. The application can be modified for studying the natural history of other chronic diseases, if and when similar datasets on which the OLAP operates exist.
Genome-wide DNA methylation analysis for diabetic nephropathy in type 1 diabetes mellitus
BACKGROUND: Diabetic nephropathy is a serious complication of diabetes mellitus and is associated with considerable morbidity and high mortality. There is increasing evidence to suggest that dysregulation of the epigenome is involved in diabetic nephropathy. We assessed whether epigenetic modification of DNA methylation is associated with diabetic nephropathy in a case-control study of 192 Irish patients with type 1 diabetes mellitus (T1D). Cases had T1D and nephropathy whereas controls had T1D but no evidence of renal disease. METHODS: We performed DNA methylation profiling in bisulphite converted DNA from cases and controls using the recently developed Illumina Infinium® HumanMethylation27 BeadChip, which enables the direct investigation of 27,578 individual cytosines at CpG loci throughout the genome, focused on the promoter regions of 14,495 genes. RESULTS: Singular Value Decomposition (SVD) analysis indicated that significant components of DNA methylation variation correlated with patient age, time to onset of diabetic nephropathy, and sex. Adjusting for confounding factors using multivariate Cox-regression analyses, and with a false discovery rate (FDR) of 0.05, we observed 19 CpG sites that demonstrated correlations with time to development of diabetic nephropathy. Of note, this included one CpG site located 18 bp upstream of the transcription start site of UNC13B, a gene in which the first intronic SNP rs13293564 has recently been reported to be associated with diabetic nephropathy. CONCLUSION: This high throughput platform was able to successfully interrogate the methylation state of individual cytosines and identified 19 prospective CpG sites associated with risk of diabetic nephropathy. These differences in DNA methylation are worthy of further follow-up in replication studies using larger cohorts of diabetic patients with and without nephropathy.
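Selecting sites at an FDR of 0.05 across thousands of CpG loci can be sketched in Python as follows. The abstract does not state which FDR procedure was used; the Benjamini-Hochberg step-up procedure below is an assumption chosen because it is widely used, and the function name and the p-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level alpha."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)                        # indices of p-values, ascending
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])         # largest rank meeting its threshold
        rejected[order[:k + 1]] = True           # reject everything up to that rank
    return rejected

# Hypothetical per-site p-values (e.g. one per CpG site), in arbitrary order.
pvals = [0.041, 0.001, 0.205, 0.008, 0.039, 0.06, 0.216, 0.042, 0.074, 0.212]
mask = benjamini_hochberg(pvals, alpha=0.05)
print(mask.sum(), "sites pass the FDR threshold")
```

With these hypothetical inputs only the two smallest p-values pass, even though five are below 0.05, which shows how the procedure tightens the threshold when many tests are run.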
Integrated genomics and proteomics define huntingtin CAG length-dependent networks in mice.
To gain insight into how mutant huntingtin (mHtt) CAG repeat length modifies Huntington's disease (HD) pathogenesis, we profiled mRNA in over 600 brain and peripheral tissue samples from HD knock-in mice with increasing CAG repeat lengths. We found repeat length-dependent transcriptional signatures to be prominent in the striatum, less so in cortex, and minimal in the liver. Coexpression network analyses revealed 13 striatal and 5 cortical modules that correlated highly with CAG length and age, and that were preserved in HD models and sometimes in patients. Top striatal modules implicated mHtt CAG length and age in graded impairment in the expression of identity genes for striatal medium spiny neurons and in dysregulation of cyclic AMP signaling, cell death and protocadherin genes. We used proteomics to confirm 790 genes and 5 striatal modules with CAG length-dependent dysregulation at the protein level, and validated 22 striatal module genes as modifiers of mHtt toxicities in vivo
Increased searching and handling effort in tall swards lead to a Type IV functional response in small grazing herbivores
Understanding the functional response of species is important in comprehending the species’ population dynamics and the functioning of multi-species assemblages. A Type II functional response, where instantaneous intake rate increases asymptotically with sward biomass, is thought to be common in grazers. However, at tall, dense swards, food intake might decline due to mechanical limitations or if animals selectively forage on the most nutritious parts of a sward, leading to a Type IV functional response, especially for smaller herbivores. We tested the predictions that bite mass, cropping time, swallowing time and searching time increase, and bite rate decreases with increasing grass biomass for different-sized Canada geese (Branta canadensis) foraging on grass swards. Bite mass indeed showed an increasing asymptotic relationship with grass biomass. At high biomass, difficulties in handling long leaves and in locating bites were responsible for increasing cropping, swallowing, and searching times. Constant bite mass and decreasing bite rate caused the intake rate to decrease at high sward biomass after reaching an optimum, leading to a Type IV functional response. Grazer body mass affected maximum bite mass and intake rate, but did not change the shape of the functional response. As grass nutrient contents are usually highest in short swards, this Type IV functional response in geese leads to an intake rate that is maximised in these swards. The lower grass biomass at which intake rate was maximised allows resource partitioning between different-sized grazers. We argue that this Type IV functional response is of more importance than previously thought
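The difference between the two response shapes can be sketched numerically in Python. The sketch assumes the standard Holling Type II disc equation and a Monod-Haldane form for the dome-shaped Type IV response; the study itself builds intake rate from bite mass and handling and searching times, so these closed forms and all parameter values are illustrative assumptions, not the authors' fitted model.

```python
import numpy as np

def type2_intake(biomass, a, h):
    """Holling Type II: intake rises and saturates towards 1 / h."""
    return a * biomass / (1.0 + a * h * biomass)

def type4_intake(biomass, a, b, c):
    """Dome-shaped Type IV (Monod-Haldane form); peaks at 1 / sqrt(c)."""
    return a * biomass / (1.0 + b * biomass + c * biomass**2)

# Hypothetical parameters and a biomass gradient in arbitrary units.
a, h, b, c = 0.5, 0.2, 0.1, 0.0004
biomass = np.linspace(0.0, 300.0, 3001)

t2 = type2_intake(biomass, a, h)
t4 = type4_intake(biomass, a, b, c)

# Biomass at which the Type IV intake rate is maximised.
peak_biomass = biomass[np.argmax(t4)]
print(peak_biomass, t4.max(), t4[-1], t2[-1])
```

Under Type II the intake rate keeps increasing with biomass, whereas under Type IV it peaks at an intermediate biomass and declines afterwards, which is the pattern the abstract reports for geese on tall swards.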
Retinal Pathology of Pediatric Cerebral Malaria in Malawi
Introduction
The causes of coma and death in cerebral malaria remain unknown. Malarial retinopathy has been identified as an important clinical sign in the diagnosis and prognosis of cerebral malaria. As part of a larger autopsy study to determine causes of death in children with coma presenting to hospital in Blantyre, Malawi, who were fully evaluated clinically prior to death, we examined the histopathology of eyes of patients who died and underwent autopsy.
Methodology/Principal Findings
Children with coma were admitted to the pediatric research ward, classified according to clinical definitions as having cerebral malaria or another cause of coma, evaluated and treated. The eyes were examined by direct and indirect ophthalmoscopy. If a child died and permission was given, a standardized autopsy was carried out. The patient was then assigned an actual cause of death according to the autopsy findings. The eyes were examined pathologically for hemorrhages, cystoid macular edema, parasite sequestration and thrombi. They were stained immunohistochemically for fibrin and CD61 to identify the components of thrombi, β-amyloid precursor protein to detect axonal damage, for fibrinogen to identify vascular leakage and for glial fibrillary acidic protein to detect gliosis. Sixty-four eyes from 64 patients were examined: 35 with cerebral malaria and 29 with comas of other causes. Cerebral malaria was distinguished by sequestration of parasitized erythrocytes, the presence and severity of retinal hemorrhages, the presence of cystoid macular edema, the occurrence and number of fibrin-platelet thrombi, the presence and amount of axonal damage and vascular leakage.
Conclusions/Significance
We found significant differences in retinal histopathology between patients who died of cerebral malaria and those with other diagnoses. These histopathological findings offer insights into the etiology of malarial retinopathy and provide a pathological basis for recently described retinal capillary non-perfusion in children with malarial retinopathy. Because of the similarities between the retina and the brain, they also suggest mechanisms that may contribute to coma and death in cerebral malaria.
Genome-Wide Mutagenesis Reveals That ORF7 Is a Novel VZV Skin-Tropic Factor
The Varicella Zoster Virus (VZV) is a ubiquitous human alpha-herpesvirus that is the causative agent of chicken pox and shingles. Although an attenuated VZV vaccine (v-Oka) has been widely used in children in the United States, chicken pox outbreaks are still seen, and the shingles vaccine only reduces the risk of shingles by 50%. Therefore, VZV remains an important public health concern. Knowledge of VZV replication and pathogenesis remains limited due to its highly cell-associated nature in cultured cells, the difficulty of generating recombinant viruses, and VZV's almost exclusive tropism for human cells and tissues. In order to circumvent these hurdles, we cloned the entire VZV (p-Oka) genome into a bacterial artificial chromosome that included a dual-reporter system (GFP and luciferase reporter genes). We used PCR-based mutagenesis and the homologous recombination system in E. coli to individually delete each of the genome's 70 unique ORFs. The collection of viral mutants obtained was systematically examined both in MeWo cells and in cultured human fetal skin organ samples. We used our genome-wide deletion library to provide novel functional annotations to 51% of the VZV proteome. We found 44 out of 70 VZV ORFs to be essential for viral replication. Among the 26 non-essential ORF deletion mutants, eight have discernible growth defects in MeWo cells. Interestingly, four ORFs were found to be required for viral replication in skin organ cultures, but not in MeWo cells, suggesting their potential roles as skin tropism factors. One of the genes (ORF7) has never been described as a skin tropic factor. The global profiling of the VZV genome gives further insights into the replication and pathogenesis of this virus, which can lead to improved prevention and therapy of chicken pox and shingles.