
    Gamma-based clustering via ordered means with application to gene-expression analysis

    Discrete mixture models provide a well-known basis for effective clustering algorithms, although technical challenges have limited their scope. In the context of gene-expression data analysis, a model is presented that mixes over a finite catalog of structures, each one representing equality and inequality constraints among latent expected values. Computations depend on the probability that independent gamma-distributed variables attain each of their possible orderings. Each ordering event is equivalent to an event in independent negative-binomial random variables, and this finding guides a dynamic-programming calculation. The structuring of mixture-model components according to constraints among latent means leads to strict concavity of the mixture log likelihood. In addition to its beneficial numerical properties, the clustering method shows promising results in an empirical study. Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/10-AOS805 by the Institute of Mathematical Statistics (http://www.imstat.org)
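
The link between gamma orderings and negative-binomial events can be illustrated in the simplest case of two variables. The sketch below is not the paper's dynamic-programming algorithm; it assumes integer shape parameters and uses a standard Poisson-process argument: if X ~ Gamma(a, rate r1) and Y ~ Gamma(b, rate r2) are independent, then X < Y exactly when fewer than b "type 2" events precede the a-th "type 1" event in the merged process, which is a negative-binomial event with success probability r1 / (r1 + r2). The function name and parameters are hypothetical.

```python
from math import comb


def prob_gamma_less(shape_x, rate_x, shape_y, rate_y):
    """Return P(X < Y) for independent gamma variables with integer shapes.

    X ~ Gamma(shape_x, rate_x), Y ~ Gamma(shape_y, rate_y).
    Assumption for this sketch: both shapes are positive integers.
    """
    if shape_x < 1 or shape_y < 1:
        raise ValueError("shapes must be positive integers")
    if rate_x <= 0 or rate_y <= 0:
        raise ValueError("rates must be positive")

    # Probability that the next event in the merged process is "type X".
    p = rate_x / (rate_x + rate_y)

    # K = number of type-Y events before the shape_x-th type-X event,
    # K ~ NegativeBinomial(shape_x, p). The ordering X < Y is the event
    # K <= shape_y - 1.
    return sum(
        comb(shape_x + k - 1, k) * p**shape_x * (1 - p) ** k
        for k in range(shape_y)
    )
```

With both shapes equal to 1 the variables are exponential and the result reduces to the familiar r1 / (r1 + r2).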

    Bias detection and correction in RNA-Sequencing data

    Background: High-throughput sequencing technology provides unprecedented opportunities to study transcriptome dynamics. Compared to microarray-based gene expression profiling, RNA-Seq has many advantages, such as high resolution, low background, and the ability to identify novel transcripts. Moreover, for genes with multiple isoforms, expression of each isoform may be estimated from RNA-Seq data. Despite these advantages, recent work revealed that base-level read counts from RNA-Seq data may not be randomly distributed and can be affected by local nucleotide composition. It was unclear, however, how the base-level read count bias may affect gene-level expression estimates. Results: In this paper, by using five published RNA-Seq data sets from different biological sources and with different data preprocessing schemes, we showed that commonly used estimates of gene expression levels from RNA-Seq data, such as reads per kilobase of gene length per million reads (RPKM), are biased in terms of gene length, GC content and dinucleotide frequencies. We directly examined the biases at the gene level, and proposed a simple generalized-additive-model based approach to correct different sources of biases simultaneously. Compared to previously proposed base-level correction methods, our method reduces bias in gene-level expression estimates more effectively. Conclusions: Our method identifies and corrects different sources of biases in gene-level expression measures from RNA-Seq data, and provides more accurate estimates of gene expression levels from RNA-Seq. This method should prove useful in meta-analysis of gene expression levels using different platforms or experimental protocols.
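
For reference, the RPKM measure named in the abstract can be computed as below. This is a minimal sketch of the standard definition only (reads per kilobase of gene length per million mapped reads); it does not implement the bias correction the abstract describes, and the function and parameter names are hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads.

    read_count: reads mapped to the gene
    gene_length_bp: gene length in base pairs
    total_mapped_reads: total mapped reads in the sample
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total reads must be positive")
    # Scale by 1e3 (per kilobase) and 1e6 (per million reads) together.
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)
```

For example, 1,000 reads on a 2,000 bp gene in a sample of 10 million mapped reads gives an RPKM of 50. Because gene length appears only as a divisor, any residual dependence of RPKM on length, GC content or dinucleotide frequencies is the kind of gene-level bias the abstract reports.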

    Analysis of margin classification systems for assessing the risk of local recurrence after soft tissue sarcoma resection

    Purpose: To compare the ability of margin classification systems to determine local recurrence (LR) risk after soft tissue sarcoma (STS) resection. Methods: Two thousand two hundred seventeen patients with nonmetastatic extremity and truncal STS treated with surgical resection and multidisciplinary consideration of perioperative radiotherapy were retrospectively reviewed. Margins were coded by residual tumor (R) classification (in which microscopic tumor at inked margin defines R1), the R+1mm classification (in which microscopic tumor within 1 mm of ink defines R1), and the Toronto Margin Context Classification (TMCC; in which positive margins are separated into planned close but positive at critical structures, positive after whoops re-excision, and inadvertent positive margins). Multivariate competing risk regression models were created. Results: By R classification, LR rates at 10-year follow-up were 8%, 21%, and 44% in R0, R1, and R2, respectively. R+1mm classification resulted in increased R1 margins (726 v 278, P < .001), but led to decreased LR for R1 margins without changing R0 LR; for R0, the 10-year LR rate was 8% (range, 7% to 10%); for R1, the 10-year LR rate was 12% (10% to 15%). The TMCC also showed various LR rates among its tiers (P < .001). LR rates for positive margins on critical structures were not different from R0 at 10 years (11% v 8%, P = .18), whereas inadvertent positive margins had high LR (5-year, 28% [95% CI, 19% to 37%]; 10-year, 35% [95% CI, 25% to 46%]; P < .001). Conclusion: The R classification identified three distinct risk levels for LR in STS. An R+1mm classification reduced LR differences between R1 and R0, suggesting that a negative but < 1-mm margin may be adequate with multidisciplinary treatment. The TMCC provides additional stratification of positive margins that may aid in surgical planning and patient education

    Short Communication: Analysis of Minor Populations of Human Immunodeficiency Virus by Primer Identification and Insertion-Deletion and Carry Forward Correction Pipelines

    Accurate analysis of minor populations of drug-resistant HIV requires analysis of a sufficient number of viral templates. We assessed the effect of experimental conditions on the analysis of HIV pol 454 pyrosequences generated from plasma using (1) the “Insertion-deletion (indel) and Carry Forward Correction” (ICC) pipeline, which clusters sequence reads using a nonsubstitution approach and can correct for indels and carry forward errors, and (2) the “Primer Identification (ID)” method, which facilitates construction of a consensus sequence to correct for sequencing errors and allelic skewing. The Primer ID and ICC methods produced similar estimates of viral diversity, but differed in the number of sequence variants generated. Sequence preparation for ICC was comparably simple, but was limited by an inability to assess the number of templates analyzed and allelic skewing. The more costly Primer ID method corrected for allelic skewing and provided the number of viral templates analyzed, which revealed that amplifiable HIV templates varied across specimens and did not correlate with clinical viral load. This latter observation highlights the value of the Primer ID method, which by determining the number of templates amplified, enables more accurate assessment of minority species in the virus population, which may be relevant to prescribing effective antiretroviral therapy
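
The consensus idea behind the Primer ID method can be sketched as follows. This is an illustrative toy, not the pipeline used in the study: it assumes the Primer ID tag has already been extracted from each read, that reads sharing a tag are pre-aligned and of equal length, and that a simple per-position majority vote is an acceptable consensus rule. The function name and the `min_reads` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def primer_id_consensus(tagged_reads, min_reads=3):
    """Build one consensus sequence per Primer ID tag.

    tagged_reads: iterable of (primer_id, read) pairs.
    min_reads: tags with fewer reads are skipped (assumed threshold).
    Returns a dict mapping primer_id -> consensus sequence; its size
    is the number of templates with a usable consensus.
    """
    groups = defaultdict(list)
    for primer_id, read in tagged_reads:
        groups[primer_id].append(read)

    consensus = {}
    for primer_id, reads in groups.items():
        if len(reads) < min_reads:
            continue  # too few reads to out-vote sequencing errors
        if len({len(read) for read in reads}) != 1:
            continue  # this sketch only handles equal-length reads
        # Majority base at each position across reads of one template.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*reads)
        )
    return consensus
```

Because every read carrying the same tag derives from one original template, the number of keys in the returned dictionary gives a count of templates sampled, which is the quantity the abstract notes the ICC pipeline cannot provide.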

    Selection on dispersal drives evolution of metabolic capacities for energy production in female wing-polymorphic sand field crickets, Gryllus firmus

    Life history and metabolism covary, but the mechanisms and individual traits responsible for these linkages remain unresolved. Dispersal capability is a critical component of life histories that is constrained by metabolic capacities for energy production. Conflicting relationships between metabolism and life histories may be explained by accounting for variation in dispersal and maximal metabolic rates. We used female wing-polymorphic sand field crickets, Gryllus firmus, selected either for long wings (LW) and flight-capability or short wings (SW) and high early lifetime fecundity to test the hypothesis that selection on dispersal capability drives the evolution of metabolic capacities. While resting metabolic rates were similar, long-winged crickets reached higher maximal metabolic rates than short-winged crickets, resulting in improved running performance. We further provided insight into the mechanisms responsible for covariation between life history and metabolism by comparing mitochondrial content of tissues involved in powering locomotion and assessing function of mitochondria isolated from long- and short-winged crickets. This demonstrated that larger metabolic capacities in long-winged crickets were underpinned by increases in mitochondrial content of dorsoventral flight muscle and enhanced bioenergetic capacities of mitochondria within the fat body, a tissue responsible for fuel storage and mobilization. Thus, selection on flight-capability remodels metabolism in a trait and tissue-specific manner to enlarge metabolic capacities necessary for dispersal

    Simplified Paper Format for Detecting HIV Drug Resistance in Clinical Specimens by Oligonucleotide Ligation

    Human immunodeficiency virus (HIV) is a chronic infection that can be managed by antiretroviral treatment (ART). However, periods of suboptimal viral suppression during lifelong ART can select for HIV drug resistant (DR) variants. Transmission of drug resistant virus can lessen or abrogate ART efficacy. Therefore, testing of individuals for drug resistance prior to initiation of treatment is recommended to ensure effective ART. Sensitive and inexpensive HIV genotyping methods are needed in low-resource settings where most HIV infections occur. The oligonucleotide ligation assay (OLA) is a sensitive point mutation assay for detection of drug resistance mutations in HIV pol. The current OLA involves four main steps from sample to analysis: (1) lysis and/or nucleic acid extraction, (2) amplification of HIV RNA or DNA, (3) ligation of oligonucleotide probes designed to detect single nucleotide mutations that confer HIV drug resistance, and (4) analysis via oligonucleotide surface capture, denaturation, and detection (CDD). The relative complexity of these steps has limited its adoption in resource-limited laboratories. Here we describe a simplification of the 2.5-hour plate-format CDD to a 45-minute paper-format CDD that eliminates the need for a plate reader. Analysis of mutations at four HIV-1 DR codons (K103N, Y181C, M184V, and G190A) in 26 blood specimens showed a strong correlation of the ratios of mutant signal to total signal between the paper CDD and the plate CDD. The assay described makes the OLA easier to perform in low resource laboratories

    Emerging antiretroviral drug resistance in sub-Saharan Africa: novel affordable technologies are needed to provide resistance testing for individual and public health benefits

    In industrialized countries, viral load monitoring and genotypic antiretroviral drug resistance testing (GART) play an important role in the selection of initial and subsequent combination antiretroviral therapy (cART) regimens. In contrast, resource constraints in Africa limit access to assays that could detect virologic failure, transmitted drug resistance (TDR) and acquired drug resistance to cART. This has adverse consequences for both individual and public health. Although the further roll-out of antiretrovirals for prevention, including preexposure prophylaxis (PrEP) and universal test and treat (UTT) strategies, could reduce HIV-1 incidence, these strategies may increase TDR [1,2]. Here, we present arguments that the scale-up of antiretroviral use should be accompanied by cost-effective assays for early detection of virologic failure, surveillance of TDR and GART for individual patient management

    Differential control of Zap1-regulated genes in response to zinc deficiency in Saccharomyces cerevisiae

    Background: The Zap1 transcription factor is a central player in the response of yeast to changes in zinc status. We previously used transcriptome profiling with DNA microarrays to identify 46 potential Zap1 target genes in the yeast genome. In this new study, we used complementary methods to identify additional Zap1 target genes. Results: With alternative growth conditions for the microarray experiments and a more sensitive motif identification algorithm, we identified 31 new potential targets of Zap1 activation. Moreover, an analysis of the response of Zap1 target genes to a range of zinc concentrations and to zinc withdrawal over time demonstrated that these genes respond differently to zinc deficiency. Some genes are induced under mild zinc deficiency and act as a first line of defense against this stress. First-line defense genes serve to maintain zinc homeostasis by increasing zinc uptake, and by mobilizing and conserving intracellular zinc pools. Other genes respond only to severe zinc limitation and act as a second line of defense. These second-line defense genes allow cells to adapt to conditions of zinc deficiency and include genes involved in maintaining secretory pathway and cell wall function, and stress responses. Conclusion: We have identified several new targets of Zap1-mediated regulation. Furthermore, our results indicate that through the differential regulation of its target genes, Zap1 prioritizes mechanisms of zinc homeostasis and adaptive responses to zinc deficiency.