144 research outputs found

    rpartOrdinal: An R Package for Deriving a Classification Tree for Predicting an Ordinal Response

    This paper describes an R package, rpartOrdinal, that implements alternative splitting functions for fitting a classification tree when interest lies in predicting an ordinal response. This includes the generalized Gini impurity function, which was introduced as a method for predicting an ordinal response by including costs of misclassification into the impurity function, as well as an alternative ordinal impurity function due to Piccarreta (2008) that does not require the assignment of misclassification costs. The ordered twoing splitting method, which is not defined as a decrease in node impurity, is also included in the package. Since, in the ordinal response setting, misclassifying observations to adjacent categories is a less egregious error than misclassifying observations to distant categories, this package also includes a function for estimating an ordinal measure of association, the gamma statistic
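The two quantities named in this abstract can be illustrated with a minimal Python sketch. It is not the rpartOrdinal code. It assumes the usual cost-weighted form of the generalized Gini impurity (sum over pairs of distinct classes of cost times the two class proportions) with a hypothetical linear cost |j - k|, and the Goodman-Kruskal definition of gamma from concordant and discordant pairs. The function names are illustrative only.

```python
from itertools import combinations


def generalized_gini(proportions, cost=lambda j, k: abs(j - k)):
    """Cost-weighted Gini impurity: sum over j != k of cost(j, k) * p_j * p_k.

    `proportions` holds the class proportions in a node, ordered by the
    ordinal level. The default linear cost is an assumption for illustration.
    """
    n = len(proportions)
    return sum(
        cost(j, k) * proportions[j] * proportions[k]
        for j in range(n)
        for k in range(n)
        if j != k
    )


def gamma_statistic(observed, predicted):
    """Goodman-Kruskal gamma: (C - D) / (C + D), ignoring tied pairs."""
    concordant = discordant = 0
    for (o1, p1), (o2, p2) in combinations(zip(observed, predicted), 2):
        product = (o1 - o2) * (p1 - p2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)
```

With the linear cost, a node split evenly between the two extreme categories is scored as more impure than a node split evenly between two adjacent categories, which matches the abstract's point that distant misclassifications are the more serious error.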

    Generalized Monotone Incremental Forward Stagewise Method for Modeling Count Data: Application Predicting Micronuclei Frequency

    The cytokinesis-block micronucleus (CBMN) assay can be used to quantify micronucleus (MN) formation, the outcome measured being MN frequency. MN frequency has been shown to be both an accurate measure of chromosomal instability/DNA damage and a risk factor for cancer. Similarly, the Agilent 4×44k human oligonucleotide microarray can be used to quantify gene expression changes. Despite the existence of accepted methodologies to quantify both MN frequency and gene expression, very little is known about the association between the two. In modeling our count outcome (MN frequency) using gene expression levels from the high-throughput assay as our predictor variables, there are many more variables than observations. Hence, we extended the generalized monotone incremental forward stagewise method for predicting a count outcome for high-dimensional feature settings

    Graphical technique for identifying a monotonic variance stabilizing transformation for absolute gene intensity signals

    BACKGROUND: The usefulness of the log2 transformation for cDNA microarray data has led to its widespread application to Affymetrix data. For Affymetrix data, where absolute intensities are indicative of the number of transcripts, there is a systematic relationship between variance and magnitude of measurements. Application of the log2 transformation expands the scale of genes with low intensities while compressing the scale of genes with higher intensities, thus reversing the mean-variance relationship. The usefulness of these transformations needs to be examined. RESULTS: Using an Affymetrix GeneChip® dataset, problems associated with applying the log2 transformation to absolute intensity data are demonstrated. Use of the spread-versus-level plot to identify an appropriate variance stabilizing transformation is presented. For the data presented, the spread-versus-level plot identified a power transformation that successfully stabilized the variance of probe set summaries. CONCLUSION: The spread-versus-level plot is helpful for identifying transformations for variance stabilization. It is robust against outliers and avoids model assumptions and maximizations
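A spread-versus-level plot is conventionally read by fitting a line to log(spread) against log(level) and taking one minus the slope as the suggested power. The sketch below follows that textbook rule and is not taken from the paper. The inputs are hypothetical per-group summaries (for example medians as levels and interquartile ranges as spreads), and the function name is illustrative.

```python
import math


def spread_level_power(levels, spreads):
    """Suggest a variance-stabilizing power from a spread-versus-level fit.

    Fits log(spread) = a + b * log(level) by ordinary least squares and
    returns p = 1 - b. By convention p = 0 is read as the log transformation.
    All levels and spreads must be strictly positive.
    """
    xs = [math.log(v) for v in levels]
    ys = [math.log(v) for v in spreads]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("levels must not all be equal")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return 1.0 - slope
```

When the spread grows in proportion to the level the slope is 1 and the suggested power is 0 (a log transformation). When the spread grows like the square root of the level the suggested power is 0.5 (a square-root transformation).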

    Empirical validation of the S-Score algorithm in the analysis of gene expression data

    BACKGROUND: Current methods of analyzing Affymetrix GeneChip® microarray data require the estimation of probe set expression summaries, followed by application of statistical tests to determine which genes are differentially expressed. The S-Score algorithm described by Zhang and colleagues is an alternative method that allows tests of hypotheses directly from probe level data. It is based on an error model in which the detected signal is proportional to the probe pair signal for highly expressed genes, but approaches a background level (rather than 0) for genes with low levels of expression. This model is used to calculate relative change in probe pair intensities that converts probe signals into multiple measurements with equalized errors, which are summed over a probe set to form the S-Score. Assuming no expression differences between chips, the S-Score follows a standard normal distribution, allowing direct tests of hypotheses to be made. Using spike-in and dilution datasets, we validated the S-Score method against comparisons of gene expression utilizing the more recently developed methods RMA, dChip, and MAS5. RESULTS: The S-Score showed excellent sensitivity and specificity in detecting low-level gene expression changes. Rank ordering of S-Score values more accurately reflected known fold-change values compared to other algorithms. CONCLUSION: The S-Score method, utilizing probe level data directly, offers significant advantages over comparisons using only probe set expression summaries
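Because the S-Score is stated to follow a standard normal distribution under the null hypothesis, an already computed score can be turned into a two-sided p-value directly. The sketch below shows only that final step. It does not compute the S-Score itself, and the function name is an assumption made for illustration.

```python
import math


def two_sided_p_value(s_score):
    """Two-sided p-value for a score that is standard normal under the null.

    Uses P(|Z| >= |s|) = erfc(|s| / sqrt(2)) for Z ~ N(0, 1).
    """
    return math.erfc(abs(s_score) / math.sqrt(2.0))
```

For example, a score of about 1.96 in either direction corresponds to a two-sided p-value of about 0.05.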

    Penalized Ordinal Regression Methods for Predicting Stage of Cancer in High-Dimensional Covariate Spaces

    The pathological description of the stage of a tumor is an important clinical designation and is considered, like many other forms of biomedical data, an ordinal outcome. Currently, statistical methods for predicting an ordinal outcome using clinical, demographic, and high-dimensional correlated features are lacking. In this paper, we propose a method that fits an ordinal response model to predict an ordinal outcome for high-dimensional covariate spaces. Our method penalizes some covariates (high-throughput genomic features) without penalizing others (such as demographic and/or clinical covariates). We demonstrate the application of our method to predict the stage of breast cancer. In our model, breast cancer subtype is a nonpenalized predictor, and CpG site methylation values from the Illumina Human Methylation 450K assay are penalized predictors. The method has been made available in the ordinalgmifs package in the R programming environment

    Application of a correlation correction factor in a microarray cross-platform reproducibility study

    Background Recent research examining cross-platform correlation of gene expression intensities has yielded mixed results. In this study, we demonstrate use of a correction factor for estimating cross-platform correlations. Results In this paper, three technical replicate microarrays were hybridized to each of three platforms. The three platforms were then analyzed to assess both intra- and cross-platform reproducibility. We present various methods for examining intra-platform reproducibility. We also examine cross-platform reproducibility using Pearson's correlation. Additionally, we previously developed a correction factor for Pearson's correlation which is applicable when X and Y are measured with error. Herein we demonstrate that correcting for measurement error by estimating the disattenuated correlation substantially improves cross-platform correlations. Conclusion When estimating cross-platform correlation, it is essential to thoroughly evaluate intra-platform reproducibility as a first step. In addition, since measurement error is present in microarray gene expression data, methods to correct for attenuation are useful in decreasing the bias in cross-platform correlation estimates
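The idea of disattenuation can be sketched with the classical Spearman correction, which divides the observed correlation by the square root of the product of the two reliabilities. This is an assumption made for illustration: the authors' own correction factor may differ in detail, and using intra-platform replicate correlations as the reliabilities is likewise a hypothetical choice here.

```python
import math


def disattenuated_correlation(r_xy, reliability_x, reliability_y):
    """Classical Spearman correction for attenuation.

    r_xy is the observed correlation between two error-prone measurements;
    the reliabilities (each in (0, 1]) could be estimated from technical
    replicates within each platform.
    """
    for value in (reliability_x, reliability_y):
        if not 0.0 < value <= 1.0:
            raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(reliability_x * reliability_y)
```

With perfectly reliable measurements the correction leaves the correlation unchanged, and lower reliabilities inflate the observed value toward the correlation between the underlying error-free quantities.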

    Epigenetic Alterations and an Increased Frequency of Micronuclei in Women with Fibromyalgia

    Fibromyalgia (FM), characterized by chronic widespread pain, fatigue, and cognitive/mood disturbances, leads to reduced workplace productivity and increased healthcare expenses. To determine if acquired epigenetic/genetic changes are associated with FM, we compared the frequency of spontaneously occurring micronuclei (MN) and genome-wide methylation patterns in women with FM () to those seen in comparably aged healthy controls ( (MN); (methylation)). The mean (sd) MN frequency of women with FM (51.4 (21.9)) was significantly higher than that of controls (15.8 (8.5)) (; df = 1; ). Significant differences ( sites) in methylation patterns were observed between cases and controls considering a 5% false discovery rate. The majority of differentially methylated (DM) sites (91%) were attributable to increased values in the women with FM. The DM sites included significant biological clusters involved in neuron differentiation/nervous system development, skeletal/organ system development, and chromatin compaction. Genes associated with DM sites whose function has particular relevance to FM included BDNF, NAT15, HDAC4, PRKCA, RTN1, and PRKG1. Results support the need for future research to further examine the potential role of epigenetic and acquired chromosomal alterations as a possible biological mechanism underlying FM

    Reduced Expression of Inflammatory Genes in Deceased Donor Kidneys Undergoing Pulsatile Pump Preservation

    Background The use of expanded criteria donor (ECD) kidneys has been associated with worse outcomes. Whole gene expression of pre-implantation allograft biopsies from deceased donor kidneys (DDKs) was evaluated to compare the effect of pulsatile pump preservation (PPP) vs. cold storage preservation (CSP) on standard and ECD kidneys. Methodology/Principal Findings 99 pre-implantation DDK biopsies were studied using gene expression with GeneChips. Kidney transplant recipients were followed post-transplantation for 35.8 months (range = 24–62). The PPP group included 60 biopsies (cold ischemia time (CIT) = 1,367 ± 509 minutes) and the CSP group included 39 biopsies (CIT = 1,022 ± 485 minutes) (P Conclusions/Significance Inflammation was the most important up-regulated pattern associated with pre-implantation biopsies undergoing CSP even when the PPP group has a larger number of ECD kidneys. No significant difference was observed in delayed graft function incidence and graft function post-transplantation. These findings support the use of PPP in ECD donor kidneys