14 research outputs found

    P-values for classification

    Let $(X,Y)$ be a random variable consisting of an observed feature vector $X \in \mathcal{X}$ and an unobserved class label $Y \in \{1,2,\ldots,L\}$ with unknown joint distribution. In addition, let $\mathcal{D}$ be a training data set consisting of $n$ completely observed independent copies of $(X,Y)$. Usual classification procedures provide point predictors (classifiers) $\widehat{Y}(X,\mathcal{D})$ of $Y$ or estimate the conditional distribution of $Y$ given $X$. In order to quantify the certainty of classifying $X$ we propose to construct for each $\theta = 1,2,\ldots,L$ a p-value $\pi_{\theta}(X,\mathcal{D})$ for the null hypothesis that $Y=\theta$, treating $Y$ temporarily as a fixed parameter. In other words, the point predictor $\widehat{Y}(X,\mathcal{D})$ is replaced with a prediction region for $Y$ with a certain confidence. We argue that (i) this approach is advantageous over traditional approaches and (ii) any reasonable classifier can be modified to yield nonparametric p-values. We discuss issues such as optimality, single use and multiple use validity, as well as computational and graphical aspects. Comment: Published in at http://dx.doi.org/10.1214/08-EJS245 the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org)
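    The idea of a p-value for each candidate label can be illustrated with a minimal sketch. It is not the construction from the paper: the atypicality score (mean distance to the `k` nearest points of the same class) and the names `class_p_values` and `prediction_region` are assumptions chosen for illustration. The new point is pooled with the training points of class $\theta$, and the p-value is the fraction of pooled points that look at least as atypical as the new one.

```python
import numpy as np


def _atypicality(i, pool, k):
    # Hypothetical score: mean distance from pool[i] to its k nearest
    # neighbours in the pool (larger means less typical for the class).
    d = np.linalg.norm(pool - pool[i], axis=1)
    d = np.delete(d, i)  # drop the zero distance to itself
    return np.sort(d)[:k].mean()


def class_p_values(x_new, X_train, y_train, n_classes, k=3):
    """Rank-based p-value for each null hypothesis Y = theta.

    Assumes labels are 1..n_classes and that every class has at least
    one training point.
    """
    p = np.empty(n_classes)
    for theta in range(1, n_classes + 1):
        # Pool the new point (last row) with the training points of class theta.
        pool = np.vstack([X_train[y_train == theta], x_new[None, :]])
        scores = np.array([_atypicality(i, pool, k) for i in range(len(pool))])
        # Fraction of pooled points at least as atypical as the new point.
        p[theta - 1] = np.mean(scores >= scores[-1])
    return p


def prediction_region(p_values, alpha=0.05):
    # Keep every label whose null hypothesis is not rejected at level alpha.
    return [theta for theta, p in enumerate(p_values, start=1) if p > alpha]
```

    Because the score treats all pooled points symmetrically, the p-value for the true label is valid under exchangeability; a region may contain several labels or none, which is how the approach expresses uncertainty.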

    Genetic association studies for gene expressions: permutation-based mutual information in a comparison with standard ANOVA and as a novel approach for feature selection

    Mutual information (MI) is a robust nonparametric statistical approach for identifying associations between genotypes and gene expression levels. Using the data of Problem 1 provided for the Genetic Analysis Workshop 15, we first compared a quantitative MI (Tsalenko et al. 2006 J Bioinform Comput Biol 4:259–4) with the standard analysis of variance (ANOVA) and the nonparametric Kruskal-Wallis (KW) test. We then proposed a novel feature selection approach using MI in a classification scenario to address the small-n, large-p problem and compared it with a feature selection that relies on an asymptotic χ² distribution. In both applications, we used a permutation-based approach for evaluating the significance of MI. Substantial discrepancies in significance were observed between MI, ANOVA, and KW that can be explained by different empirical distributions of the data. In contrast to ANOVA and KW, MI detects shifts in location when the data are non-normally distributed, skewed, or contaminated with outliers. ANOVA, but not MI, is often significant if one genotype with a small frequency has a remarkable difference in the average gene expression level relative to the other two genotypes. MI depends on genotype frequencies and cannot detect these differences. In the classification scenario, we show that our novel approach for feature selection identifies a smaller list of markers with higher accuracy compared to the standard method. In conclusion, permutation-based MI approaches provide reliable and flexible statistical frameworks which seem to be well suited for data that are non-normal, skewed, or have an otherwise peculiar distribution. They merit further methodological investigation.
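    A permutation-based significance test for MI can be sketched as follows. This is not the quantitative MI of Tsalenko et al. used in the study: the sketch assumes a plain plug-in estimator on quantile-binned expression values, and the function names, the number of bins and the number of permutations are illustrative choices.

```python
import numpy as np


def mutual_information(a, b):
    """Plug-in mutual information (in nats) between two discrete 1-D arrays."""
    _, ai = np.unique(np.asarray(a).ravel(), return_inverse=True)
    _, bi = np.unique(np.asarray(b).ravel(), return_inverse=True)
    joint = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(joint, (ai, bi), 1.0)      # contingency table of counts
    joint /= joint.sum()                 # joint relative frequencies
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                       # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))


def permutation_p_value(genotype, expression, n_bins=3, n_perm=999, seed=0):
    """Observed MI and its permutation p-value (assumed binning scheme)."""
    rng = np.random.default_rng(seed)
    # Discretise expression at its inner quantiles (an assumption of this sketch).
    edges = np.quantile(expression, np.linspace(0, 1, n_bins + 1)[1:-1])
    binned = np.digitize(expression, edges)
    observed = mutual_information(genotype, binned)
    # Shuffling genotypes breaks any association while keeping both marginals.
    exceed = sum(
        mutual_information(rng.permutation(genotype), binned) >= observed
        for _ in range(n_perm)
    )
    return observed, (1 + exceed) / (1 + n_perm)
```

    Adding one to numerator and denominator counts the observed labelling as one of the permutations, so the p-value can never be exactly zero.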

    Evaluation of high-sensitivity C-reactive protein and uric acid in vericiguat-treated patients with heart failure with reduced ejection fraction

    Aims: The effects of vericiguat vs. placebo on high-sensitivity C-reactive protein (hsCRP) and serum uric acid (SUA) were assessed in patients with heart failure with reduced ejection fraction (HFrEF) in the Phase 2 SOCRATES-REDUCED study (NCT01951625). Methods and results: Changes from baseline hsCRP and SUA values at 12 weeks with placebo and vericiguat (1.25 mg, 2.5 mg, 5.0 mg and 10.0 mg, respectively) were assessed. The probability of achieving an hsCRP value of ≤3.0 mg/L or SUA value of <7.0 mg/dL at week 12 was tested. Median baseline hsCRP and SUA levels were 3.68 mg/L [interquartile range (IQR) 1.41–8.41; n = 335] and 7.80 mg/dL (IQR 6.40–9.33; n = 348), respectively. Baseline-adjusted mean percentage changes in hsCRP were 0.2%, −19.5%, −24.3%, −25.7% and −31.9% in the placebo and vericiguat 1.25 mg, 2.5 mg, 5.0 mg and 10.0 mg groups, respectively; significance vs. placebo was observed in the vericiguat 10.0 mg group (P = 0.035). Baseline-adjusted mean percentage changes in SUA were 5.0%, −1.3%, −1.1%, −3.5% and −5.3% in the placebo, and vericiguat 1.25 mg, 2.5 mg, 5.0 mg and 10.0 mg groups, respectively; significance vs. placebo was observed in the 5.0 mg and 10.0 mg groups (P = 0.0202 and P = 0.004, respectively). Estimated probability for an end-of-treatment hsCRP value of ≤3.0 mg/L and SUA value of <7.0 mg/dL was higher with vericiguat compared with placebo. The effect was dose-dependent, with the greatest effect observed in the 10.0 mg group. Conclusions: Vericiguat treatment for 12 weeks was associated with reductions in hsCRP and SUA, and a higher likelihood of achieving an hsCRP value of ≤3.0 mg/L and SUA value of <7.0 mg/dL

    In vivo alkaline comet assay: Statistical considerations on historical negative and positive control data

    The alkaline comet assay is frequently used as an in vivo follow-up test within different regulatory environments to characterize the DNA-damaging potential of different test items. The corresponding OECD Test Guideline 489 highlights the importance of statistical analyses and historical control data (HCD) but does not provide detailed procedures. Therefore, the working group “Statistics” of the German-speaking Society for Environmental Mutation Research (GUM) collected HCD from five laboratories and >200 comet assay studies and performed several statistical analyses. Key results included that (I) observed large inter-laboratory effects argue against the use of absolute quality thresholds, (II) >50% zero values on a slide are considered problematic, due to their influence on slide or animal summary statistics, (III) the type of summarizing measure for single-cell data (e.g., median, arithmetic and geometric mean) may lead to extreme differences in resulting animal tail intensities and study outcome in the HCD. These summarizing values increase the reliability of analysis results by better meeting statistical model assumptions, but at the cost of information loss. Furthermore, the relation between negative and positive control groups in the data set was always satisfactory (or sufficient) based on ratio, difference and quantile analyses.

    Genome-wide association identifies nine common variants associated with fasting proinsulin levels and provides new insights into the pathophysiology of type 2 diabetes.

    OBJECTIVE: Proinsulin is a precursor of mature insulin and C-peptide. Higher circulating proinsulin levels are associated with impaired β-cell function, raised glucose levels, insulin resistance, and type 2 diabetes (T2D). Studies of the insulin processing pathway could provide new insights about T2D pathophysiology. RESEARCH DESIGN AND METHODS: We have conducted a meta-analysis of genome-wide association tests of ∼2.5 million genotyped or imputed single nucleotide polymorphisms (SNPs) and fasting proinsulin levels in 10,701 nondiabetic adults of European ancestry, with follow-up of 23 loci in up to 16,378 individuals, using additive genetic models adjusted for age, sex, fasting insulin, and study-specific covariates. RESULTS: Nine SNPs at eight loci were associated with proinsulin levels (P < 5 × 10⁻⁸). Two loci (LARP6 and SGSM2) have not been previously related to metabolic traits, one (MADD) has been associated with fasting glucose, one (PCSK1) has been implicated in obesity, and four (TCF7L2, SLC30A8, VPS13C/C2CD4A/B, and ARAP1, formerly CENTD2) increase T2D risk. The proinsulin-raising allele of ARAP1 was associated with a lower fasting glucose (P = 1.7 × 10⁻⁴), improved β-cell function (P = 1.1 × 10⁻⁵), and lower risk of T2D (odds ratio 0.88; P = 7.8 × 10⁻⁶). Notably, PCSK1 encodes the protein prohormone convertase 1/3, the first enzyme in the insulin processing pathway. A genotype score composed of the nine proinsulin-raising alleles was not associated with coronary disease in two large case-control datasets. CONCLUSIONS: We have identified nine genetic variants associated with fasting proinsulin. Our findings illuminate the biology underlying glucose homeostasis and T2D development in humans and argue against a direct role of proinsulin in coronary artery disease pathogenesis.

    A dynamic model of circadian rhythms in rodent tail skin temperature for comparison of drug effects

    Abstract: Menopause-associated thermoregulatory dysfunction can lead to symptoms such as hot flushes, which severely impair the quality of life of affected women. Treatment effects are often assessed with the ovariectomized rat model, which provides time series of tail skin temperature measurements in which circadian rhythms are a fundamental ingredient. In this work, a new statistical strategy is presented for analyzing such stochastic-dynamic data with the aim of detecting successful drugs in hot flush treatment. The circadian component is represented by a nonlinear dynamical system which is defined by the van der Pol equation and provides well-interpretable model parameters. Results regarding the statistical evaluation of these parameters are presented.
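    For orientation, the van der Pol oscillator in its textbook form is $\ddot{x} - \mu (1 - x^2)\,\dot{x} + \omega^2 x = 0$; the parameterization actually used in the paper is not given in the abstract, so the form, the parameter names `mu` and `omega`, and the fixed-step Runge-Kutta integrator below are assumptions of this sketch. It shows the self-sustained oscillation that makes the equation a natural candidate for a circadian component.

```python
import numpy as np


def van_der_pol_rhs(state, mu, omega):
    # First-order form: x' = v, v' = mu * (1 - x^2) * v - omega^2 * x
    x, v = state
    return np.array([v, mu * (1.0 - x**2) * v - omega**2 * x])


def simulate(mu=1.0, omega=1.0, x0=0.1, v0=0.0, dt=0.01, n_steps=5000):
    """Integrate the oscillator with classical fourth-order Runge-Kutta steps."""
    traj = np.empty((n_steps + 1, 2))
    traj[0] = (x0, v0)
    for i in range(n_steps):
        s = traj[i]
        k1 = van_der_pol_rhs(s, mu, omega)
        k2 = van_der_pol_rhs(s + 0.5 * dt * k1, mu, omega)
        k3 = van_der_pol_rhs(s + 0.5 * dt * k2, mu, omega)
        k4 = van_der_pol_rhs(s + dt * k3, mu, omega)
        traj[i + 1] = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj


if __name__ == "__main__":
    traj = simulate()
    # After the transient, the trajectory settles on a limit cycle whose
    # amplitude does not depend on the starting point.
    print("late-time amplitude:", np.abs(traj[-1000:, 0]).max())
```

    Starting from a small displacement, the amplitude grows until the nonlinear damping term balances it, so the rhythm is maintained without external forcing; `mu` controls how strongly the waveform departs from a sinusoid.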

    Genetic association studies for gene expressions: permutation-based mutual information in a comparison with standard ANOVA and as a novel approach for feature selection-3

    Copyright information: Taken from "Genetic association studies for gene expressions: permutation-based mutual information in a comparison with standard ANOVA and as a novel approach for feature selection", http://www.biomedcentral.com/1753-6561/1/S1/S9. BMC Proceedings 2007;1(Suppl 1):S9-S9. Published online 18 Dec 2007. PMCID: PMC2359872.