43 research outputs found

    Nonparametric Tests for Multivariate Association

    Testing for association between a multivariate response and predictors is an important statistical problem. In this paper, we present nonparametric procedures that make no specific assumptions about the distribution, regression function, or covariance matrix. Our tests are motivated by recent results on MANOVA tests for a large number of groups. Two types of tests are proposed. While it is natural to follow the classical approach and construct the test by considering all the variables jointly, we also investigate a composite test in which variable-by-variable univariate tests are combined to form a multivariate test. The asymptotic distributions of the test statistics are derived in a unified manner by deriving the asymptotic matrix variate normal distribution of the random matrices involved in the construction of the statistics. The tests have good numerical performance in finite samples. The application of the methods is illustrated with gene expression profiling of bronchial airway brushings.

    Generalized Nonparametric Composite Tests for High-Dimensional Data

    In this paper, composite high-dimensional nonparametric tests for two samples are proposed using component-wise Wilcoxon–Mann–Whitney-type statistics. No distributional assumption, moment condition, or parametric model is required for the development of the tests and the theoretical results. Two approaches are employed for estimating the asymptotic variance of the composite statistic, leading to two tests. Both involve banding the covariance matrix to estimate the variance of the test statistic. An adaptive algorithm for selecting the banding window width is proposed. Numerical studies show the favorable performance of the new tests in finite samples and under varying degrees of dependence.
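
The two ingredients named in this abstract, component-wise Wilcoxon–Mann–Whitney-type statistics and banding of a covariance matrix, can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `componentwise_wmw` and `band` are hypothetical, the banding width is supplied by hand rather than chosen by the adaptive algorithm the paper proposes, and no test statistic or p-value is computed.

```python
import numpy as np


def componentwise_wmw(x, y):
    """Per-variable estimate of P(X < Y) + 0.5 * P(X = Y).

    x has shape (n1, d) and y has shape (n2, d); the result has shape (d,).
    Values near 0.5 indicate no tendency for one sample to exceed the other
    in that variable. (Illustrative helper, not taken from the paper.)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Compare every row of x with every row of y, variable by variable.
    # Broadcasting yields arrays of shape (n1, n2, d).
    less = x[:, None, :] < y[None, :, :]
    ties = x[:, None, :] == y[None, :, :]
    return (less + 0.5 * ties).mean(axis=(0, 1))


def band(matrix, width):
    """Zero every entry more than `width` positions off the main diagonal.

    `width` is an assumed, user-chosen window; the paper selects it adaptively.
    """
    matrix = np.asarray(matrix, dtype=float)
    idx = np.arange(matrix.shape[0])
    keep = np.abs(idx[:, None] - idx[None, :]) <= width
    return np.where(keep, matrix, 0.0)


if __name__ == "__main__":
    x = [[1, 5], [2, 6]]
    y = [[3, 5], [4, 0]]
    print(componentwise_wmw(x, y))   # one estimate per variable
    print(band(np.ones((3, 3)), 1))  # tridiagonal pattern of ones
```

Banding keeps only the dependence between variables that are close in index order, so it presumes that the ordering of the variables is meaningful.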

    A Roadmap for Building Data Science Capacity for Health Discovery and Innovation in Africa

    Technological advances now make it possible to generate data of diverse kinds, complexity and size in a wide range of applications from business to engineering to medicine. In the health sciences, in particular, data are being produced at an unprecedented rate across the full spectrum of scientific inquiry spanning basic biology, clinical medicine, public health and health care systems. Leveraging these data can accelerate scientific advances, health discovery and innovations. However, data are just the raw material required to generate new knowledge, not knowledge on its own, just as a pile of bricks would not be mistaken for a building. In order to solve complex scientific problems, appropriate methods, tools and technologies must be integrated with domain expertise to generate and analyze big data. This integrated interdisciplinary approach is what has come to be widely known as data science. Although the discipline of data science has been rapidly evolving over the past couple of decades in resource-rich countries, the situation is bleak in resource-limited settings such as most countries in Africa, primarily due to a lack of well-trained data scientists. In this paper, we highlight a roadmap for building capacity in health data science in Africa to help spur health discovery and innovation, and propose a sustainable potential solution consisting of three key activities: graduate-level training, faculty development, and stakeholder engagement. We also outline potential challenges and mitigating strategies.

    Accurate Inference for Repeated Measures in High Dimensions

    This paper proposes inferential methods for high-dimensional repeated measures in factorial designs. High-dimensional refers to the situation where the dimension grows with the sample size such that either one could be larger than the other. The most important contribution is the high accuracy of the methods, in the sense that p-values, for example, are accurate up to the second order. Second-order accuracy in sample size as well as dimension is achieved by obtaining an asymptotic expansion of the distribution of the test statistics and estimating the parameters of the approximate distribution with second-order consistency. The methods are presented in a unified and succinct manner that covers general factorial designs as well as any comparisons among the cell means. Expressions for the asymptotic power are derived under two reasonable local alternatives. A simulation study provides evidence of a gain in accuracy and power compared with limiting-distribution approximations and other competing methods for high-dimensional repeated measures analysis. The application of the methods is illustrated with real data from an electroencephalogram (EEG) study of alcoholic and control subjects.

    A comparison of recent nonparametric methods for testing effects in two-by-two factorial designs

    The two-way, two-level crossed factorial design is commonly used by practitioners in the exploratory phase of industrial experiments. The F-test in the usual linear model for analysis of variance (ANOVA) is a key instrument for assessing the impact of each factor and of their interactions on the response variable. However, if assumptions such as normal distribution and homoscedasticity of errors are violated, the conventional wisdom is to resort to nonparametric tests. Nonparametric methods, rank-based as well as permutation, have been the subject of recent investigations to make them effective in testing the hypotheses of interest and to improve their performance in small-sample situations. In this study, we assess the performance of some nonparametric methods and, more importantly, we compare their power. Specifically, we examine three permutation methods (Constrained Synchronized Permutations, Unconstrained Synchronized Permutations and Wald-Type Permutation Test), a rank-based method (Aligned Rank Transform) and a parametric method (ANOVA-Type Test). In the simulations, we generate datasets with different configurations of error distribution, variance, factor effects and number of replicates. The objective is to provide practical advice and guidance for practitioners regarding the sensitivity of the tests in the various configurations, the conditions under which some tests cannot be used, the tradeoff between power and type I error, and the bias in power for the analysis of one main factor due to the presence of an effect of the other factor. A dataset from an industrial engineering experiment on thermoformed packaging production is used to illustrate the application of the various methods of analysis, taking into account the power of the test suggested by the objective of the experiment.

    The nonparametric Behrens‐Fisher problem with dependent replicates

    Purely nonparametric methods are developed for general two-sample problems in which each experimental unit may have an individual number of possibly correlated replicates. In particular, equality of the variances, or higher moments, of the distributions of the data is not assumed, even under the null hypothesis of no treatment effect. Thus, a solution for the so-called nonparametric Behrens-Fisher problem is proposed for such models. The methods are valid for metric, count, ordered categorical, and even dichotomous data in a unified way. Point estimators of the treatment effects as well as their asymptotic distributions are studied in detail. For small sample sizes, the distributions of the proposed test statistics are approximated using Satterthwaite-Welch-type t-approximations. Extensive simulation studies show favorable performance of the new methods, in particular in small-sample situations. A real data set illustrates the application of the proposed methods.

    Nonparametric Inference for Multivariate Data: The R Package npmv

    We introduce the R package npmv that performs nonparametric inference for the comparison of multivariate data samples and provides the results in easy-to-understand, but statistically correct, language. Unlike in classical multivariate analysis of variance, multivariate normality is not required for the data. In fact, the different response variables may even be measured on different scales (binary, ordinal, quantitative). p-values are calculated for overall tests (permutation tests and F approximations), and, using multiple testing algorithms which control the familywise error rate, significant subsets of response variables and factor levels are identified. The package may be used for low- or high-dimensional data with small or large sample sizes and many or few factor levels.

    High-Dimensional Repeated Measures

    Recently, new tests for main and simple treatment effects, time effects, and treatment-by-time interactions in possibly high-dimensional multigroup repeated-measures designs with unequal covariance matrices have been proposed. Technical details for using more than one between-subject and more than one within-subject factor are presented in this article. Furthermore, application to electroencephalography (EEG) data from a neurological study with two whole-plot factors (diagnosis and sex) and two subplot factors (variable and region) is shown with the R package HRM (high-dimensional repeated measures).

    Randomized Trial of Interventions to Improve Childhood Asthma in Homes with Wood-Burning Stoves

    BACKGROUND: Household air pollution due to biomass combustion for residential heating adversely affects vulnerable populations. Randomized controlled trials to improve indoor air quality in homes of children with asthma are limited, and no such studies have been conducted in homes using wood for heating. OBJECTIVES: Our aim was to test the hypothesis that household-level interventions, specifically improved-technology wood-burning appliances or air-filtration devices, would improve health measures, in particular Pediatric Asthma Quality of Life Questionnaire (PAQLQ) scores, relative to placebo, among children living with asthma in homes with wood-burning stoves. METHODS: A three-arm placebo-controlled randomized trial was conducted in homes with wood-burning stoves among children with asthma. Multiple preintervention and postintervention data included PAQLQ (primary outcome), peak expiratory flow (PEF) monitoring, diurnal peak flow variability (dPFV, an indicator of airway hyperreactivity) and indoor particulate matter (PM2.5). RESULTS: Relative to placebo, neither the air filter nor the woodstove intervention showed improvement in quality-of-life measures. Among the secondary outcomes, dPFV showed a 4.1 percentage point decrease in variability [95% confidence interval (CI) = −7.8 to −0.4] for air-filtration use in comparison with placebo. The air-filter intervention showed a 67% (95% CI: 50% to 77%) reduction in indoor PM2.5, but no change was observed with the improved-technology woodstove intervention. CONCLUSIONS: Among children with asthma and chronic exposure to woodsmoke, an air-filter intervention that improved indoor air quality did not affect quality-of-life measures. Intent-to-treat analysis did show an improvement in the secondary measure of dPFV.