
    Exponentiality Test Using a Modified Lilliefors Test


    Power Comparison of Some Goodness-of-fit Tests

    There are several commonly used goodness-of-fit tests, such as the Kolmogorov-Smirnov test, the Cramér-von Mises test, and the Anderson-Darling test. In addition, a new goodness-of-fit test named the G test was proposed by Chen and Ye (2009). The purpose of this thesis is to compare the performance of these goodness-of-fit tests by comparing their power. A goodness-of-fit test is usually used to judge whether or not the underlying population distribution differs from a specific distribution. This research focuses on testing whether the underlying population distribution is an exponential distribution. SAS/IML is used to conduct the statistical simulation. Alternative distributions such as the triangle distribution and the V-shaped triangle distribution are used. By applying Monte Carlo simulation, it can be concluded that the Kolmogorov-Smirnov test outperforms the G test in many cases, while the G test performs well in some cases.
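    The Monte Carlo power study described above can be sketched as follows. This is a minimal illustration, not the thesis's SAS/IML code: the G test of Chen and Ye (2009) is not implemented, and the triangular alternative, sample size, significance level, and replication count are illustrative choices.

    ```python
    # Hedged sketch: Monte Carlo power estimate for the one-sample
    # Kolmogorov-Smirnov test of exponentiality. All settings here
    # (n, alpha, reps, the triangular alternative) are illustrative.
    import numpy as np
    from scipy import stats

    def ks_power(alt_sampler, n=30, alpha=0.05, reps=2000, seed=0):
        """Fraction of KS rejections of H0: Exp(1) when data come from alt_sampler."""
        rng = np.random.default_rng(seed)
        rejections = 0
        for _ in range(reps):
            x = alt_sampler(rng, n)
            # One-sample KS test against the fully specified Exp(1) CDF.
            _, p = stats.kstest(x, stats.expon.cdf)
            rejections += (p < alpha)
        return rejections / reps

    # Symmetric triangular alternative on [0, 2] (mean 1, like Exp(1)).
    tri = lambda rng, n: rng.triangular(0.0, 1.0, 2.0, size=n)

    power = ks_power(tri)  # estimated power against the triangular alternative
    ```

    Running the same function with an Exp(1) sampler recovers the test's size, which should sit near the nominal 5% level.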

    Contributions to the problem of goodness-of-fit

    This dissertation consists of two parts. The first part pertains to residual goodness-of-fit analysis; the second pertains to the possibility of bringing an asymptotic multi-decision point of view to goodness-of-fit analysis. Goodness-of-fit testing based on the correlated ordinary least squares (OLS) residuals of the standard linear regression model is investigated in the first part. A modification of the U-statistic is given and appropriate quantiles are discussed. Three new statistics for testing normality are introduced. Goodness-of-fit is also studied through orthogonally transformed residuals: test size and power are studied for a new vector of transformed residuals (r*), as well as for Theil's BLUS residuals, and comparisons are made with the results obtained for the OLS residuals. In the second part, the sense in which bivariate large deviations are pertinent to the three-decision view of goodness-of-fit is explored. Various approaches are discussed for computing or approximating the large-deviation rates for the classification errors involved.
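    The setting of the first part, testing normality on the correlated OLS residuals of a linear regression, can be sketched as follows. This is only an assumed illustration: the dissertation's three new statistics, the modified U-statistic, and Theil's BLUS residuals are not implemented here, and the ordinary Shapiro-Wilk test stands in as a generic normality check.

    ```python
    # Hedged sketch: normality testing on OLS residuals from a standard
    # linear regression. The dissertation's own statistics and the BLUS
    # residuals are NOT implemented; Shapiro-Wilk is an illustrative
    # stand-in, and the design matrix and parameters are invented.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n = 100
    X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])  # design matrix
    beta = np.array([2.0, 0.5])
    y = X @ beta + rng.normal(0, 1.0, n)       # normal errors, so H0 is true

    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat                   # correlated OLS residuals

    stat, p = stats.shapiro(resid)             # normality test on the residuals
    ```

    Note that OLS residuals are correlated by construction even when the errors are i.i.d., which is precisely why residual-specific statistics and transformed residuals are studied.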

    Goodness-of-fit statistics for location-scale distributions

    This dissertation is concerned with the problem of assessing the fit of a hypothesized parametric family of distributions to data. A nontraditional use of the chi-square and likelihood ratio statistics is considered in which the number of cells is allowed to increase as the sample size increases. A new goodness-of-fit statistic k′², based on the Pearson correlation coefficient of the points of a P-P (percent versus percent) probability plot, is developed for testing departures from the normal, Gumbel, and exponential distributions. A statistic r′², based on the Pearson correlation coefficient of the points of a Q-Q (quantile versus quantile) probability plot, is also considered. A new qualitative method based on the P-P probability plot is developed for assessing the goodness of fit of nonhypothesized probability models to data; this method is not limited to location-scale distributions. Curves were fitted through the Monte Carlo percentiles to obtain formulas for the percentiles of k′² and r′². An extensive Monte Carlo power comparison was performed for the normal, Gumbel, and exponential distributions. The statistics examined included those mentioned earlier, statistics based on the moments, statistics based on the empirical distribution function, and the commonly used Shapiro-Wilk statistic. The results of the power study are summarized, and general recommendations are given for the use of these statistics.
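    A P-P plot correlation statistic of the kind described above can be sketched as the squared Pearson correlation between empirical and theoretical cumulative probabilities at the order statistics. This is an assumed reading of the statistic; the dissertation's exact definition, plotting positions, and null percentiles may differ.

    ```python
    # Hedged sketch of a P-P plot correlation statistic in the spirit
    # of k'^2: squared correlation of the points on a percent-versus-
    # percent probability plot. Plotting positions and the moment-based
    # location-scale fit are illustrative assumptions.
    import numpy as np
    from scipy import stats

    def pp_corr_sq(x, cdf):
        """Squared Pearson correlation of points on a P-P probability plot."""
        x = np.sort(x)
        n = len(x)
        emp = (np.arange(1, n + 1) - 0.5) / n   # empirical plotting positions
        theo = cdf(x)                           # fitted theoretical CDF values
        r = np.corrcoef(emp, theo)[0, 1]
        return r * r

    rng = np.random.default_rng(0)
    x = rng.normal(5.0, 2.0, 200)
    # Location-scale fit by sample moments, then P-P correlation vs. normal.
    fitted_cdf = stats.norm(loc=x.mean(), scale=x.std(ddof=1)).cdf
    k2 = pp_corr_sq(x, fitted_cdf)   # close to 1 when the normal model fits
    ```

    Values of the statistic near 1 indicate a good fit; departures from the hypothesized family pull the P-P points off the diagonal and reduce the correlation.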

    A power comparison of various normality tests

    The assumption of normality is very important because it underlies many statistical procedures, such as analysis of variance, linear regression analysis, discriminant analysis, and t-tests. Three common approaches are used for assessing the assumption of normality: graphical methods, numerical methods, and formal normality tests. A significant number of normality tests are available in the literature. In this paper, eight tests of normality are discussed: the Shapiro-Wilk, Shapiro-Francia, Kolmogorov-Smirnov, Anderson-Darling, Cramér-von Mises, Jarque-Bera, Geary, and Lilliefors tests. The power of each test is estimated by Monte Carlo computation on sample data generated from different alternative distributions, using a 5% level of significance. The results show that the power of each test is affected by the sample size and the alternative distribution. The Shapiro-Francia and Kolmogorov-Smirnov tests perform well for the Cauchy and exponential distributions, respectively. For the t-distribution, the Geary, Shapiro-Francia, and Jarque-Bera tests perform well for 5, 10, and 15 degrees of freedom, respectively.
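    The power comparison described above can be sketched for two of the eight tests that have readily available implementations. This is a minimal illustration under assumed settings, not the paper's study: the sample size, replication count, and exponential alternative are invented for the example.

    ```python
    # Hedged sketch: Monte Carlo power of Shapiro-Wilk and Jarque-Bera
    # normality tests against an exponential alternative at the 5% level.
    # Settings (n=50, reps=1000) are illustrative, not the paper's.
    import numpy as np
    from scipy import stats

    def power(test, sampler, n=50, alpha=0.05, reps=1000, seed=42):
        """Rejection rate of `test` (returns a p-value) on `sampler` data."""
        rng = np.random.default_rng(seed)
        hits = sum(test(sampler(rng, n)) < alpha for _ in range(reps))
        return hits / reps

    sw = lambda x: stats.shapiro(x).pvalue       # Shapiro-Wilk p-value
    jb = lambda x: stats.jarque_bera(x).pvalue   # Jarque-Bera p-value
    expo = lambda rng, n: rng.exponential(1.0, n)

    sw_power = power(sw, expo)   # Shapiro-Wilk vs. exponential data
    jb_power = power(jb, expo)   # Jarque-Bera vs. exponential data
    ```

    Swapping in other alternatives (Cauchy, t with various degrees of freedom) and sample sizes reproduces the kind of power table the paper reports.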

    Modelling sediment storage times in alluvial floodplains

    Soil erosion rates are accelerating worldwide as climate change effects and human population pressures, including agricultural expansion, degrade the land surface. Fluvial systems transfer sediments from uplands to depositional landforms and basins downstream. However, only a fraction of eroded material will ultimately transfer to catchment outlets – a phenomenon termed "the sediment delivery problem". Thus, sediment fluxes to the coast are declining in many river catchments, as a result of storage behind dams and within landforms such as floodplains and alluvial terraces. Storage time allows us to measure the timescale of storage and removal of sediments from floodplains, which, given their spatial extent (8 × 10⁵ to 2 × 10⁶ km² of all land area), are significant in interrupting the transmission of soil erosion fluxes downstream. While sediment storage times in alluvial floodplains have been quantified before, this thesis presents the first attempts to model the impacts of various environmental and experimental conditions on sediment storage behaviour using the CAESAR-Lisflood landscape evolution model. The thesis tests the following hypotheses: i) removal rates from storage decline with increasing floodplain age; ii) the distribution of sediment storage times is sensitive to reach-specific characteristics, vegetation cover types and changes, changing river flows, and measurement frequency; and iii) a non-linear function can be fitted to the distribution and parameterised using readily quantifiable variables. A detailed literature review synthesised our current understanding of sediment storage times, including variables that have been quantified or hypothesised as possible controls. This culminated in a conceptual model of major controls and their interactions which was used to support the development of the experiments tested in this thesis.
A review of quantification techniques, including “black-box”, one-dimensional mass balance modelling approaches, and methods that calculate storage times directly from timings of geomorphic changes, justified adopting a landscape evolution modelling approach. CAESAR-Lisflood was applied to conduct this research, as it can simulate variable channel widths, divergent flow, and both braided and meandering planforms – capturing a wider range of channel-floodplain evolution processes than models previously used to simulate storage times. Ten 1 km-long reaches of river valleys from the north of England were used to calibrate the model, test the transferability of calibrated parameters, and verify the accuracy of simulated historical channel changes against mapped reconstructions. These simulations replicated mapped erosion, deposition and lateral migration rates reasonably well overall. Floodplain turnover times, estimated by extrapolating erosion rates, increased confidence that calibrated parameters were representative over longer timescales and revealed that all sediments stored in the floodplain would undergo exchange with the channel within 1000 years. Using CAESAR-Lisflood, an ensemble of 9 simulations, incorporating 3 of the 10 calibrated reaches and 3 vegetation cover scenarios (forest, grass and unvegetated) – each spanning 1000 years of river channel changes – was run. Together with measuring channel changes over four different frequencies (10, 20, 50 and 100 years), a total of 36 storage time distributions was modelled, with the age and storage times of floodplain sediments calculated from timings of deposition and erosion. This was done to test whether distributions were best fit by either an exponential or a heavy-tailed decay function, with the former indicating constant erosion rates over space and time, while the latter implies that removal rates from storage decay with increasing deposit age. 
    In addition to the uniform vegetation conditions, a further 15 simulations, incorporating changes in vegetation cover or flow magnitude over time, were run to test how storage time dynamics respond to disturbance. This thesis demonstrates that sediment erosion rates decline with increasing floodplain age in most cases, with the strength of this relationship dependent on reach, floodplain erodibility and frequency of recorded measurements. A lognormal function can be fitted to distributions of sediment storage times in most cases, and it is possible to parameterise this function using the median storage time and measurement time-step. Coupling this storage time function with a model of stochastic sediment transport could generate predictions of decontamination times for a valley corridor enriched with polluted sediments (e.g. from mining). However, some environmental disturbances can be great enough to invalidate this storage time model – a challenge that merits further attention before application to practical environmental management contexts.
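    The model-comparison step at the heart of hypothesis iii – fitting exponential versus heavy-tailed (here lognormal) functions to storage times and asking which describes them better – can be sketched as follows. The storage times below are synthetic; no CAESAR-Lisflood output is reproduced, and the comparison by maximised log-likelihood is an illustrative stand-in for the thesis's fitting procedure.

    ```python
    # Hedged sketch: fit exponential and lognormal models to a set of
    # sediment storage times by maximum likelihood and compare fits by
    # log-likelihood. Storage times are synthetic (lognormal by design).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    # Synthetic storage times in years, heavy-tailed as hypothesised.
    storage_times = rng.lognormal(mean=4.0, sigma=1.2, size=500)

    # Fit both candidate distributions with location fixed at zero,
    # since storage times are strictly positive.
    exp_params = stats.expon.fit(storage_times, floc=0.0)
    logn_params = stats.lognorm.fit(storage_times, floc=0.0)

    ll_exp = stats.expon.logpdf(storage_times, *exp_params).sum()
    ll_logn = stats.lognorm.logpdf(storage_times, *logn_params).sum()
    # A higher lognormal log-likelihood indicates that removal rates
    # decay with deposit age, rather than staying constant.
    ```

    An exponential fit corresponds to constant removal rates over space and time, so a clear log-likelihood advantage for the lognormal is evidence for age-dependent removal.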

    KOLMOGOROV-SMIRNOV TYPE TESTS UNDER SPATIAL CORRELATIONS

    The Kolmogorov-Smirnov (KS) test is a non-parametric hypothesis test of whether the univariate random variable of interest is drawn from a pre-specified distribution (one-sample KS) or has the same distribution as a second random variable (two-sample KS). The test is based on the supremum (greatest) distance between an empirical distribution function (EDF) and a pre-specified cumulative distribution function (CDF), or on the largest distance between two EDFs. The KS test has been widely adopted in statistical analysis because its assumptions are more general than those of parametric tests such as the t-test. In addition, the p-value derived from the KS test is robust and distribution-free for a large class of random variables. However, the fundamental assumption of independence is often overlooked and may cause inaccurate inferences. The KS test in its original form assumes that the random variable of interest is independently distributed, which is not true for many natural datasets, especially in more complicated situations such as image analysis and geostatistics, which may involve spatial dependence. I propose a modified KS test with an adjustment for spatial correlation. The dissertation concerns the following three aims. First, I conducted a systematic review of the KS test, the Cramér-von Mises test, the Anderson-Darling test and the chi-square test and evaluated their performance under normal, Weibull and multinomial distributions. In the review, I also studied how these tests perform when random variables are correlated. Second, I proposed a modified KS test that corrects the bias in estimating the CDF/EDF when spatial dependence exists and calculated the informative sample size.
    Finally, I conducted a revisit analysis of coronary flow reserve and the pixel distribution of coronary flow capacity, using the Kolmogorov-Smirnov test with spatial correction, to evaluate the efficacy of dipyridamole and regadenoson.
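    The standard one- and two-sample KS tests that the dissertation modifies can be sketched as follows; the spatial-correlation adjustment itself is not implemented here, and the data are synthetic independent samples.

    ```python
    # Hedged sketch of the classical one- and two-sample Kolmogorov-
    # Smirnov tests (no spatial-correlation adjustment). Data and the
    # 0.5 location shift are invented for illustration.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    x = rng.normal(0.0, 1.0, 200)
    y = rng.normal(0.5, 1.0, 200)      # second sample, shifted by 0.5

    # One-sample KS: supremum distance between the EDF of x and the
    # pre-specified N(0, 1) CDF.
    d1, p1 = stats.kstest(x, stats.norm.cdf)

    # Two-sample KS: largest distance between the two samples' EDFs.
    d2, p2 = stats.ks_2samp(x, y)
    ```

    When the samples are spatially correlated rather than independent, the null distribution of these statistics no longer applies, which is the bias the proposed modification targets.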