8,287 research outputs found

    Rational Arithmetic Mathematica Functions to Evaluate the One-sided One-sample K-S Cumulative Sample Distribution

    One of the most widely used goodness-of-fit tests is the Kolmogorov-Smirnov (K-S) family of tests, which have been implemented by many computer statistical software packages. To calculate a p value (evaluate the cumulative sampling distribution), these packages use various methods including recursion formulae, limiting distributions, and approximations of unknown accuracy developed over thirty years ago. Based on an extensive literature search for the one-sided one-sample K-S test, this paper identifies two direct formulae and five recursion formulae that can be used to calculate a p value and then develops two additional direct formulae and four iterative versions of the direct formulae, for a total of thirteen formulae. To ensure accurate calculation by avoiding catastrophic cancelation and eliminating rounding error, each formula is implemented in rational arithmetic. Linear search is used to calculate the inverse of the cumulative sampling distribution (find the confidence interval bandwidth). Extensive tables of bandwidths are presented for sample sizes up to 2,000. The results confirm the hypothesis that as the number of digits in the numerator and denominator integers of the rational number test statistic increases, the computation time also increases. In comparing the computational times of the thirteen formulae, the direct formulae are slightly faster than their iterative versions and much faster than all the recursion formulae. Computational times for the fastest formula are given for sample sizes up to fifty thousand.
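
    To illustrate what "a direct formula in rational arithmetic" can look like, here is a minimal sketch in Python rather than Mathematica. It uses the classical Birnbaum-Tingey closed form for the one-sided statistic; whether this is one of the thirteen formulae studied in the paper is an assumption, and the function name `ks_one_sided_pvalue` is hypothetical.

```python
from fractions import Fraction
from math import comb, floor


def ks_one_sided_pvalue(n, d):
    """Exact P(D_n^+ >= d) for sample size n, using the Birnbaum-Tingey sum.

    All arithmetic is done on Fraction objects, so there is no rounding
    error and no cancellation loss. Assumes 0 < d <= 1.
    """
    d = Fraction(d)
    if not 0 < d <= 1:
        raise ValueError("d must satisfy 0 < d <= 1")
    upper = floor(n * (1 - d))  # largest j with d + j/n <= 1
    total = Fraction(0)
    for j in range(upper + 1):
        a = d + Fraction(j, n)  # strictly positive because d > 0
        total += comb(n, j) * a ** (j - 1) * (1 - a) ** (n - j)
    return d * total


if __name__ == "__main__":
    # The result is an exact rational number, not a float.
    print(ks_one_sided_pvalue(2, Fraction(1, 4)))
```

    The numerator and denominator of the result grow with both n and the number of digits in d, which is consistent with the timing behaviour the abstract reports.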

    Rational Arithmetic Mathematica Functions to Evaluate the Two-Sided One Sample K-S Cumulative Sampling Distribution

    One of the most widely used goodness-of-fit tests is the two-sided one sample Kolmogorov-Smirnov (K-S) test, which has been implemented by many computer statistical software packages. To calculate a two-sided p value (evaluate the cumulative sampling distribution), these packages use various methods including recursion formulae, limiting distributions, and approximations of unknown accuracy developed over thirty years ago. Based on an extensive literature search for the two-sided one sample K-S test, this paper identifies an exact formula for sample sizes up to 31, six recursion formulae, and one matrix formula that can be used to calculate a p value. To ensure accurate calculation by avoiding catastrophic cancelation and eliminating rounding error, each of these formulae is implemented in rational arithmetic. For the six recursion formulae and the matrix formula, computational experience for sample sizes up to 500 shows that computational times are increasing functions of both the sample size and the number of digits in the numerator and denominator integers of the rational number test statistic. The computational times of the seven formulae vary immensely, but the Durbin recursion formula is almost always the fastest. Linear search is used to calculate the inverse of the cumulative sampling distribution (find the confidence interval half-width), and tables of calculated half-widths are presented for sample sizes up to 500. Using calculated half-widths as input, computational times for the fastest formula, the Durbin recursion formula, are given for sample sizes up to two thousand.
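
    As a rough illustration of a matrix formula evaluated in rational arithmetic, the sketch below follows the matrix construction popularised by Marsaglia, Tsang and Wang (2003), which builds on Durbin's matrix result, but replaces floating point with Python `Fraction` objects. It is an assumption that this corresponds to the matrix formula the paper evaluates; the names `ks_two_sided_cdf`, `mat_mul` and `mat_pow` are hypothetical.

```python
from fractions import Fraction
from math import factorial, floor


def mat_mul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def mat_pow(a, n):
    """Raise a square matrix to the n-th power by repeated squaring."""
    m = len(a)
    result = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    while n > 0:
        if n & 1:
            result = mat_mul(result, a)
        a = mat_mul(a, a)
        n >>= 1
    return result


def ks_two_sided_cdf(n, d):
    """Exact P(D_n < d) for the two-sided one-sample K-S statistic."""
    d = Fraction(d)
    if d <= 0:
        return Fraction(0)
    if d >= 1:
        return Fraction(1)
    k = floor(n * d) + 1
    m = 2 * k - 1
    h = k - n * d  # 0 < h <= 1
    # Start from the 0/1 band pattern, then correct first column and last row.
    H = [[Fraction(1) if i - j + 1 >= 0 else Fraction(0) for j in range(m)]
         for i in range(m)]
    for i in range(m):
        H[i][0] -= h ** (i + 1)
        H[m - 1][i] -= h ** (m - i)
    if 2 * h - 1 > 0:
        H[m - 1][0] += (2 * h - 1) ** m
    # Divide each entry on or below the superdiagonal by (i - j + 1)!.
    for i in range(m):
        for j in range(m):
            if i - j + 1 > 0:
                H[i][j] /= factorial(i - j + 1)
    Q = mat_pow(H, n)
    return Fraction(factorial(n), n ** n) * Q[k - 1][k - 1]


if __name__ == "__main__":
    # Two-sided p value for n = 3 and d = 1/3, as an exact rational.
    print(1 - ks_two_sided_cdf(3, Fraction(1, 3)))
```

    The matrix has order 2k - 1 with k = floor(n d) + 1, so the cost rises with both the sample size and the size of the test statistic, in line with the timing trends described in the abstract.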

    Arbitrary Precision Mathematica Functions to Evaluate the One-Sided One Sample K-S Cumulative Sampling Distribution

    Efficient rational arithmetic methods that can exactly evaluate the cumulative sampling distribution of the one-sided one sample Kolmogorov-Smirnov (K-S) test have been developed by Brown and Harvey (2007) for sample sizes n up to fifty thousand. This paper implements in arbitrary precision the same 13 formulae to evaluate the one-sided one sample K-S cumulative sampling distribution. Computational experience identifies the fastest implementation, which is then used to calculate confidence interval bandwidths and p values for sample sizes up to ten million.

    S98RS SGR No. 14 (Finals)

    A RESOLUTION To recommend to Executive Vice-Chancellor and Provost Dan Fogel, Robert Doolos of the University Registrars Office, and the University Calendar Committee to begin finals the Saturday after Dead Week and end the following Friday

    Detecting outliers when fitting data with nonlinear regression – a new method based on robust nonlinear regression and the false discovery rate

    BACKGROUND: Nonlinear regression, like linear regression, assumes that the scatter of data around the ideal curve follows a Gaussian or normal distribution. This assumption leads to the familiar goal of regression: to minimize the sum of the squares of the vertical or Y-value distances between the points and the curve. Outliers can dominate the sum-of-the-squares calculation, and lead to misleading results. However, we know of no practical method for routinely identifying outliers when fitting curves with nonlinear regression. RESULTS: We describe a new method for identifying outliers when fitting data with nonlinear regression. We first fit the data using a robust form of nonlinear regression, based on the assumption that scatter follows a Lorentzian distribution. We devised a new adaptive method that gradually becomes more robust as the method proceeds. To define outliers, we adapted the false discovery rate approach to handling multiple comparisons. We then remove the outliers, and analyze the data using ordinary least-squares regression. Because the method combines robust regression and outlier removal, we call it the ROUT method. When analyzing simulated data, where all scatter is Gaussian, our method detects (falsely) one or more outliers in only about 1–3% of experiments. When analyzing data contaminated with one or several outliers, the ROUT method performs well at outlier identification, with an average False Discovery Rate less than 1%. CONCLUSION: Our method, which combines a new method of robust nonlinear regression with a new method of outlier identification, identifies outliers from nonlinear curve fits with reasonable power and few false positives.
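
    The abstract does not spell out how the false discovery rate approach was adapted. For orientation only, the sketch below shows the standard Benjamini-Hochberg step-up procedure applied to a list of per-point p values; it is not the ROUT method itself, and the function name `benjamini_hochberg`, the threshold `q` and the sample p values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.01):
    """Flag entries using the standard Benjamini-Hochberg step-up rule.

    Returns a list of booleans, True where the corresponding p value is
    declared a discovery (here: a candidate outlier) at FDR level q.
    """
    m = len(p_values)
    # Indices of the p values, smallest p value first.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p value is under its threshold.
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    flagged = [False] * m
    for idx in order[:cutoff_rank]:
        flagged[idx] = True
    return flagged


if __name__ == "__main__":
    # Hypothetical p values, one per data point.
    print(benjamini_hochberg([0.0001, 0.5, 0.004, 0.9], q=0.01))
```

    In a ROUT-style workflow the flagged points would be removed before the final ordinary least-squares fit, but how the residuals from the robust fit are converted to p values is left unspecified here.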

    Star formation in the inner galaxy: A far-infrared and radio study of two HII regions

    Far-infrared and radio continuum maps have been made of the central 6' of the inner-galaxy HII regions G30.8-0.0 (in the W43 complex) and G25.4-0.2, along with radio and molecular line measurements at selected positions. The purpose of this study is to understand star formation in the molecular ring at 5 kpc in galactic radius. Measurements at several far-infrared wavelengths allow the dust temperature structures and total far-infrared fluxes to be determined. Comparison of the radio and infrared maps shows a close relationship between the ionized gas and the infrared-emitting material. There is evidence that parts of G30.8 are substantially affected by extinction, even at far-infrared wavelengths. Using radio recombination line and CO line data for G25.4-0.2, the distance ambiguity for this source is resolved. The large distance previously ascribed to the entire complex is found to apply to only one of the two main components. The confusion in distance determination is found to result from an extraordinary near-superposition of two bright HII regions. Using the revised distances of 4.3 kpc for G25.4SE and 12 kpc for G25.4NW, it is found that the latter, which is apparently the fainter of the two sources, is actually the more luminous. The ratio of total luminosity to ionizing luminosity is very similar to that of HII regions in the solar circle. Assuming a coeval population of ionizing stars, a normal initial mass function is indicated.