
    Optimal Stratification of Univariate Populations via the stratifyR Package

    Stratification reduces the variance of sample estimates of population parameters by creating homogeneous strata. Surveyors often stratify the population using the most convenient variables, such as age, sex, or region. Such convenient methods often fail to produce internally homogeneous strata, so the precision of the estimates of the variables of interest can be improved further. This paper introduces an R package called 'stratifyR', which implements a method for the optimal stratification of survey populations for a univariate study variable that follows a particular distribution estimated from a data set available to the surveyor. The stratification problem is formulated as a mathematical programming problem and solved using a dynamic programming technique. Methods for several distributions, namely the uniform, Weibull, gamma, normal, lognormal, exponential, right-triangular, Cauchy and Pareto, are presented. The package constructs optimal stratification boundaries (OSB) and calculates optimal sample sizes (OSS) under Neyman allocation. Several examples, using simulated data, illustrate the stratified designs that can be constructed with the proposed methodology. The results reveal that the proposed method computes OSB that are precise and comparable to those of established methods. All the calculations presented in this paper were carried out using the stratifyR package, which will be made available on CRAN.
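
    For orientation, the objective that such OSB methods minimize can be stated with the standard stratified-sampling formulas below (textbook notation, not reproduced from the paper): with stratum weights W_h = N_h/N and stratum standard deviations S_h,

        V(\bar{y}_{st}) = \sum_{h=1}^{L} \frac{W_h^{2} S_h^{2}}{n_h}, \qquad
        n_h = n\,\frac{W_h S_h}{\sum_{k=1}^{L} W_k S_k}, \qquad
        V_{\min}(\bar{y}_{st}) = \frac{1}{n}\Bigl(\sum_{h=1}^{L} W_h S_h\Bigr)^{2},

    so choosing boundaries that minimize \sum_h W_h S_h (ignoring the finite-population correction) is the core of the OSB problem.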

    On optimum stratification

    In this manuscript, we discuss the problem of determining the optimum stratification of a study (or main) variable based on an auxiliary variable that follows a uniform distribution. If the stratification of the survey variable is made using the auxiliary variable, it may lead to substantial gains in the precision of the estimates. The problem is formulated as a Nonlinear Programming Problem (NLPP), which turns out to be a multistage decision problem and is solved using a dynamic programming technique.
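
    As a rough illustration of the multistage (dynamic-programming) formulation, the sketch below discretizes the range of a uniform auxiliary variable and finds the cut points that minimize the Neyman objective \sum_h W_h S_h. It is a minimal sketch under simplifying assumptions, not the authors' algorithm; all function names are hypothetical.

        # Minimal DP sketch: optimal cut points for a uniform auxiliary variable.
        import math

        def stratum_cost(lo, hi, a, b):
            """W_h * S_h for a Uniform(a, b) variable restricted to [lo, hi]."""
            width = hi - lo
            w_h = width / (b - a)              # stratum weight
            s_h = width / math.sqrt(12.0)      # std. dev. of a uniform on [lo, hi]
            return w_h * s_h

        def optimal_boundaries(a, b, n_strata, grid_size=200):
            """Dynamic programme over a grid of candidate cut points."""
            grid = [a + (b - a) * i / grid_size for i in range(grid_size + 1)]
            INF = float("inf")
            # best[h][j]: minimal cost of splitting grid[0..j] into h strata
            best = [[INF] * (grid_size + 1) for _ in range(n_strata + 1)]
            back = [[0] * (grid_size + 1) for _ in range(n_strata + 1)]
            best[0][0] = 0.0
            for h in range(1, n_strata + 1):
                for j in range(h, grid_size + 1):
                    for i in range(h - 1, j):
                        cost = best[h - 1][i] + stratum_cost(grid[i], grid[j], a, b)
                        if cost < best[h][j]:
                            best[h][j], back[h][j] = cost, i
            cuts, j = [grid[grid_size]], grid_size      # recover the boundaries
            for h in range(n_strata, 0, -1):
                j = back[h][j]
                cuts.append(grid[j])
            return sorted(cuts), best[n_strata][grid_size]

        if __name__ == "__main__":
            bounds, obj = optimal_boundaries(0.0, 10.0, n_strata=4)
            print(bounds, obj)

    For a uniform auxiliary variable the minimizer is equal-width strata, which the recursion recovers; other distributions only require replacing stratum_cost with the corresponding stratum weight and standard deviation.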

    Heuristic Algorithm for Univariate Stratification Problem

    In sampling theory, stratification is a survey technique that segments a population into homogeneous subpopulations (strata) in order to produce statistics with a higher level of precision. This article proposes a heuristic for the univariate stratification problem, which is widely studied in the literature. In the version addressed here, the number of strata and the precision level are fixed, and the goal is to determine the boundaries defining the strata so that the sample size allocated to them is minimized. A heuristic based on a stochastic optimization method and an exact optimization method was developed to achieve this goal. Its performance was evaluated through computational experiments on various populations used in other works in the literature, across 20 scenarios combining different numbers of strata and levels of precision. The results show that the heuristic outperformed four algorithms from the literature in more than 94% of the cases, in particular the well-known algorithms of Kozak and Lavallée-Hidiroglou.
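
    To make the objective concrete, the sketch below uses plain random search over boundaries: for fixed cut points, the minimum Neyman-allocation sample size achieving a target coefficient of variation of the mean follows from the standard formula n = (\sum_h W_h S_h)^2 / (V + \sum_h W_h S_h^2 / N). This is only an illustrative stand-in for the paper's stochastic-plus-exact heuristic; names and parameters are hypothetical.

        # Naive random search over stratum boundaries, minimizing the Neyman
        # sample size needed to reach a target CV of the estimated mean.
        import random

        def neyman_sample_size(x, cuts, target_cv):
            """Minimum n so that CV(stratified mean) <= target_cv under Neyman allocation."""
            n_pop = len(x)
            target_var = (target_cv * (sum(x) / n_pop)) ** 2
            strata = [[v for v in x if lo <= v < hi] for lo, hi in zip(cuts[:-1], cuts[1:])]
            strata[-1].extend(v for v in x if v == cuts[-1])   # close the last stratum
            if any(len(s) < 2 for s in strata):
                return float("inf")
            sum_ws = sum_wv = 0.0
            for s in strata:
                w = len(s) / n_pop
                m = sum(s) / len(s)
                var = sum((v - m) ** 2 for v in s) / (len(s) - 1)
                sum_ws += w * var ** 0.5      # W_h * S_h
                sum_wv += w * var             # W_h * S_h^2
            return sum_ws ** 2 / (target_var + sum_wv / n_pop)

        def random_search(x, n_strata, target_cv, iters=5000, seed=0):
            rng = random.Random(seed)
            lo, hi = min(x), max(x)
            best_cuts, best_n = None, float("inf")
            for _ in range(iters):
                cuts = [lo] + sorted(rng.uniform(lo, hi) for _ in range(n_strata - 1)) + [hi]
                n = neyman_sample_size(x, cuts, target_cv)
                if n < best_n:
                    best_cuts, best_n = cuts, n
            return best_cuts, best_n

        if __name__ == "__main__":
            rng = random.Random(1)
            pop = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]   # skewed test population
            cuts, n = random_search(pop, n_strata=4, target_cv=0.03)
            print([round(c, 3) for c in cuts], round(n, 1))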

    Stratification of skewed populations

    In this research an algorithm is derived for stratifying skewed populations that is much simpler to implement than any of those currently available. It is based on the suggestion, made by numerous researchers in the field, that when stratifying skewed populations it is desirable to arrange for equal coefficients of variation in each subinterval. Our new algorithm places the breaks in geometric progression and achieves near-equal stratum coefficients of variation when the populations are skewed. Simulation studies on real skewed populations have shown that the new method compares favourably with those commonly used in terms of the precision of the estimator of the mean. We also apply the geometric method to the Lavallée-Hidiroglou (1988) algorithm, an iterative method designed specifically for skewed populations. We show that taking geometric boundaries as the starting points results, in most cases, in quicker convergence of the algorithm and achieves smaller sample sizes than the default starting points for the same precision. Finally, geometric stratification is applied to the Pareto distribution, a typical model of skewed data. We show that if any finite range of this distribution is broken into a given number of strata, with boundaries obtained using geometric progression, then the stratum coefficients of variation are equal.
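
    The geometric rule itself is short: with population minimum a, maximum b and L strata, the boundaries are x_h = a*r^h with r = (b/a)^{1/L}. The sketch below computes these boundaries and checks the stratum coefficients of variation on a simulated heavy-tailed population; it is an illustrative sketch, not code from the paper.

        # Geometric stratification boundaries and a quick check of stratum CVs.
        import random

        def geometric_boundaries(a, b, n_strata):
            r = (b / a) ** (1.0 / n_strata)                 # common ratio
            return [a * r ** h for h in range(n_strata + 1)]

        def stratum_cvs(x, cuts):
            """Coefficient of variation within each stratum of population x."""
            cvs = []
            for lo, hi in zip(cuts[:-1], cuts[1:]):
                s = [v for v in x if lo <= v <= hi]
                m = sum(s) / len(s)
                sd = (sum((v - m) ** 2 for v in s) / (len(s) - 1)) ** 0.5
                cvs.append(sd / m)
            return cvs

        if __name__ == "__main__":
            rng = random.Random(0)
            # heavy-tailed (Pareto-like) population, truncated to a finite range
            pop = [v for v in (rng.paretovariate(1.5) for _ in range(50000)) if v <= 50.0]
            cuts = geometric_boundaries(min(pop), max(pop), n_strata=5)
            print([round(c, 2) for c in cuts])
            print([round(cv, 3) for cv in stratum_cvs(pop, cuts)])  # roughly equal CVs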

    A comparison of alternative recreation impact survey designs for northeast Iowa


    Optimal stratification in stratified designs using Weibull-distributed auxiliary information

    Sampling has evolved into a universally accepted approach for gathering information and data mining, as it is widely accepted that a reasonably modest-sized sample can sufficiently characterize a much larger population. In stratified sampling designs, the whole population is divided into homogeneous strata in order to achieve higher precision in estimation. This paper proposes an efficient method of constructing optimum stratum boundaries (OSB) and determining the optimum sample size (OSS) for the survey variable. Because the variable of interest is unavailable prior to conducting the survey, the method is based on an auxiliary variable that is usually readily available from past surveys. To illustrate the application with real data, the auxiliary variable considered for this problem follows a Weibull distribution. The stratification problem is formulated as a Mathematical Programming Problem (MPP) that minimizes the variance of the estimated population parameter under Neyman allocation. The solution procedure employs the dynamic programming technique, which results in substantial gains in the precision of the estimates of the population characteristics.
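
    As a small illustration of the quantity being minimized (not the paper's implementation; the boundaries and parameters below are arbitrary), the Neyman objective \sum_h W_h S_h can be evaluated for Weibull-distributed auxiliary data by integrating over each stratum; a dynamic-programming routine would then search over such boundaries.

        # Evaluate the Neyman objective sum_h W_h * S_h for candidate boundaries
        # when the auxiliary variable follows a Weibull distribution.
        import numpy as np
        from scipy import stats, integrate

        def weibull_strata_objective(cuts, shape, scale):
            dist = stats.weibull_min(c=shape, scale=scale)
            total = 0.0
            for lo, hi in zip(cuts[:-1], cuts[1:]):
                w_h = dist.cdf(hi) - dist.cdf(lo)                            # stratum weight
                mu_h = integrate.quad(lambda x: x * dist.pdf(x), lo, hi)[0] / w_h
                ex2_h = integrate.quad(lambda x: x * x * dist.pdf(x), lo, hi)[0] / w_h
                total += w_h * np.sqrt(ex2_h - mu_h ** 2)                    # add W_h * S_h
            return total

        if __name__ == "__main__":
            cuts = [0.0, 0.8, 1.6, 2.6, 6.0]      # arbitrary candidate boundaries
            print(weibull_strata_objective(cuts, shape=1.5, scale=2.0))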

    Report of the Workshop on Survey Design and Data Analysis (WKSAD) [21-25 June 2004, Aberdeen, UK]

    Contributors: Knut Korsbrekke, Michael Pennington

    The interpretation and characterisation of lineaments identified from Landsat TM imagery of SW England

    Two Landsat TM scenes of SW England and a sub-scene of North Cornwall have been analysed visually in order to examine the effect of resolution on lineament interpretation. Images were viewed at several different scales by varying the image resolution whilst maintaining a fixed screen pixel size. Lineament analysis at each scale utilised GIS techniques and involved several stages: initial lineament identification and digitisation; removal of lineaments related to anthropogenic features to produce cleansed lineament maps; compilation of lineament attributes using ARC/INFO; cluster analysis for identification of lineament directional families; and line sampling of lineament maps in order to determine spacing. SW England lies within the temperate zone of Europe, and the extensive agricultural cover and infrastructure conceal the underlying geology. The consequences of this for lineament analysis were examined using sub-images of North Cornwall. Here anthropogenic features are visible at all resolutions between 30 m and 120 m pixel sizes but lie outside the observation threshold at 150 m. Having confidence that lineaments at this resolution are of non-anthropogenic origin optimises lineament identification, since the image may be viewed in greater detail. On this basis, lineament analysis of SW England was performed using an image resolution of 150 m. Valuable geological information below the observation threshold in 150 m resolution images is, however, likely to be contained in the lineament maps produced from higher-resolution images. For images analysed at higher resolutions, therefore, knowledge-based rules were established in order to cleanse the lineament populations. Compiled lineament maps were 'ground truthed' (primarily by comparison with published geological maps, but including phases of field mapping) in order to characterise their geological affinities. The major lineament trends were correlated to lithotectonic boundaries and cross-cutting fracture sets. Major lineament trends produced distinct frequency/orientation maxima; multiple minor geological structures, however, produced semi-overlapping groups. A clustering technique was devised to resolve overlapping groups into lineament directional families. The newly defined lineament directional families were further analysed in two ways: (i) analysis of the spatial density of the length and frequency of lineaments indicates that individual and multiple lineament directional families vary spatially and are compartmentalised into local tectonic domains, often bounded by major lineaments; hence such density maps provide useful additional information about the structural framework of SW England. (ii) Lineament spacing and length of the lineament directional families were analysed for the effect of scale and of geological causes on their frequency/size distributions. The spacing of fracture lineaments was found to follow power-law distributions, whereas lengths showed both power-law and non-power-law distributions. Furthermore, the type of frequency/size distribution for a lineament directional family can change with increasing resolution.

    Sampling in the evaluation of ore deposits

    Sampling is an error-generating process, and these errors should be reduced to a minimum if an accurate ore reserve estimation is to be made from the sample values. Error in sampling can arise from the sampling procedure as well as from where and how each sample is taken from the deposit. The sampling procedure involves sample collection, sample reduction and analysis, and the error from each of these three stages has an equal influence on the total error of the process. Error due to the sampling procedure should be identified and eliminated at an early stage in the evaluation programme. An ore deposit should be subdivided into sampling strata along geological boundaries, and once these boundaries have been established they should be adhered to for the evaluation programme. The sampling of each stratum depends on the small-scale structures in which the grade is distributed, and this distribution in relation to sample size controls sample variance, sample bias and the volume of influence of each sample. Cluster sampling can be used where an impractically large sample would be necessary to reduce sample variance or increase the volume of influence of samples. Sample bias can be reduced by compositing a large number of small samples. Sampling patterns should be designed with reference to the volumes of influence of samples, and in favourable geology, geostatistical or statistical techniques can be used to predict the precision of an ore reserve estimation in terms of the number of samples taken. Different ore deposits have different sampling characteristics and problems which can be directly related to the geology of the mineralization. If geology is disregarded when sampling an ore deposit, an evaluation programme cannot claim to give an accurate estimate of the ore reserves.
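
    As a standard illustration of the final point about predicting precision from the number of samples (a textbook relationship, not a result specific to this thesis), the standard error of a mean grade estimated from n samples within a stratum, and the sample count needed for a tolerable error E, are

        \mathrm{SE}(\bar{g}) = \frac{s}{\sqrt{n}}, \qquad
        n \approx \left(\frac{z_{\alpha/2}\, s}{E}\right)^{2},

    where s is the within-stratum standard deviation of sample grades and z_{\alpha/2} is the normal critical value for the desired confidence level.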