1,637 research outputs found

    A Grouping Genetic Algorithm for Joint Stratification and Sample Allocation Designs

    Predicting the cheapest sample size for the optimal stratification in multivariate survey design is a problem in cases where the population frame is large. A solution exists that iteratively searches for the minimum sample size necessary to meet accuracy constraints in partitions of atomic strata, created by the Cartesian product of auxiliary variables, into larger strata. The optimal stratification can be found by testing all possible partitions. However, the number of possible partitions grows exponentially with the number of initial strata. There are alternative ways of modelling this problem; one of the most natural is to use Genetic Algorithms (GA). These evolutionary algorithms use recombination, mutation and selection to search for optimal solutions. They often converge on an optimal or near-optimal solution more quickly than exact methods. We propose a new GA approach to this problem using grouping genetic operators instead of traditional operators. The results show a significant improvement in solution quality for similar computational effort, corresponding to large monetary savings. Comment: 22 page
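
To give a feel for why testing every partition quickly becomes infeasible, the sketch below counts the ways to group n atomic strata into non-empty, unordered groups. It assumes that any grouping is admissible, in which case the count is the Bell number B(n); the function name `bell_numbers` is illustrative and does not come from the paper.

```python
def bell_numbers(n_max):
    """Return [B(0), ..., B(n_max)], computed with the Bell triangle.

    B(n) is the number of ways to partition n labelled items
    (here: atomic strata) into non-empty, unordered groups.
    """
    bells = [1]   # B(0) = 1
    row = [1]     # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        # Each further entry adds the entry above-left to its left neighbour.
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


if __name__ == "__main__":
    for n, b in enumerate(bell_numbers(20)):
        print(n, b)
```

Even a few dozen atomic strata give far more candidate groupings than an exhaustive search could evaluate, which is the motivation for heuristic search in this setting.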

    Optimal Stratification and Allocation for the June Agricultural Survey

    A computational approach to optimal multivariate designs with respect to stratification and allocation is investigated under the assumptions of fixed total allocation, known number of strata, and the availability of administrative data correlated with the variables of interest under coefficient-of-variation constraints. This approach uses a penalized objective function that is optimized by simulated annealing through exchanging sampling units and sample allocations among strata. Computational speed is improved through the use of a computationally efficient machine learning method such as K-means to create an initial stratification close to the optimal stratification. The numeric stability of the algorithm has been investigated and parallel processing has been employed where appropriate. Results are presented for both simulated data and USDA’s June Agricultural Survey. An R package has also been made available for evaluation.

    Stratification of skewed populations

    In this research an algorithm is derived for stratifying skewed populations which is much simpler to implement than any of those currently available. It is based on the suggestion by numerous researchers in the field that it is desirable when stratifying skewed populations to arrange for equal coefficients of variation in each subinterval. Our new algorithm makes the breaks in geometric progression and achieves near-equal stratum coefficients of variation when the populations are skewed. Simulation studies on real skewed populations have shown that the new method compares favourably to those commonly used in terms of precision of the estimator of the mean. We also apply the geometric method to the Lavallée-Hidiroglou (1988) algorithm, an iterative method designed specifically for skewed populations. We show that taking geometric boundaries as the starting points results, in most cases, in quicker convergence of the algorithm and in smaller sample sizes than the default starting points for the same precision. Finally, geometric stratification is applied to the Pareto distribution, a typical model of skewed data. We show that if any finite range of this distribution is broken into a given number of strata, with boundaries obtained using geometric progression, then the stratum coefficients of variation are equal.
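
As a rough illustration of "breaks in geometric progression", the sketch below places stratum boundaries so that consecutive boundaries share a common ratio between the smallest and largest values of the stratification variable. The function name `geometric_boundaries` and its parameters are assumptions made for this example, and the variable is assumed to be strictly positive; the abstract does not give the exact formula the authors use.

```python
def geometric_boundaries(x_min, x_max, n_strata):
    """Return n_strata + 1 boundaries in geometric progression.

    Boundary h is x_min * r**h, with common ratio
    r = (x_max / x_min) ** (1 / n_strata), so the first boundary is
    x_min and the last is x_max (up to floating-point rounding).
    """
    if n_strata < 1:
        raise ValueError("n_strata must be at least 1")
    if x_min <= 0 or x_max <= x_min:
        raise ValueError("need 0 < x_min < x_max")
    ratio = (x_max / x_min) ** (1.0 / n_strata)
    return [x_min * ratio ** h for h in range(n_strata + 1)]


if __name__ == "__main__":
    # Hypothetical range of a skewed size variable, split into 4 strata.
    print(geometric_boundaries(1.0, 16.0, 4))
```

Because each boundary is a fixed multiple of the previous one, the strata get wider towards the upper tail, which suits a right-skewed variable.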

    Irrigated lands assessment for water management: Technique test

    A procedure for estimating irrigated land using full frame LANDSAT imagery was demonstrated. Relatively inexpensive interpretation of multidate LANDSAT photographic enlargements was used to produce a map of irrigated land in California. The LANDSAT and ground maps were then linked by regression equations to enable precise estimation of irrigated land area by county, basin, and statewide. Land irrigated at least once in California in 1979 was estimated to be 9.86 million acres, with an expected error of less than 1.75% at the 99% level of confidence. To achieve the same level of error with a ground-only sample would have required 3 to 5 times as many ground sample units statewide. A procedure for relatively inexpensive computer classification of LANDSAT digital data to irrigated land categories was also developed. This procedure is based on ratios of MSS bands 7 and 5, and gave good results for several counties in the Central Valley.

    The auxiliary use of LANDSAT data in estimating crop acreages: Results of the 1975 Illinois crop-acreage experiment

    The author has identified the following significant results. It was found that classifier performance was influenced by a number of temporal, methodological, and geographical factors. Best results were obtained when corn was tasselled and near the dough stage of development. Dates earlier or later in the growing season produced poor results. Atmospheric effects on results cannot be independently measured or completely separated from the effects due to the maturity stage of the crops. Poor classifier performance was observed in areas where considerable spectral confusion was present.

    Use of transformed LANDSAT data in regression estimation of crop acreages

    This study investigates the use of functions of a vector X as auxiliary variables in the regression estimation of the population mean with survey data. Functions of the vector X are estimated by estimating the unknown parameters of the transformation. Under certain assumptions, the error in the estimated parameters is of order $n^{-1/2}$, where n is the sample size. The effect of estimating the auxiliary variables is investigated under the assumption that the finite population is a random sample of an infinite population. Asymptotic properties of regression estimators of the finite population mean constructed with estimated auxiliary variables are developed. The U.S. Department of Agriculture (USDA) uses satellite (LANDSAT) data to improve crop acreage estimates. LANDSAT data consist of a vector X of four radiation values in four wavelength bands of the electromagnetic spectrum. Based on these data, the USDA has developed a classification function for use as an auxiliary variable. In this study, other transformations of LANDSAT data are considered. The estimated posterior probability that a point with a satellite value of X is from crop j is developed as one auxiliary variable. Based on the estimated probability, a classification rule is constructed as another auxiliary variable. Data collected in northern Missouri by USDA are used in the study of alternative auxiliary variables. Two regressions are computed to evaluate the auxiliary variables. The first regression uses the individual pixels as observations, where a pixel is the unit of observation for the satellite data. In the second regression, both the dependent variable and the independent variable are constructed by summing pixel values over all the pixels in a segment, where the segment is the primary sampling unit in the survey. The estimated posterior probability transformation performs considerably better than the classification functions in the pixel regressions, but the posterior probability is only marginally superior to the classification function in the segment regressions. An estimator of the variance of the regression estimator based on an estimated auxiliary variable can be constructed using asymptotic theory. A form of the jackknife estimator of variance is compared with the estimator based on asymptotic theory. For a sample of 45 segments, it is estimated that the estimator based on the asymptotic formula underestimates the variance by 10 to 20 percent.
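
For orientation, here is a minimal sketch of the textbook regression estimator of a population mean with a single auxiliary variable whose population mean is known. It treats the auxiliary values as given rather than estimated, so it does not reproduce the study's estimated-auxiliary-variable setting or its variance estimators; the function name and arguments are illustrative assumptions.

```python
def regression_estimate(y, x, x_pop_mean):
    """Regression estimator of the population mean of y.

    y, x       : sample values of the study and auxiliary variables
    x_pop_mean : known population mean of the auxiliary variable

    Returns y_bar + b * (x_pop_mean - x_bar), where b is the
    least-squares slope of y on x in the sample.
    """
    n = len(y)
    if n != len(x) or n < 2:
        raise ValueError("y and x must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("auxiliary variable has no variation in the sample")
    slope = s_xy / s_xx
    # Adjust the sample mean of y for the gap between the known
    # population mean of x and its sample mean.
    return y_bar + slope * (x_pop_mean - x_bar)


if __name__ == "__main__":
    # Hypothetical segment totals: y = reported acreage, x = classified pixels.
    y = [12.0, 15.0, 21.0, 30.0]
    x = [10.0, 14.0, 20.0, 28.0]
    print(regression_estimate(y, x, x_pop_mean=19.0))
```

The stronger the linear relationship between the auxiliary variable and the study variable, the more this adjustment reduces the variance relative to the plain sample mean.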

    Report of the Workshop on Survey Design and Data Analysis (WKSAD) [21-25 June 2004, Aberdeen, UK]

    Contributors: Knut Korsbrekke, Michael Penningto

    Soft-bottom fishes and spatial protection: findings from a temperate marine protected area

    Numerous studies over the last decades have focused on marine protected areas (MPAs) and their effects on fish communities. However, there is a knowledge gap regarding how species associated with soft substrates (e.g., sand, mud) respond to spatial protection. We analyzed abundance, biomass and total lengths of the soft-bottom fishes in a multiple-use MPA in the north-eastern Atlantic, the Luiz Saldanha Marine Park (Portugal), during and after the implementation of its management plan. Data were collected by experimental fishing in areas with three different levels of protection, during the implementation period and for three years after full implementation of the MPA. Univariate analysis detected significant biomass increases between the two periods. Fish assemblages were mainly structured by depth and substrate, followed by protection level. Community composition analyses revealed significant differences between protection levels and between the two periods. Species exhibited a broad variation in their response to protection, and we hypothesize that factors such as species habitat preferences, body size and late maturity might be underlying determinants. Overall, this study provides some evidence of protection effectiveness in soft-bottom fish communities, supported by the significant increase in biomass in the protected areas and the positive trends of some species.
    project LIFE-BIOMARES [LIFE06 NAT/P/000192]; project BUFFER (ERA-Net BiodivERsA); company SECIL-Companhia Geral de Cal e Cimento S.A.; FCT-Foundation for Science and Technology [CCMAR/Multi/04326/2013, SFRH/BD/80771/2011]; Foundation for Science and Technology [SFRH/BD/80771/2011]; 2012 Sesimbra Scientific Priz