    Biodiversity and Scale: Determinants of Species Richness in Great Smoky Mountains National Park

    Species richness is the number of species in a given area or sample and is the most fundamental measure of biodiversity. It results from the aggregation of individual species whose distributions are influenced by processes operating on a wide range of scales. Estimating and understanding species richness at landscape scales (10^3-10^6 ha) is not easily achieved from small sample areas that can be completely inventoried. In particular, the spatial structure of environments makes the richness observations across a landscape non-additive. This dissertation develops the vital links between the spatial structure of ecological factors that are hypothesized to control species richness, spatial variation in species composition, and the sampling strategies used to measure species richness. I present a method for objectively and iteratively assessing patterns of biodiversity. This method builds upon "ecological zipcodes" that classify the landscape by energy flux, temperature, and precipitation. I also present a model of human energetic expenditure during walking that can be applied at landscape scales. I use this model to analyze sampling bias associated with accessibility for vegetation surveys. I used both the "ecological zipcodes" and the model of accessibility to design efficient and representative biodiversity samples based on clustered-stratified sampling. Finally, I assess the reliability of richness estimators that incorporate turnover in species composition. My results illustrate that efficient and representative richness assessment is possible, even with little a priori knowledge about the spatial structure of species richness. They also demonstrate that typical biodiversity assessments show a strong bias in accessibility that is a product of both the spatial structuring of samples and the environment. This bias is significant even for small biases in sample accessibility. Also, I show that though clustered sampling designs capture multiple scales of aggregation, their representativeness is very sensitive to stratification. Finally, my results show that species richness estimates that incorporate turnover are confounded by the interaction between sample size and environmental heterogeneity. Only when controlling for these effects can information about the spatial turnover in species composition be effective in estimating species richness.

    Comparing Linear Discriminant Analysis with Classification Trees Using Forest Landowner Survey Data as a Case Study with Considerations for Optimal Biorefinery Siting

    Bioenergy is reemerging as an important topic in energy‐related research. The rapid increase in costs of petroleum products has led to a renewed interest in alternative sources of energy such as biofuels. World‐wide energy consumption has increased 17 times in the last century and the demand for energy in emerging markets such as China and India is projected to increase in the future at unprecedented rates. A review of the current bioenergy literature is presented in this thesis. The economics of bioethanol are also discussed. The primary part of the thesis focuses on statistical classification methods related to factors that influence landowner attitudes towards harvesting timber. A comparison of linear discriminant analysis (LDA) and classification tree (CT) methods is presented using the results of a forest landowner survey as a case study. Several CT techniques are discussed with an emphasis on the CRUISE classification tree program. The LDA procedure in SPSS is used to construct linear discriminant functions of the survey results. CRUISE is also used to construct classification trees of the survey results. Survey results showed that 73.3 percent of farmer forest landowners harvested timber, and 69.6 percent of non‐farmers who had a length of residency beyond 36.5 years harvested timber. For landowners who conducted commercial timber harvests, the importance level of income from the harvest was the overriding factor relative to all other factors. Discriminant analysis results supported the results of CTs. However, the linear discrimination functions and corresponding coefficients did not provide the level of two‐dimensional detail of CTs, which also detected hidden interactions.

    Late-Stage Breast Cancer Diagnosis and Health Care Access in Illinois

    The variations of breast cancer mortality rates from place to place reflect both underlying differences in breast cancer prevalence and differences in diagnosis and treatment that affect the risk of death. This article examines the role of access to health care in explaining the variation of late-stage diagnosis of breast cancer. We use cancer registry data for the state of Illinois by zip code to investigate spatial variation in late diagnosis. Geographic information systems and spatial analysis methods are used to create detailed measures of spatial access to health care such as convenience of visiting primary care physicians and travel time from the nearest mammography facility. The effects of spatial access, in combination with the influences of socioeconomic factors, on late-stage breast cancer diagnosis are assessed using statistical methods. The results suggest that for breast cancer, poor geographical access to primary health care significantly increases the risk of late diagnosis for persons living outside the city of Chicago. Disadvantaged population groups including those with low income and racial and ethnic minorities tend to experience high rates of late diagnosis. In Illinois, poor spatial access to primary health care is more strongly associated with late diagnosis than is spatial access to mammography. This suggests the importance of primary care physicians as gatekeepers in early breast cancer detection.

    Syndrome Surveillance Using Parametric Space-Time Clustering


    Spatial Estimation of Radon Exposure for Epidemiologic Risk Assessment

    Radon is a naturally occurring radioactive gas and is an intermediate product of the decay of uranium. Exposure to radon is the second leading cause of lung cancer in the United States and is hypothesized to cause strokes and other cardiovascular events. Additionally, radon levels seem to be rising across North America and may be linked to climate change. In the early 1990s, the US EPA created the geologic radon potential (GRP) map, which defines three distinct radon zones classified according to indoor radon measurements (pCi/L) from the State Residential Radon Survey (SRRS), aerial radioactivity (ppm eU), geology, soil permeability, and architecture type. The goal of this analysis is to create an improved, granular spatial model for the geographic distribution of radon based on the SRRS data while accounting for other spatially dependent factors included in the calculation of GRP. The models used in this study include kriging, latent process modeling, LOESS and an ensemble estimation approach. The best model among these in terms of predictive accuracy was the ensemble estimation approach, with a mean absolute error of 2.059 pCi/L when fit to a 3 by 3 coordinate region in middle Tennessee. Future work on this project may include integration of additional data sets, inclusion of a temporal component leveraging more recent radon measurement data, further development of the ensemble estimation approach, accounting for the bias in the sampling design of the SRRS, or incorporating additional modeling approaches such as inverse distance weighted mean and nearest neighboring measure.
    Bachelor of Science

    The Tao of Inference in Privacy-Protected Databases

    To protect database confidentiality even in the face of full compromise while supporting standard functionality, recent academic proposals and commercial products rely on a mix of encryption schemes. The common recommendation is to apply strong, semantically secure encryption to the “sensitive” columns and protect other columns with property-revealing encryption (PRE) that supports operations such as sorting. We design, implement, and evaluate a new methodology for inferring data stored in such encrypted databases. The cornerstone is the multinomial attack, a new inference technique that is analytically optimal and empirically outperforms prior heuristic attacks against PRE-encrypted data. We also extend the multinomial attack to take advantage of correlations across multiple columns. These improvements recover PRE-encrypted data with sufficient accuracy to then apply machine learning and record linkage methods to infer the values of columns protected by semantically secure encryption or redaction. We evaluate our methodology on medical, census, and union-membership datasets, showing for the first time how to infer full database records. For PRE-encrypted attributes such as demographics and ZIP codes, our attack outperforms the best prior heuristic by a factor of 16. Unlike any prior technique, we also infer attributes, such as incomes and medical diagnoses, protected by strong encryption. For example, when we infer that a patient in a hospital-discharge dataset has a mental health or substance abuse condition, this prediction is 97% accurate.

    Progression of a large syphilis outbreak in rural North Carolina through space and time: Application of a Bayesian Maximum Entropy graphical user interface

    In 2001, the primary and secondary syphilis incidence rate in rural Columbus County, North Carolina was the highest in the nation. To understand the development of syphilis outbreaks in rural areas, we developed and used the Bayesian Maximum Entropy Graphical User Interface (BMEGUI) to map syphilis incidence rates from 1999–2004 in seven adjacent counties in North Carolina. Using BMEGUI, incidence rate maps were constructed for two aggregation scales (ZIP code and census tract) with two approaches (Poisson and simple kriging). The BME maps revealed the outbreak was initially localized in Robeson County and possibly connected to more urban endemic cases in adjacent Cumberland County. The outbreak spread to rural Columbus County in a leapfrog pattern with the subsequent development of a visible low incidence spatial corridor linking Robeson County with the rural areas of Columbus County. Though the data are from the early 2000s, they remain pertinent, as the combination of spatial data with the extensive sexual network analyses, particularly in rural areas, gives thorough insights that have not been replicated in the past two decades. These observations support an important role for the connection of micropolitan areas with neighboring rural areas in the spread of syphilis. Public health interventions focusing on urban and micropolitan areas may effectively limit syphilis indirectly in nearby rural areas.