    Characterizing Clustering Models of High-dimensional Remotely Sensed Data Using Subsampled Field-subfield Spatial Cross-validated Random Forests

    Clustering models are regularly used to construct meaningful groups of observations within complex datasets, and they are an exceptional tool for spatial exploratory analysis. The clusters detected in a recent spatio-temporal cluster analysis of leaf area index (LAI) in the Columbia River Basin (CRB) require further investigation since they are derived using only a single greenness metric. It is of great interest to further understand how greening indices can be used to determine separation of sites across an array of remotely sensed environmental attributes. In this prior work, highly localized minority clusters were detected to be most dissimilar from the remaining clusters, as determined by annual variation in remotely sensed LAI. The objective of this study is to discern what other environmental factors are important predictors of cluster allocation from the aforementioned cluster analysis, and secondarily, to construct a predictive model that prioritizes minority clusters. A random forest classification is considered to examine the importance of various site attributes in predicting cluster allocation. To satisfy these objectives, I propose an application-specific process that integrates spatial sub-sampling and cross-validation to improve the interpretability and utility of random forests for spatially autocorrelated, highly localized, and unbalanced class-size response variables. The final random forest model identifies that the cluster allocation, using only LAI, separates sites significantly across many other environmental attributes, and further that elevation, slope, and water storage potential are the most important predictors of cluster allocation. Most importantly, the class error rates are lowest for the clusters that are most dissimilar, as detected by the cluster model, which fulfills the secondary objective of aligning the priorities of a predictive model with a prior cluster model.
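
The general idea of spatial cross-validation mentioned in this abstract can be illustrated with a minimal sketch. This is not the author's field-subfield procedure, which the abstract does not specify; it is a generic block-based scheme, under the assumption that sites have planar coordinates. The function names `spatial_block_folds` and `train_test_splits` and the parameters `block_size` and `n_folds` are hypothetical. Sites falling in the same grid block always share a fold, so spatially adjacent (and likely autocorrelated) sites are less likely to be split between training and test sets.

```python
import random
from collections import defaultdict


def spatial_block_folds(coords, block_size, n_folds, seed=0):
    """Assign site indices to folds by square spatial block (hypothetical helper).

    coords     : list of (x, y) site coordinates
    block_size : side length of each square block, in coordinate units
    n_folds    : number of cross-validation folds
    """
    # Group site indices by the grid block that contains them.
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        key = (int(x // block_size), int(y // block_size))
        blocks[key].append(i)

    # Shuffle whole blocks (not individual sites), then deal them out to folds.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)
    folds = [[] for _ in range(n_folds)]
    for j, key in enumerate(keys):
        folds[j % n_folds].extend(blocks[key])
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices) pairs, holding out one fold at a time."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

Each `(train, test)` pair could then be passed to any classifier, such as a random forest; the per-class error on the held-out blocks gives a less optimistic estimate than a purely random split would when the response is spatially autocorrelated.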

    Smoothing Splines of Apex Predator Movement: Functional Modeling Strategies for Exploring Animal Behavior and Social Interactions

    The collection of animal position data via GPS tracking devices has increased in quality and usage in recent years. Animal position and movement, although measured discretely, follow the same principles of kinematic motion, and as such, the process is inherently continuous and differentiable. I demonstrate the functionality and visual elegance of smoothing spline models. I discuss the challenges and benefits of implementing such an approach, and I provide an analysis of movement and social interaction of seven jaguars inhabiting the Taiamã Ecological Station, Pantanal, Brazil, a region with the highest known density of jaguars. In the analysis, I derive measures for pairwise distance, cooccurrence, and spatiotemporal association between jaguars, borrowing ideas from density estimation and information theory. These measures are feasible as a result of spline model estimation, and they provide a critical tool for a deeper investigation of cooccurrence duration, frequency, and localized spatio-temporal relationships between animals. In this work, I characterize a variety of interactive relationships between pairs of jaguars, and I particularly emphasize the relationships in movement of two male–female and two male–male jaguar pairs exhibiting highly associative relationships.

    Effectuation As Ineffectual? Applying the 3E Theory-Assessment Framework to a Proposed New Theory of Entrepreneurship

    Surviving a Civil War: Expanding the Scope of Survival Analysis in Political Science

    Survival Analysis in the context of Political Science is frequently used to study the duration of agreements, political party influence, wars, senator term lengths, etc. This paper surveys a collection of methods implemented on a modified version of the Power-Sharing Event Dataset (which documents civil war peace agreement durations in the Post-Cold War era) in order to identify the research questions that are optimally addressed by each method. A primary comparison will be made between a Cox Proportional Hazards Model using some advanced capabilities in the glmnet package, a Survival Random Forest Model, and a Survival SVM. En route to this comparison, issues including Cox Model variable selection using the LASSO, identification of clusters using Hierarchical Clustering, and discretizing the response for Classification Analysis will be discussed. The results of the analysis will be used to justify the need for and accessibility of the Survival Random Forest algorithm as an additional tool for survival analysis.

    Two decades of fumigation data from the Soybean Free Air Concentration Enrichment facility

    The Soybean Free Air Concentration Enrichment (SoyFACE) facility is the longest running open-air carbon dioxide and ozone enrichment facility in the world. For over two decades, soybean, maize, and other crops have been exposed to the elevated carbon dioxide and ozone concentrations anticipated for late this century. The facility, located in East Central Illinois, USA, exposes crops to different atmospheric concentrations in replicated octagonal ~280 m² Free Air Concentration Enrichment (FACE) treatment plots. Each FACE plot is paired with an untreated control (ambient) plot. The experiment provides important ground truth data for predicting future crop productivity. Fumigation data from SoyFACE were collected every four seconds throughout each growing season for over two decades. Here, we organize, quality control, and collate 20 years of data to facilitate trend analysis and crop modeling efforts. This paper provides the rationale for and a description of the SoyFACE experiments, along with a summary of the fumigation data and collation process, weather and ambient data collection procedures, and explanations of air pollution metrics and calculations.

    Pines

    Pinus is the most important genus within the Family Pinaceae and also within the gymnosperms by the number of species (109 species recognized by Farjon 2001) and by its contribution to forest ecosystems. All pine species are evergreen trees or shrubs. They are widely distributed in the northern hemisphere, from tropical areas to northern areas in America and Eurasia. Their natural range reaches the equator only in Southeast Asia. In Africa, natural occurrences are confined to the Mediterranean basin. Pines grow at various elevations from sea level (not usual in tropical areas) to highlands. Two main regions of diversity are recorded, the most important one in Central America (43 species found in Mexico) and a secondary one in China. Some species have a very wide natural range (e.g., P. ponderosa, P. sylvestris). Pines are adapted to a wide range of ecological conditions: from tropical (e.g., P. merkusii, P. kesiya, P. tropicalis), temperate (e.g., P. pungens, P. thunbergii), and subalpine (e.g., P. albicaulis, P. cembra) to boreal (e.g., P. pumila) climates (Richardson and Rundel 1998, Burdon 2002). They can grow in quite pure stands or in mixed forest with other conifers or broadleaved trees. Some species are especially adapted to forest fires, e.g., P. banksiana, in which fire is virtually essential for cone opening and seed dispersal. They can grow in arid conditions, on alluvial plain soils, on sandy soils, on rocky soils, or on marsh soils. Trees of some species can have a very long life, as in P. longaeva (more than 3,000 years).