    The spatial distribution of soil properties and prediction of soil organic carbon in Hayden Prairie and an adjacent agricultural field

    While the effect of cultivation on soil properties has been well documented, its effect on the spatial distribution of soil properties is less well understood. The purpose of this study is to use GIS classes (soil map units and landscape positions) and geostatistics to characterize the spatial distribution of soil properties in a native prairie and an agricultural field. A secondary purpose is to use soil color in combination with these techniques to predict soil organic carbon (SOC) content to 0.2 m and 1.0 m depths across each land use. Each land use was sampled in an unbalanced hierarchical nested grid for a total of 406 cores. Soil color was measured with a Munsell Soil Color Book and a chroma meter on three types of samples: (a) prepared samples, ground to <2 mm, (b) horizon peds, and (c) split cores (measurements taken at horizon and depth increment mid-points). Standard techniques were used to describe all cores and analyze a subset (63 in each land use) for SOC, bulk density, percent water stable aggregates (WSA), pH, and surface horizon texture. Bulk density, pH, and WSA are not spatially dependent using any technique. Using GIS classes, the prairie has more significant differences in soil properties between classes. The agricultural field is more homogeneous, but geostatistics show it has spatial dependence with small-scale continuity. SOC content distribution is related to localized, mid-slope wetness in the prairie that no longer occurs in the agricultural field due to artificial drainage. Only a few models in this study were generally satisfactory for predicting SOC contents. SOC content is significantly related to soil color for individual samples, but not for entire cores. The best predictors of SOC content are the topographic wetness index in the agricultural field and kriging and co-kriging in the prairie.
Average land use predictions vary by 2.4 kg m-2 for 0.2 m and 3.8 kg m-2 for 1.0 m in the agricultural field and 6.2 kg m-2 for 0.2 m and 19.0 kg m-2 for 1.0 m in the prairie. Agricultural cultivation has changed the distribution of SOC across the landscape, and thus different models are needed to make accurate predictions.
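
The abstract names the topographic wetness index (TWI) as the best SOC predictor in the agricultural field but does not say how it was computed. The sketch below uses the commonly cited definition, TWI = ln(a / tan(beta)), where a is the specific catchment area and beta is the local slope angle. The function name, the slope floor, and the three example cells are hypothetical and are not taken from the study.

```python
import math


def topographic_wetness_index(specific_catchment_area_m, slope_deg, min_slope_deg=0.1):
    """Return TWI = ln(a / tan(beta)).

    a is the upslope contributing area per unit contour width (m) and
    beta is the local slope angle, given here in degrees.
    """
    # Assumed floor on slope so tan(beta) never reaches zero on flat cells.
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area_m / math.tan(beta))


# Hypothetical grid cells: (specific catchment area in m, slope in degrees).
cells = {
    "summit": (10.0, 2.0),
    "mid-slope": (150.0, 6.0),
    "toeslope": (2000.0, 1.0),
}

twi = {name: topographic_wetness_index(a, s) for name, (a, s) in cells.items()}
for name, value in twi.items():
    print(f"{name}: TWI = {value:.2f}")
```

Larger values indicate cells that collect more upslope water relative to their ability to drain it, which is why an index of this kind can track wetness-driven SOC patterns.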

    Soil Property and Class Maps of the Conterminous US at 100 meter Spatial Resolution based on a Compilation of National Soil Point Observations and Machine Learning

    With growing concern for the depletion of soil resources, conventional soil data must be updated to support spatially explicit human-landscape models. Three US soil point datasets were combined with a stack of over 200 environmental datasets to generate complete coverage gridded predictions at 100 m spatial resolution of soil properties (percent organic C, total N, bulk density, pH, and percent sand and clay) and US soil taxonomic classes (291 great groups and 78 modified particle size classes) for the conterminous US. Models were built using parallelized random forest and gradient boosting algorithms. Soil property predictions were generated at seven standard soil depths (0, 5, 15, 30, 60, 100 and 200 cm). Prediction probability maps for US soil taxonomic classifications were also generated. Model validation results indicate an out-of-bag classification accuracy of 60 percent for great groups and 66 percent for modified particle size classes; for soil properties, cross-validated R-square ranged from 62 percent for total N to 87 percent for pH. Nine independent validation datasets were used to assess prediction accuracies for soil class models, and results ranged from 24 to 58 percent and from 24 to 93 percent for great group and modified particle size class prediction accuracies, respectively. The hybrid "SoilGrids+" modeling system, which incorporates remote sensing data, local predictions of soil properties, conventional soil polygon maps, and machine learning, opens the possibility of updating conventional soil survey data with machine learning technology to make soil information easier to integrate with spatially explicit models, compared to multi-component map units. Comment: Submitted to Soil Science Society of America Journal, 40 pages, 12 figures, 3 tables

    Human Land-Use and Soil Change

    Soil change is the central, if under-recognized, component of land and ecosystem changes (Yaalon 2007). Soils change naturally over a long timescale (decades to millennia) in response to soil-forming factors (biota, climate, parent material, time, and topography). However, human land-use pressures are currently the driving force in maintaining, aggrading, and degrading soil properties across nearly all ecosystems. Traditionally, in order to simplify and standardize the relationships between soils and soil-forming factors, pedology and soil survey have often focused on “natural” or “virgin” soil (e.g., Hilgard 1860; Jenny 1980), but many argue that humans should be thought of as a part of soil genesis and formation (Amundson and Jenny 1991; Yaalon and Yaron 1966; Bidwell and Hole 1965). Landscapes and soils have been altered by wide-scale conversion to agriculture, use of vegetative products, and development for direct human use. Land-use impacts can be gradual or abrupt, subtle or catastrophic (Table 18.1). The interactions between environmental changes and geomorphic and biotic feedback loops vary across temporal and spatial scales depending on the setting (Monger and Bestelmeyer 2006). The effects of land use can linger for decades to centuries and beyond (Hall et al. 2013; Jangid et al. 2011; Sandor et al. 1986). While each land resource region has some specific soil–land use interactions, this chapter will focus on general uses and topical areas: croplands, wetlands, grazing lands (both pasture and rangelands), and forest lands, with smaller sections devoted to special issues including acid sulfate soils, strip-mined lands, and cold soils.

    Understanding saturated hydraulic conductivity under seasonal changes in climate and land use

    The goal of this study was to better understand the interplay of intrinsic soil properties and extrinsic factors of climate and management in the estimation of saturated hydraulic conductivity (Ksat) in intensively managed landscapes. For this purpose, a physically based modeling framework was developed using hydro-pedotransfer functions (PTFs) and watershed models integrated with Geographic Information System (GIS) modules. The integrated models were then used to develop Ksat maps for the Clear Creek, Iowa, watershed and the state of Iowa. Four types of saturated hydraulic conductivity were considered, namely the baseline (Kb), the bare (Kbr), the effective with no rain (Ke-nr), and the effective (Ke), in order to evaluate how management and seasonality affect Ksat spatiotemporal variability. Kb is dictated by soil texture and bulk density, whereas Kbr, Ke-nr, and Ke are driven by extrinsic factors that vary on an event to seasonal time scale, such as vegetation cover, land use, management practices, and precipitation. Two seasons were selected to demonstrate Ksat dynamics in the Clear Creek watershed and the state of Iowa: the months of October and April, which corresponded to pre-harvest and pre-planting conditions, respectively. Statistical analysis of the Clear Creek data showed that intrinsic soil properties incorporated in Kb do not reflect the degree of soil surface disturbance due to tillage and raindrop impact. Additionally, vegetation cover affected the infiltration rate. It was found that the use of Kb instead of Ke in water balance studies can lead to an overestimation of the amount of water infiltrated in agricultural watersheds by a factor of two.
Therefore, we suggest herein that Ke is both the most dynamic and the most representative saturated hydraulic conductivity for intensively managed landscapes because it accounts for the contributions of land cover and management, local hydropedology, and climate condition, which all affect soil porosity and structure and, hence, Ksat.

    Machine learning for predicting soil classes in three semi-arid landscapes

    Mapping the spatial distribution of soil taxonomic classes is important for informing soil use and management decisions. Digital soil mapping (DSM) can quantitatively predict the spatial distribution of soil taxonomic classes. Key components of DSM are the method and the set of environmental covariates used to predict soil classes. Machine learning is a general term for a broad set of statistical modeling techniques. Many different machine learning models have been applied in the literature and there are different approaches for selecting covariates for DSM. However, there is little guidance as to which, if any, machine learning model and covariate set might be optimal for predicting soil classes across different landscapes. Our objective was to compare multiple machine learning models and covariate sets for predicting soil taxonomic classes at three geographically distinct areas in the semi-arid western United States of America (southern New Mexico, southwestern Utah, and northeastern Wyoming). All three areas were the focus of digital soil mapping studies. Sampling sites at each study area were selected using conditioned Latin hypercube sampling (cLHS). We compared models that had been used in other DSM studies, including clustering algorithms, discriminant analysis, multinomial logistic regression, neural networks, tree based methods, and support vector machine classifiers. Tested machine learning models were divided into three groups based on model complexity: simple, moderate, and complex. We also compared environmental covariates derived from digital elevation models and Landsat imagery that were divided into three different sets: 1) covariates selected a priori by soil scientists familiar with each area and used as input into cLHS, 2) the covariates in set 1 plus 113 additional covariates, and 3) covariates selected using recursive feature elimination. 
Overall, complex models were consistently more accurate than simple or moderately complex models. Random forest (RF) models using covariates selected via recursive feature elimination were consistently the most accurate, or among the most accurate, classifier and covariate set combinations within each study area. We recommend that for soil taxonomic class prediction, complex models and covariates selected by recursive feature elimination be used. Overall classification accuracy in each study area was largely dependent upon the number of soil taxonomic classes and the frequency distribution of pedon observations between taxonomic classes. Individual subgroup class accuracy was generally dependent upon the number of soil pedon observations in each taxonomic class. The number of soil classes is related to the inherent variability of a given area. The imbalance of soil pedon observations between classes is likely related to cLHS. Imbalanced frequency distributions of soil pedon observations between classes must be addressed to improve model accuracy. Solutions include increasing the number of soil pedon observations in classes with few observations or decreasing the number of classes. Spatial predictions using the most accurate models generally agree with expected soil-landscape relationships. Spatial prediction uncertainty was lowest in areas of relatively low relief for each study area.
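
The dependence of accuracy on class frequency can be illustrated with a small, purely hypothetical example. The sketch below computes overall accuracy and per-class accuracy for a made-up set of ten pedons in which one class dominates. The class labels, counts, and function name are illustrative assumptions and do not come from the study.

```python
from collections import Counter


def overall_and_per_class_accuracy(observed, predicted):
    """Return (overall accuracy, {class: fraction of that class predicted correctly})."""
    totals = Counter(observed)
    correct = Counter(o for o, p in zip(observed, predicted) if o == p)
    overall = sum(correct.values()) / len(observed)
    per_class = {cls: correct[cls] / n for cls, n in totals.items()}
    return overall, per_class


# Hypothetical pedons: 8 in a common class, 2 in a rare class.
observed = ["common_class"] * 8 + ["rare_class"] * 2
# A classifier that always predicts the dominant class.
predicted = ["common_class"] * 10

overall, per_class = overall_and_per_class_accuracy(observed, predicted)
print(f"overall accuracy: {overall:.2f}")
for cls, acc in per_class.items():
    print(f"{cls}: {acc:.2f}")
```

In this toy case the overall accuracy looks respectable even though the rare class is never predicted correctly, which is the pattern that adding observations to sparse classes or merging classes is meant to address.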

    Patterns of soil organic carbon deficit in the Conterminous US

    Soil organic carbon (SOC) is integral to ecosystem stability, agricultural productivity, and climate regulation. The capacity of soils to stabilize additional SOC is highly uncertain, yet critical to understanding terrestrial ecosystem feedbacks to rising atmospheric carbon dioxide. Conceptually, the difference between a soil's capacity to sequester SOC (i.e., SOC saturation) and the current stock of SOC is the SOC deficit. To explore patterns in SOC deficit, we are using the newly collected Rapid Carbon Assessment project (RaCA) database. The RaCA database contains more than 6000 pedons from the Conterminous US collected to 1 m of depth or more. Data for these pedons include SOC, soil inorganic C, total soil nitrogen, texture class, VNIR spectrum, and more, all of which are traceable and publicly available from the USDA Natural Resources Conservation Service. The general goal of this paper is to test hypotheses about the roles that soil depth, land use and land cover, landscape position, and ecosystem type play in maximizing SOC content. We use existing methods to determine SOC saturation as related to soil texture classes, soil surface area, and statistical approaches. One specific goal is to determine the probability of SOC deficit in sequential soil horizons. For example, we test whether SOC saturation is related to the saturation of proximal soil horizons and then explore land use and spatial patterns in the probability that SOC saturation is related to proximity. Understanding patterns in SOC deficit may be valuable for modelling terrestrial C dynamics in the Anthropocene.
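
The conceptual definition above (SOC deficit = SOC saturation minus current SOC stock) can be written out for a sequence of horizons. The sketch below is a minimal illustration only: the horizon names, the saturation and stock values, and the choice of units are hypothetical, and the abstract does not specify which saturation method produced such values.

```python
def soc_deficit(soc_saturation, soc_current):
    """SOC deficit as the difference between saturation capacity and current stock."""
    return soc_saturation - soc_current


# Hypothetical horizons: (name, SOC saturation, current SOC), both in the same units.
horizons = [
    ("A", 40.0, 25.0),
    ("Bw", 30.0, 10.0),
    ("C", 20.0, 4.0),
]

deficits = {name: soc_deficit(sat, cur) for name, sat, cur in horizons}
for name, deficit in deficits.items():
    print(f"{name}: deficit = {deficit:.1f}")
```

A positive value means the horizon holds less SOC than its estimated capacity. Comparing values between adjacent horizons is one simple way to start looking at whether deficit in one horizon is related to deficit in its neighbors.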

    Predicting soil bulk density for incomplete databases

    Soil bulk density (ρb) is important because of its direct effect on soil properties (e.g., porosity, soil moisture availability) and crop yield. Additionally, ρb measurements are needed to express soil organic carbon (SOC) and other nutrient stocks on an area basis (kg ha−1). However, ρb measurements are commonly missing from databases for reasons that include omission due to sampling constraints and laboratory mishandling. The objective of this study was to investigate the performance of novel pedotransfer functions (PTFs) in predicting ρb as a function of textural class and basic pedon description information extracted from the horizon of interest (the horizon for which ρb is being predicted), and ρb, textural class, and basic pedon description information extracted from horizons above or below and directly adjacent or not adjacent to the horizon of interest. A total of 2,680 pedons (20,045 horizons) were gathered from the USDA-NRCS National Soil Survey Center characterization database. Twelve ρb PTFs were developed by combining PTF types, database configurations, and horizon limiting depths. Different PTF types were created considering the direction of prediction in the soil profile: upward and downward prediction models. Multiple database configurations were used to mimic different scenarios of horizons missing ρb values: random missing (e.g., ρb sample lost in transit) and patterned or systematic missing (e.g., no ρb samples collected for horizons > 30 cm depth). For each database configuration scenario, upward and downward models were developed separately. Three limiting depths (20, 30, and 50 cm) were tested to identify any threshold depth between upward and downward models. For both PTF types, validation results indicated that models derived from the database configuration mimicking random horizons missing ρb performed better than those derived from the configuration mimicking clear patterns of missing ρb measurements.
All 12 PTFs performed well (RMSPE: 0.10–0.15 g cm−3). The threshold depth of 50 cm most successfully split the database between upward and downward models. For all PTFs, the ρb of other horizons in the soil profile was the most important variable in predicting ρb. The proposed PTFs provide reasonably accurate ρb predictions and have the potential to help researchers and other users fill gaps in their databases without complicated data acquisition.

    A soil bulk density pedotransfer function based on machine learning: A case study with the NCSS Soil Characterization Database

    This paper describes a method to develop a soil bulk density pedotransfer function (PTF) using the Random Forest machine learning algorithm with soil and environmental data for the conterminous United States. Complete data from 45,818 horizons were extracted from the National Cooperative Soil Survey (NCSS) soil characterization database and used to calibrate and validate the PTF. Environmental data included surficial materials and hierarchical ecosystem land classifications. The results of a five-fold cross-validation showed that the average root mean squared prediction error (RMSPE) was 0.13 g cm-3, and the mean prediction error (MPE) was -0.001 g cm-3. An illustrative example of a weight-to-area conversion using the PTF was done with soil organic carbon (SOC) stocks. The fitted PTF can be used to fill in data gaps for volumetric assessments, as was done for SOC stock calculations. It could also be used with other international soil datasets if environmental data for surficial materials and ecoregion province can be determined and related to categories present in the United States. The PTF model and the resulting bulk density estimates are available for use under an Open Data license and can be accessed from Harvard Dataverse.
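
The two error statistics and the weight-to-area conversion mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the MPE sign convention (predicted minus observed), the function names, and all numeric values are hypothetical, and the stock formula ignores coarse fragments.

```python
import math


def rmspe(observed, predicted):
    """Root mean squared prediction error."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mpe(observed, predicted):
    """Mean prediction error, assumed here to be predicted minus observed."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def soc_stock_kg_m2(soc_percent, bulk_density_g_cm3, thickness_cm):
    """Convert a weight-based SOC content to an area-based stock (kg m-2).

    mass fraction (kg C per kg soil) x bulk density (kg m-3) x thickness (m).
    """
    return (soc_percent / 100.0) * (bulk_density_g_cm3 * 1000.0) * (thickness_cm / 100.0)


# Hypothetical observed and PTF-predicted bulk densities (g cm-3).
observed_bd = [1.20, 1.35, 1.50, 1.45]
predicted_bd = [1.30, 1.25, 1.60, 1.35]

print(f"RMSPE = {rmspe(observed_bd, predicted_bd):.2f} g cm-3")
print(f"MPE   = {mpe(observed_bd, predicted_bd):.3f} g cm-3")

# Hypothetical horizon: 2 percent SOC, predicted bulk density 1.3 g cm-3, 20 cm thick.
stock = soc_stock_kg_m2(2.0, 1.3, 20.0)
print(f"SOC stock = {stock:.1f} kg m-2")
```

An MPE near zero alongside a nonzero RMSPE, as in this toy data, indicates errors that are sizeable for individual horizons but largely cancel on average, which matters when predicted bulk densities are summed into area-based stocks.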