1,133 research outputs found
A Bayesian network approach to county-level corn yield prediction using historical data and expert knowledge
Machine learning has become a popular technology that has not only accelerated progress on existing problems in AI but has also emerged as a powerful toolkit for solving interesting problems across various interdisciplinary domains.
Food availability is the biggest problem of the 21st century, and many experts have raised concerns as the global human population continues to rise. Many efforts have been made in this direction, including but not limited to improving seed quality, adopting good management practices, and obtaining prior knowledge of the expected yield.
In this work, we propose a data-driven 'gray box' approach, i.e., one that seamlessly utilizes expert knowledge in constructing a statistical network model for corn yield forecasting. Our multivariate gray box model uses Bayesian network analysis to build a Directed Acyclic Graph (DAG) between predictors and yield. Starting from a complete graph connecting various carefully chosen variables and yield, expert knowledge is used to prune or strengthen edges connecting variables. Subsequently, the structure (connectivity and edge weights) of the DAG that maximizes the likelihood of observing the training data is identified via optimization. We curated an extensive set of historical data (1948-2012) for each of the 99 counties in Iowa to train the model. We discuss preliminary results, focusing specifically on (a) the structure of the learned network and how it corroborates known trends, and (b) how partial information still produces reasonable predictions (predictions with gappy data), and show that incorporating the missing information improves predictions
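The pruning and likelihood-scoring steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the variable names (`rain`, `heat`, `county_id`), the toy records, and the helper functions are all hypothetical, and the sketch scores only the parents of a single discrete `yield` node rather than optimizing a full DAG.

```python
import math
from collections import Counter

def prune_edges(candidate_parents, forbidden_by_experts):
    """Drop candidate parent -> yield edges that experts rule out (hypothetical step)."""
    return [p for p in candidate_parents if p not in forbidden_by_experts]

def fit_cpt(records, child, parents):
    """Maximum-likelihood estimate of P(child | parents) by counting discrete records."""
    joint = Counter()
    parent_counts = Counter()
    for r in records:
        key = tuple(r[p] for p in parents)
        joint[(key, r[child])] += 1
        parent_counts[key] += 1
    return {k: n / parent_counts[k[0]] for k, n in joint.items()}

def log_likelihood(records, child, parents):
    """Log-likelihood of the observed child values under the fitted table."""
    cpt = fit_cpt(records, child, parents)
    return sum(
        math.log(cpt[(tuple(r[p] for p in parents), r[child])]) for r in records
    )

# Toy, made-up observations: yield follows rain exactly and is unrelated to heat.
records = [
    {"rain": "low", "heat": "high", "yield": "low"},
    {"rain": "low", "heat": "low", "yield": "low"},
    {"rain": "high", "heat": "high", "yield": "high"},
    {"rain": "high", "heat": "low", "yield": "high"},
]

# Experts remove an implausible edge, then each remaining parent is scored.
parents = prune_edges(["rain", "heat", "county_id"], {"county_id"})
scores = {p: log_likelihood(records, "yield", [p]) for p in parents}
best_parent = max(scores, key=scores.get)
```

Raw likelihood never decreases when parents are added, so a practical structure search would normally add a complexity penalty or rely on held-out data; the abstract does not say which safeguard the authors used.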
Predicting county level corn yields using deep long short-term memory models in the Corn Belt
An accurate corn yield prediction is useful because it provides information about production and the equilibrium post-harvest futures price prior to harvest. A publicly available corn yield prediction can help address emergent information asymmetry problems and, in doing so, improve price efficiency on futures markets. This paper is the first to predict corn yield using Long Short-Term Memory (LSTM), a special Recurrent Neural Network method. Our prediction is only 0.83 bushel/acre lower than actual corn yields in the Corn Belt, and is more accurate than the pre-harvest prediction from the USDA. More importantly, our model provides a publicly available source that will contribute to eliminating the information asymmetry problem that arises from private sector crop yield prediction
Coupling Machine Learning and Crop Modeling Improves Crop Yield Prediction in the US Corn Belt
This study investigates whether coupling crop modeling and machine learning
(ML) improves corn yield predictions in the US Corn Belt. The main objectives
are to explore whether a hybrid approach (crop modeling + ML) would result in
better predictions, investigate which combinations of hybrid models provide the
most accurate predictions, and determine the features from the crop modeling
that are most effective to be integrated with ML for corn yield prediction.
Five ML models (linear regression, LASSO, LightGBM, random forest, and XGBoost)
and six ensemble models have been designed to address the research question.
The results suggest that adding simulation crop model variables (APSIM) as
input features to ML models can decrease yield prediction root mean squared
error (RMSE) by 7 to 20%. Furthermore, we investigated partial inclusion of
APSIM features in the ML prediction models and we found soil moisture related
APSIM variables are most influential on the ML predictions followed by
crop-related and phenology-related variables. Finally, based on feature
importance measure, it has been observed that simulated APSIM average drought
stress and average water table depth during the growing season are the most
important APSIM inputs to ML. This result indicates that weather information
alone is not sufficient and ML models need more hydrological inputs to make
improved yield predictions
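The RMSE comparison reported in this abstract can be made concrete with a short sketch. The function names and the example numbers below are illustrative assumptions, not values or code from the study; the sketch only shows how a relative RMSE decrease between a baseline model and a hybrid model would be computed.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def relative_rmse_reduction(baseline_rmse, hybrid_rmse):
    """Percentage by which the hybrid model lowers RMSE relative to the baseline."""
    return 100.0 * (baseline_rmse - hybrid_rmse) / baseline_rmse

# Hypothetical RMSE values (same yield units) for an ML-only and an ML + crop-model run.
reduction = relative_rmse_reduction(20.0, 17.0)
```

With these made-up inputs the hybrid model's RMSE is 15% lower than the baseline, which would fall inside the 7 to 20% range the abstract reports.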
Bayesian network development and validation for siting selection
Growing electricity demand requires considerable attention to increasing the diversity of power generation. Alternative energy can supply heating and power systems and thermal storage. Our objective, like that of every organization, is to minimize energy consumption cost under electricity demand uncertainty. In rural areas, heat and power availability and stability are also crucial. Combined heat and power has proven its effectiveness as a subsequent to electricity. This paper identifies four criteria and eleven sub-criteria to determine the most appropriate location for a combined heat and power structure in a rural community. Bayesian network technology is applied to analyze these criteria comprehensively. A case study including multiple sites across the state of Mississippi was used to validate the proposed approach, and propagation and sensitivity analyses were used to evaluate performance. Results showed that the proposed Bayesian network approach, built on the eleven summarized criteria, could aid location selection for combined heat and power in rural areas. Additionally, the created model can support decision-makers in selecting the best alternatives under different levels of electricity demand variability
Methodology to Predict Daily Groundwater Levels by the Implementation of Machine Learning and Crop Models
The continuous decline of groundwater levels caused by variations in climatic conditions and crop water demands is a growing concern for the agricultural community. It is necessary to understand the factors that control these changes in groundwater levels so that we can better address declines and develop improved conservation practices that will lead to a more sustainable use of water. In this study, two machine learning techniques, namely support vector regression (SVR) and the nonlinear autoregressive with exogenous inputs (NARX) neural network, were implemented to predict daily groundwater levels in a well located in the Mississippi Delta Region (MDR). Results of the NARX model indicate that a Bayesian regularization algorithm with two hidden nodes and 100 time delays was the best architecture to forecast groundwater levels. In another study, the SVR and the NARX model were compared for the prediction of groundwater withdrawal and recharge periods separately. Results from this study showed that input data classified by seasons lead to incremental improvements in the model accuracy, and that the SVR was the most efficient machine learning model with a Mean Squared Error (MSE) of 0.00123 m for the withdrawal season. Analysis of input variables such as previous daily groundwater levels (Gw), precipitation (Pr), and evapotranspiration (ET) showed that the combination of Gw+Pr provides the optimal set for groundwater prediction and that ET degraded the modeling performance, especially during recharge seasons. Finally, the CROPGRO-Soybean crop model was used to simulate the impacts of different volumes of irrigation on the crop height and yield, and to generate the daily irrigation requirements for soybean crops in the MDR. Four irrigation threshold scenarios (20%, 40%, 50% and 60%) were obtained from the CROPGRO-Soybean model and used as inputs in the SVR to evaluate the predicted response of daily groundwater levels to different irrigation demands.
This study demonstrated that conservative irrigation management, by selecting a low irrigation threshold, can provide good yields comparable to what is produced by a high volume irrigation management practice. Thus, lower irrigation volumes can have a big impact on decreasing the amount of groundwater withdrawals, while still maintaining comparable yields
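Both SVR and NARX models of the kind described here are trained on lagged inputs. The sketch below shows one plausible way to turn daily groundwater (Gw) and precipitation (Pr) series into supervised samples with a fixed number of time delays. The function name, the sample values, and the feature layout are assumptions for illustration and are not taken from the study.

```python
def make_lagged_samples(gw, pr, n_lags):
    """Build (features, target) pairs from daily Gw and Pr series.

    Each feature row holds the previous n_lags groundwater levels followed by
    the previous n_lags precipitation values; the target is the groundwater
    level on the following day.
    """
    if len(gw) != len(pr):
        raise ValueError("gw and pr must have the same length")
    if n_lags < 1 or n_lags >= len(gw):
        raise ValueError("n_lags must be between 1 and len(gw) - 1")
    features, targets = [], []
    for t in range(n_lags, len(gw)):
        features.append(list(gw[t - n_lags:t]) + list(pr[t - n_lags:t]))
        targets.append(gw[t])
    return features, targets

# Made-up daily values, used only to show the shape of the output.
gw_levels = [1.0, 2.0, 3.0, 4.0]
precip = [0.1, 0.2, 0.3, 0.4]
X, y = make_lagged_samples(gw_levels, precip, n_lags=2)
```

The resulting rows could be passed to any regression model; the abstract reports 100 time delays for the best NARX architecture, which would correspond to `n_lags=100` in this layout.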
ASSESSING POTENTIAL ENVIRONMENTAL IMPACTS ACCORDING TO PROBABLE PATTERNS OF SWITCHGRASS ADOPTION IN THE SOUTHEASTERN US
To assess the overall net impact of an emerging technology, life cycle assessment (LCA) must be accompanied by projections of adoption. Diffusion of innovation research provides tools that incorporate economic and social variables to explain and forecast integration of technologies. A switchgrass-to-ethanol case study for the southeastern U.S. is used to demonstrate methods for gauging aggregate environmental effects of an emerging energy technology. Before applying diffusion concepts, breakeven capacities are calculated for land in row crops, hay, pasture and marginal land. Breakeven curves are generated to provide upper bounds to switchgrass adoption over a range of farm-gate prices. The amount and type of land converted to switchgrass provides estimates for the total land use change effects as well as for biomass production and overall impact of the regional switchgrass-to-ethanol system, which is measured by greenhouse gas (GHG) emissions, net fossil energy, and nitrate loss. Maximum switchgrass adoption is assessed within breakeven areas for prices of 100, and 100 Mg-1, switchgrass is projected to be grown on about 0.8 million hectares of land in row crops and 0.5 million hectares of the other land categories. This area of production translates to 5.4 billion liters of ethanol, which is about 9% of the gasoline consumed annually in the region. Because land use change (LUC) benefits are enhanced by primarily converting row crops to switchgrass, annual carbon dioxide equivalents of GHG emissions are reduced by about 2 billion kg CO2e yr-1. About 20 years are required to reach such a production level even though national mandates are set for 2022. Including projections of behavior in environmental assessments can inform proactive policy measures that optimize effects of emerging energy technologies
Embedding expert opinion in a Bayesian network model to predict wheat yield from spring-summer weather
Wheat yield is highly dependent on weather. Therefore, predicting its effect can improve crop management decisions. Various modelling approaches have been used to predict wheat yield, including process-based modelling, statistical models, and machine learning. However, these models typically require a large data set for training or fitting. They often also have a limited ability to capture the effects of small-scale variability and the timing and duration of extreme weather events. Here, we develop a Bayesian Network (BN) model by interviewing experts including farmers, embedding their knowledge from years of experience within a quantitative model. These experts identified the period from the beginning of anthesis to the end of the grain filling stage as a critical period, and maximum temperature, mean temperature and precipitation as key weather variables for inclusion in the BN. To keep the time input from experts manageable, the conditional probability table for the BN was constructed based on their anticipated impact on the mean yield of different weather conditions. The model predicted the yield in the same or neighbouring class (very low, low, medium, high and very high) as the reported yield with a low error rate ranging from 9.1 to 15.2% and, when used to estimate the median predicted yield, R² ranging from 41 to 52%. Interestingly, the model successfully predicted the yield in the years 1998, 2007, 2012 and 2020, which had the most extreme weather events. Additionally, the more recent data, from 2012 to 2022, were predicted more accurately, especially the 2022 season, which had not yet been sown when the information was elicited and was only recently added to the testing data. Little difference was observed between the predictions made using model parameters based only on the opinion of the farm manager from which the test data originated, and the predictions made using the average opinion of a group of 9 experts.
The inclusion of causal variables in the model also provided insight into the experts’ rationale, allowing unexpected results to be explored. This methodology provides a means to rapidly develop a successful predictive model of wheat yield with limited (or no) data using expert understanding. This model could be tuned and updated with data as it becomes available
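The "same or neighbouring class" error rate reported in this abstract can be sketched as follows. The class order comes from the abstract; the function name and the example predictions are hypothetical, and the authors' exact scoring procedure may differ.

```python
# Ordered yield classes as listed in the abstract.
YIELD_CLASSES = ["very low", "low", "medium", "high", "very high"]

def neighbouring_class_error_rate(predicted, reported):
    """Fraction of seasons whose predicted class is more than one class away
    from the reported class (same or adjacent classes count as correct)."""
    if len(predicted) != len(reported) or len(reported) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    misses = 0
    for p, r in zip(predicted, reported):
        distance = abs(YIELD_CLASSES.index(p) - YIELD_CLASSES.index(r))
        if distance > 1:
            misses += 1
    return misses / len(reported)

# Made-up example: only the third season is off by more than one class.
predicted = ["low", "medium", "very high", "high"]
reported = ["very low", "medium", "low", "high"]
error_rate = neighbouring_class_error_rate(predicted, reported)
```

Counting adjacent classes as correct makes the metric tolerant of boundary effects in the yield discretization, which suits a model built from elicited rather than fitted probabilities.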