
2 research outputs found

Machine Learning based Models for Fresh Produce Yield and Price Forecasting for Strawberry Fruit

Author: Okwuchi Ifeanyi
Publication venue: 'University of Waterloo'
Publication date: 28/05/2020

Building market price forecasting models for Fresh Produce (FP) is crucial to protect retailers and consumers from highly priced FP. However, forecasting FP prices is highly complex because of the very short shelf life of FP, the inability to store it long term, and external factors such as weather and climate change. This forecasting problem has traditionally been modelled as a time series problem. Models for grain yield forecasting and for forecasting non-agricultural prices are common. However, forecasting of FP prices is recent and has not been fully explored. In this thesis, the forecasting models built to fill this void are solely machine learning based, which is also a novelty. The growth and success of deep learning, a type of machine learning algorithm, has largely been attributed to the availability of big data and high-end computational power. In this thesis, several machine learning models (both conventional and deep learning based) are built to predict future yield and prices of FP (strawberry prices are said to be more difficult to forecast than those of other FP, so strawberries are used here as the main product). The data used in building these prediction models comprises California weather data, California strawberry yield, California strawberry farm-gate prices and retailer purchase price data. The various prediction models are compared using a new aggregated error measure (AGM) proposed in this thesis, which combines mean absolute error, mean squared error and the R^2 coefficient of determination. The best two models are found to be an Attention CNN-LSTM (AC-LSTM) and an Attention ConvLSTM (ACV-LSTM). Different stacking ensemble techniques, such as a voting regressor and stacking with Support Vector Regression (SVR), are then utilized to come up with the best prediction.
The experimental results show that, across the various examined applications, the proposed model, a stacking ensemble of the AC-LSTM and ACV-LSTM using a linear SVR, is the best performing based on the proposed aggregated error measure. To show the robustness of the proposed model, it was also tested for predicting WTI and Brent crude oil prices, and the results proved consistent with those of the FP price prediction.
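As an illustration of the three quantities the abstract says the AGM combines, the following minimal sketch computes mean absolute error, mean squared error and R^2 for a forecast. The function name `error_components` and the sample values are assumptions made for this example only; the abstract does not state how the thesis aggregates the three values into the AGM, so no combination step is shown.

```python
def error_components(actual, predicted):
    """Return MAE, MSE and R^2 for paired actual/predicted values.

    Hypothetical helper for illustration; it does not reproduce the
    thesis's AGM, whose aggregation formula is not given in the abstract.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(r) for r in residuals) / n   # mean absolute error
    ss_res = sum(r * r for r in residuals)     # residual sum of squares
    mse = ss_res / n                           # mean squared error

    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    r2 = 1.0 - ss_res / ss_tot                 # coefficient of determination

    return {"mae": mae, "mse": mse, "r2": r2}


# Made-up values, not data from the thesis.
components = error_components([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Lower MAE and MSE indicate a better fit while a higher R^2 does, so any aggregate of the three has to reconcile those opposite directions.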

University of Waterloo's Institutional Repository

Deep Learning Based Approaches for Imputation of Time Series Models

Author: Saad Muhammad
Publication venue: 'University of Waterloo'
Publication date: 07/12/2020

Market price forecasting models for Fresh Produce (FP) are crucial to protect retailers and consumers from highly priced FP. However, using the data for forecasting is obstructed by missing values. It is therefore imperative to develop models that estimate those missing values and thereby enable effective forecasting. Usually this problem is tackled with conventional methods that introduce bias into the system, which in turn leads to unreliable forecasting results. Therefore, in this thesis, numerous imputation models are developed alongside a framework that enables the user to impute any time series data with the optimal models. This thesis also develops novel forecasting models, which are used as a gauging mechanism for each tested imputation model. However, those forecasting models can also be used as standalone models. The growth and success of deep learning has largely been attributed to the availability of big data and high-end computational power, along with theoretical advancement. In this thesis, multiple deep learning models are built for imputing the missing values and also for forecasting. The data used in building these deep learning models comprise California weather data, California strawberry yield, California strawberry farm-gate prices, USA corn yield data, daily Brent oil prices and a synthetic time series dataset. For imputation, mean squared error is used as a metric to gauge performance, whereas for forecasting a new aggregated error measure (AGM) is proposed in this thesis, which combines mean absolute error, mean squared error and R^2, the coefficient of determination. Different models are found to be optimal for different time series. These models are illustrated in the recommendation framework developed in the thesis. Different ensemble techniques, such as a voting regressor and a stacking ML ensemble, are then utilized to obtain better imputation results.
The experiments show that the voting regressor yields the best imputation results. To gauge the robustness of the imputation framework, different time series are assessed. The imputed data is used for forecasting, and the forecasting results are compared with deep and non-deep learning models available on the market. The results show that the best imputation models recommended on the basis of the synthesized datasets are in fact the best for the tested incomplete real datasets, with a Mean Absolute Scaled Error (MASE) < 1, i.e. better than the naive forecasting model. It is also found that the best imputation models have a greater impact on reducing forecasting errors than other deep or non-deep imputation models found in the literature and on the market.
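The MASE < 1 criterion mentioned above can be made concrete with a short sketch. It uses the standard non-seasonal definition of MASE, in which the forecast's mean absolute error is divided by the in-sample mean absolute error of the one-step naive forecast (each value predicted by the previous one). The function name and the sample numbers are assumptions for illustration, and the thesis may use a seasonal or otherwise different variant.

```python
def mean_absolute_scaled_error(actual, forecast, training):
    """Non-seasonal MASE: forecast MAE scaled by the naive in-sample MAE.

    Illustrative sketch only; 'training' is the history the forecast was
    built from, and the naive benchmark predicts each point by its predecessor.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal length")
    if len(training) < 2:
        raise ValueError("training series needs at least two points")

    # MAE of the forecast being evaluated.
    forecast_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

    # MAE of the one-step naive forecast over the training series.
    naive_mae = sum(
        abs(training[i] - training[i - 1]) for i in range(1, len(training))
    ) / (len(training) - 1)
    if naive_mae == 0:
        raise ValueError("MASE is undefined for a constant training series")

    return forecast_mae / naive_mae


# Made-up values, not data from the thesis.
training = [10.0, 12.0, 11.0, 13.0, 12.0]
score = mean_absolute_scaled_error([14.0, 13.0], [13.0, 13.5], training)
beats_naive = score < 1.0
```

A value below 1 means the forecast's average error is smaller than that of the naive benchmark on the training data, which is the sense in which the abstract calls the results "better than the naive forecasting model".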

University of Waterloo's Institutional Repository