2 research outputs found

    Estimating Soil Available Phosphorus Content through Coupled Wavelet–Data-Driven Models

    Soil phosphorus (P) is a vital but limited element that is usually leached from the soil through drainage. As a soluble substance, it can be transported from agricultural fields by runoff or soil loss. It is one of the most essential nutrients, affecting both crop sustainability and energy transfer in living organisms. Therefore, soil phosphorus, which is considered a point source pollutant at elevated contents, must be simulated accurately. Because effective soil phosphorus assessment is crucial for sustainable soil and water management, the current research examined the capability of five wavelet-based data-driven models in modeling soil P: gene expression programming (GEP), neural networks (NN), random forest (RF), multivariate adaptive regression spline (MARS), and support vector machine (SVM). To this end, soil pH, organic carbon (OC), clay content, and soil P data were collected from different regions of the Neyshabur plain, Khorasan-e-Razavi Province (Northeast Iran). First, a discrete wavelet transform (DWT) was applied to the pH, OC, and clay inputs, and their subcomponents were used in the data-driven techniques. The statistical Gamma test was also used to identify which soil parameters influence soil P. The methods were assessed through 10-fold cross-validation. Our results demonstrated that the wavelet–GEP (WGEP) model outperformed the other models with respect to several validation criteria: correlation coefficient (R), scatter index (SI), and Nash–Sutcliffe coefficient (NS).
Comparing SI values, the GEP model was more accurate than the MARS, RF, SVM, and NN models by 35%, 30%, 40%, and 50%, respectively. Comparing NS values, it outperformed the same models by 3.1%, 2.3%, 4.3%, and 7.6%, respectively.
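The three validation criteria named in the abstract (R, SI, NS) can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions below are assumptions based on common usage: R as the Pearson correlation, SI as RMSE divided by the mean of the observations, and NS as one minus the ratio of squared error to observed variance. The function names and the sample values are hypothetical and are not data from the study.

```python
import math

def correlation_coefficient(obs, sim):
    """Pearson correlation between observed and simulated values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)

def scatter_index(obs, sim):
    """Assumed definition: RMSE normalised by the mean observed value."""
    n = len(obs)
    rmse = math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / n)
    return rmse / (sum(obs) / n)

def nash_sutcliffe(obs, sim):
    """NS = 1 - (sum of squared errors) / (sum of squared deviations of obs)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

# Made-up soil P values, for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
simulated = [12.0, 12.0, 14.0, 14.0]

r = correlation_coefficient(observed, simulated)
si = scatter_index(observed, simulated)
ns = nash_sutcliffe(observed, simulated)
```

Under these definitions a lower SI and a higher R or NS indicate a better model; a perfect simulation gives R = 1, SI = 0, and NS = 1.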

    A Novel Stacked Long Short-Term Memory Approach of Deep Learning for Streamflow Simulation

    Rainfall-runoff simulation is the backbone of all hydrological and climate change studies. This study proposes a novel stochastic model for daily rainfall-runoff simulation called Stacked Long Short-Term Memory (SLSTM), which relies on machine learning. The SLSTM model uses only rainfall-runoff data and treats the hydrological system as a black box. Conversely, distributed, physically-based hydrological models such as SWAT (Soil and Water Assessment Tool) preserve the physical aspect of hydrological variables and their inter-relations while requiring a wide range of data. Each model type has specific applications, and modelers can choose between them according to their project specification and objectives. However, sparse distribution of point data may hinder physical models’ performance, which may not be the case for data-driven models. This study proposes a specific SLSTM model and investigates the dependency of the SLSTM and SWAT models on the spatial distribution of their data. The study was conducted in two distinct river basins, Samarahan and Trusan, Malaysia, with over 20 years of hydro-climate data. The Trusan basin’s rain gauges are scattered downstream of the basin outlet and Samarahan’s are located around the basin, with one station within each basin’s limits. SWAT was developed and calibrated following its general modelling approach, while SLSTM performance was also tested with data preprocessing by principal component analysis (PCA). Results showed that SWAT performance for daily streamflow simulation at Samarahan was superior to that at Trusan. Both the SLSTM and PCA-SLSTM models, however, showed better performance at Trusan, with PCA-SLSTM outperforming the SLSTM.
This demonstrates that the SWAT model is greatly affected by the spatial distribution of its input data, while data-driven models can perform well irrespective of the spatial distribution of their input data, provided the data adequacy condition is met. However, given the structural difference between the two models, each has its specific application in a water resources context. Studying catchments’ response to changes in the hydrological cycle requires a physically-based model like SWAT with proper spatial and temporal distribution of its input data. By contrast, studying a specific phenomenon without considering the underlying processes can be done with data-driven models like SLSTM, for which improper spatial distribution of data is not a restricting factor.
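To make the idea of "stacking" concrete, the following is a minimal sketch of a forward pass through stacked LSTM layers, where the sequence of hidden states from one layer becomes the input sequence of the next. It is not the authors' SLSTM: the abstract does not state the number of layers, hidden sizes, inputs, or training procedure, so the layer sizes, the random weights, the helper names, and the sample rainfall and flow values are all hypothetical, and no training is shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_hidden, rng, scale=0.1):
    """Random (untrained) weights for one LSTM layer with four gates."""
    def mat(rows, cols):
        return [[rng.uniform(-scale, scale) for _ in range(cols)]
                for _ in range(rows)]
    return {
        "W": mat(4 * n_hidden, n_in),      # input-to-gate weights
        "U": mat(4 * n_hidden, n_hidden),  # hidden-to-gate weights
        "b": [0.0] * (4 * n_hidden),       # gate biases
        "n_hidden": n_hidden,
    }

def lstm_step(layer, x, h, c):
    """One time step of a standard LSTM cell."""
    n = layer["n_hidden"]
    z = [
        sum(w * xi for w, xi in zip(layer["W"][r], x))
        + sum(u * hi for u, hi in zip(layer["U"][r], h))
        + layer["b"][r]
        for r in range(4 * n)
    ]
    i = [sigmoid(v) for v in z[0:n]]            # input gate
    f = [sigmoid(v) for v in z[n:2 * n]]        # forget gate
    g = [math.tanh(v) for v in z[2 * n:3 * n]]  # candidate cell state
    o = [sigmoid(v) for v in z[3 * n:4 * n]]    # output gate
    c_new = [f[k] * c[k] + i[k] * g[k] for k in range(n)]
    h_new = [o[k] * math.tanh(c_new[k]) for k in range(n)]
    return h_new, c_new

def stacked_lstm(layers, sequence):
    """Feed each layer's hidden-state sequence into the next layer."""
    seq = sequence
    for layer in layers:
        n = layer["n_hidden"]
        h, c = [0.0] * n, [0.0] * n
        out = []
        for x in seq:
            h, c = lstm_step(layer, x, h, c)
            out.append(h)
        seq = out
    return seq  # hidden states of the top layer, one per time step

# Hypothetical daily [rainfall, previous flow] pairs, not data from the study.
rng = random.Random(0)
layers = [init_layer(2, 4, rng), init_layer(4, 3, rng)]
rain_flow = [[0.0, 1.2], [5.0, 1.1], [12.0, 1.9], [3.0, 2.6]]
states = stacked_lstm(layers, rain_flow)
```

In a full model, the top-layer state at the last time step would typically be passed through a trained output layer to produce the simulated streamflow.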