538 research outputs found

    Hybrid machine learning approach for gully erosion mapping susceptibility at a watershed scale

    Gully erosion is a serious threat to ecosystems all around the world. Safeguarding the soil for our own benefit and from our own actions is therefore a must for guaranteeing the long-term viability of a variety of ecosystem services. As a result, developing gully erosion susceptibility maps (GESM) is both suggested and necessary. In this study, we compared the effectiveness of three hybrid machine learning (ML) algorithms that combine a classifier with the bivariate statistical index frequency ratio (FR), namely random forest-frequency ratio (RF-FR), support vector machine-frequency ratio (SVM-FR), and naïve Bayes-frequency ratio (NB-FR), in mapping gully erosion in the GHISS watershed in the northern part of Morocco. The models were implemented based on the inventory mapping of a total of 178 gully erosion points randomly divided into two groups (70% of points were used for training the models and 30% for validation), and 12 conditioning variables (i.e., elevation, slope, aspect, plane curvature, topographic wetness index (TWI), stream power index (SPI), precipitation, distance to road, distance to stream, drainage density, land use, and lithology). Using the equal interval reclassification method, the spatial distribution of gully erosion susceptibility was categorized into five classes: very high, high, moderate, low, and very low. Our results showed that the very high susceptibility classes derived using the RF-FR, SVM-FR, and NB-FR models covered 25.98%, 22.62%, and 27.10% of the total area, respectively. The area under the receiver operating characteristic curve (AUC), precision, and accuracy were employed to evaluate the performance of these models. Based on the receiver operating characteristic (ROC), the results showed that RF-FR achieved the best performance (AUC = 0.91), followed by SVM-FR (AUC = 0.87) and then NB-FR (AUC = 0.82).
Our contribution, in line with the Sustainable Development Goals (SDGs), plays a crucial role in understanding and identifying "where and why" gully erosion occurs, and hence it can serve as a first pathway to reducing gully erosion in this particular area.
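
The abstract names the frequency ratio and equal interval reclassification without giving formulas. The sketch below uses the definition of FR that is common in susceptibility mapping (share of gully points in a factor class divided by the share of the study area in that class) and a plain equal-width binning into the five susceptibility labels. The function names, the slope classes, and all numbers are illustrative assumptions, not values from the study.

```python
def frequency_ratio(gully_counts, area_cells):
    """FR per class = (share of gully points in class) / (share of area in class)."""
    total_gullies = sum(gully_counts.values())
    total_area = sum(area_cells.values())
    fr = {}
    for cls, cells in area_cells.items():
        point_share = gully_counts.get(cls, 0) / total_gullies
        area_share = cells / total_area
        fr[cls] = point_share / area_share
    return fr


def equal_interval_class(value, vmin, vmax, labels):
    """Assign value to one of len(labels) equal-width bins on [vmin, vmax].

    Assumes vmax > vmin; the maximum value falls into the last bin.
    """
    width = (vmax - vmin) / len(labels)
    index = min(int((value - vmin) / width), len(labels) - 1)
    return labels[index]


# Hypothetical slope classes (degrees) with made-up counts.
gully_counts = {"0-5": 10, "5-15": 60, "15-30": 30}
area_cells = {"0-5": 5000, "5-15": 3000, "15-30": 2000}
fr = frequency_ratio(gully_counts, area_cells)

LABELS = ["very low", "low", "moderate", "high", "very high"]
print(fr)
print(equal_interval_class(0.95, 0.0, 1.0, LABELS))
```

An FR above 1 marks a class where gullies are over-represented relative to its area. In a hybrid setup of the kind described, such FR values would typically replace the raw class codes as classifier inputs, though the abstract does not spell out that step.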

    Head-cut gully erosion susceptibility mapping in semi-arid region using machine learning methods: insight from the high atlas, Morocco

    Gully erosion has been identified in recent decades as a global threat to people and property. This problem also affects the socioeconomic stability of societies and therefore limits their sustainable development, as it impacts a resource that is nonrenewable on a human scale, namely, soil. The focus of this study is to evaluate the prediction performance of four machine learning (ML) models: Logistic Regression (LR), classification and regression tree (CART), Linear Discriminant Analysis (LDA), and k-Nearest Neighbors (kNN), which are novel approaches in gully erosion modeling research, particularly in semi-arid regions with a mountainous character. 204 samples of erosion areas and 204 samples of non-erosion areas were collected through field surveys and high-resolution satellite images, and 17 significant factors were considered. The dataset cells of samples (70% for training and 30% for testing) were randomly prepared to assess the robustness of the different models. The functional relevance between soil erosion and effective factors was computed using the ML models. The ML models were evaluated using different metrics, including accuracy and the kappa coefficient. kNN is the ideal model for this study. The value of the AUC from ROC considering the testing datasets of kNN is 0.93; the remaining models are associated with ideal AUC and are similar to kNN in terms of values. The AUC values from ROC of GLM, LDA, and CART for testing datasets are 0.90, 0.91, and 0.84, respectively. The values of accuracy considering the validation datasets of LDA, CART, kNN, and GLM are 0.85, 0.82, 0.89, and 0.84, respectively. The values of Kappa of LDA, CART, and GLM for testing datasets are 0.70, 0.65, and 0.68, respectively. ML models, in particular kNN, GLM, and LDA, have achieved outstanding results in terms of creating soil erosion susceptibility maps.
The maps created with the most reliable models could be a useful tool for sustainable management, watershed conservation and prevention of soil and water losses.
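
Accuracy and the kappa coefficient reported above can both be derived from a binary confusion matrix. The sketch below uses the standard definition of Cohen's kappa; the function name and the example counts are illustrative assumptions and do not come from the study.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa) for a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Made-up counts for a balanced test set of 120 cells.
acc, kappa = accuracy_and_kappa(tp=50, fp=10, fn=10, tn=50)
print(round(acc, 4), round(kappa, 4))
```

When the erosion and non-erosion classes are exactly balanced, chance agreement is 0.5 and kappa reduces to 2 × accuracy − 1. The reported pairs are close to that relation (for example, accuracy 0.85 and kappa 0.70 for LDA), which is what one would expect from a roughly balanced test split.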

    A novel ensemble artificial intelligence approach for gully erosion mapping in a semi-arid watershed (Iran)

    © 2019 by the authors. Licensee MDPI, Basel, Switzerland. In this study, we introduced a novel hybrid artificial intelligence approach of rotation forest (RF) as a meta/ensemble classifier based on alternating decision tree (ADTree) as a base classifier, called RF-ADTree, in order to spatially predict gully erosion at Klocheh watershed of Kurdistan province, Iran. A total of 915 gully erosion locations along with 22 gully conditioning factors were used to construct a database. Some soft computing benchmark models (SCBM) including the ADTree, the Support Vector Machine with two kernel functions, Polynomial and Radial Basis Function (SVM-Polynomial and SVM-RBF), the Logistic Regression (LR), and the Naïve Bayes Multinomial Updatable (NBMU) models were used for comparison with the designed model. Results indicated that 19 conditioning factors were effective, among which distance to river, geomorphology, land use, hydrological group, lithology and slope angle were the most remarkable factors for the gully modeling process. Additionally, the modeling results showed that the RF-ADTree ensemble model could significantly improve (area under the curve (AUC) = 0.906) the prediction accuracy of the ADTree model (AUC = 0.882). The newly proposed model also had the highest performance (AUC = 0.913) in comparison to the SVM-Polynomial model (AUC = 0.879), the SVM-RBF model (AUC = 0.867), the LR model (AUC = 0.75), the ADTree model (AUC = 0.861) and the NBMU model (AUC = 0.811).
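
The models above are compared by AUC. As a minimal sketch of what that number measures, the function below computes AUC as the probability that a randomly chosen gully location receives a higher score than a randomly chosen non-gully location, with ties counted as one half. The function name and the sample scores are illustrative assumptions; the abstract does not say how the authors computed AUC.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Made-up susceptibility scores: 1 = gully, 0 = non-gully.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for inventories of a few hundred points; a rank-based formulation would scale better.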

    Grapes Quality Prediction Using Iot & Machine Learning Based on Pre Harvesting

    Minimizing pesticide use, preserving water, and enhancing soil health are just a few of the sustainable farming techniques that must be carefully considered when growing grapes of a high calibre. These practices can help preserve the environment and ensure the longevity of the vineyard. However, it is difficult for farmers to determine the suitability of the soil and its environment for cultivating high-quality grapes. This research therefore aims to evaluate the fitness of the soil for growing quality grapes with the aid of machine learning algorithms. The research was done on the Nasik region, situated in Maharashtra, which is called the "Grape Capital of India". A total of 154 villages were considered for the examination, and soil specimens were collected and sent to the government testing lab in Maharashtra. The soil characteristics, considering both micro and macro nutrients, and the water characteristics were obtained from the lab. The climatic features, quality of the petiole and fruit characteristics were also included in creating the dataset. These data were given to six different machine learning algorithms to classify whether the soil is fit for grapes or not. Moreover, this research proposed to analyze the correlation between the nutrients, by which the relationship and dependency between the different nutrients and features were considered for defining grape quality. Both the micro and macro nutrients were given equal importance in defining the soil quality suitable for obtaining high-quality grapes. Based on the results obtained from samples gathered from different vineyards, Pimpalas Ramche contains more nutrients for the grapes to grow successfully, and the decision tree classifier scores better in terms of accuracy than the other machine learning algorithms employed.

    Gis-based gully erosion susceptibility mapping: a comparison of computational ensemble data mining models

    Gully erosion destroys agricultural and domestic grazing land in many countries, especially those with arid and semi-arid climates and easily eroded rocks and soils. It also generates large amounts of sediment that can adversely impact downstream river channels. The main objective of this research is to accurately detect and predict areas prone to gully erosion. In this paper, we couple hybrid models of a commonly used base classifier (reduced error pruning tree, REPTree) with AdaBoost (AB), bagging (Bag), and random subspace (RS) algorithms to create gully erosion susceptibility maps for a sub-basin of the Shoor River watershed in northwestern Iran. We compare the performance of these models in terms of their ability to predict gully erosion and discuss their potential use in other arid and semi-arid areas. Our database comprises 242 gully erosion locations, which we randomly divided into training and testing sets with a ratio of 70/30. Based on expert knowledge and analysis of aerial photographs and satellite images, we selected 12 conditioning factors for gully erosion. We used multi-collinearity statistical techniques in the modeling process, and checked model performance using statistical indexes including precision, recall, F-measure, Matthews correlation coefficient (MCC), receiver operating characteristic curve (ROC), precision-recall graph (PRC), Kappa, root mean square error (RMSE), relative absolute error (PRSE), mean absolute error (MAE), and relative absolute error (RAE). Results show that rainfall, elevation, and river density are the most important factors for gully erosion susceptibility mapping in the study area. All three hybrid models that we tested significantly improved the predictive power of REPTree (AUC = 0.800), but the RS-REPTree (AUC = 0.860) ensemble model outperformed the Bag-REPTree (AUC = 0.841) and the AB-REPTree (AUC = 0.805) models.
We suggest that decision makers, planners, and environmental engineers employ the RS-REPTree hybrid model to better manage gully erosion-prone areas in Iran.
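
Several of the indexes listed above (precision, recall, F-measure, and MCC) follow directly from the four cells of a binary confusion matrix. The sketch below uses their standard definitions; the function name and the counts are illustrative assumptions, and it presumes that at least one sample is predicted positive and at least one is truly positive.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure and MCC from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Made-up counts for a 72-sample test set.
metrics = classification_metrics(tp=30, fp=6, fn=6, tn=30)
print(metrics)
```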

    GIS-based landslide susceptibility modeling using data mining techniques

    Introduction: Landslides are one of the most widespread geohazards around the world. Therefore, it is necessary and meaningful to map regional landslide susceptibility for landslide mitigation. In this research, landslide susceptibility maps were produced by four models, namely, certainty factors (CF), naive Bayes (NB), J48 decision tree (J48), and multilayer perceptron (MLP) models. Methods: In the first step, 328 landslides were identified via historical data, interpretation of remote sensing images, and field investigation, and they were divided into two subsets that were assigned different uses: a 70% subset for training and a 30% subset for validating. Then, twelve conditioning factors were employed, namely, altitude, slope angle, slope aspect, plan curvature, profile curvature, TWI, NDVI, distance to rivers, distance to roads, land use, soil, and lithology. Later, the importance of each conditioning factor was analyzed by average merit (AM) values, and the relationship between landslide occurrence and various factors was evaluated using the certainty factor (CF) approach. In the next step, the landslide susceptibility maps were produced based on the four models, and the performance of the four models was quantitatively compared by receiver operating characteristic (ROC) curves, area under curve (AUC) values, and non-parametric tests. Results: The results demonstrated that all four models can reasonably assess landslide susceptibility. Of these four models, the CF model has the best predictive performance for the training (AUC = 0.901) and validating data (AUC = 0.892). Discussion: The proposed approach is an innovative method that may also help other scientists to develop landslide susceptibility maps in other areas and that could be used for geo-environmental problems besides natural hazard assessments.
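
The abstract does not state the certainty factor formula. The sketch below uses the piecewise form that is common in the landslide susceptibility literature, where `pp_a` is the conditional probability of a landslide within a factor class and `pp_s` is the prior probability over the whole study area. Both parameter names and the example values are illustrative assumptions, and the function presumes 0 < `pp_s` < 1.

```python
def certainty_factor(pp_a, pp_s):
    """Certainty factor in [-1, 1] for one class of a conditioning factor.

    pp_a: landslide probability inside the class (landslide cells / class cells).
    pp_s: prior landslide probability over the study area (assumed 0 < pp_s < 1).
    """
    if pp_a >= pp_s:
        # Class is at least as landslide-prone as the area average.
        return (pp_a - pp_s) / (pp_a * (1 - pp_s))
    # Class is less landslide-prone than the area average.
    return (pp_a - pp_s) / (pp_s * (1 - pp_a))


# Made-up probabilities for three classes against a prior of 0.1.
print(certainty_factor(0.2, 0.1))
print(certainty_factor(0.05, 0.1))
print(certainty_factor(0.1, 0.1))
```

Positive values indicate classes that favor landslide occurrence, negative values indicate classes that disfavor it, and values near zero carry little information either way.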

    A Novel Hybrid Machine Learning-Based Model for Rockfall Source Identification in Presence of Other Landslide Types Using LiDAR and GIS

    © 2019, King Abdulaziz University and Springer Nature Switzerland AG. Abstract: Rockfall is a common phenomenon in mountainous and hilly areas worldwide, including Malaysia. Rockfall source identification is a challenging task in rockfall hazard assessment. The difficulty rises when the area of interest has other landslide types with nearly similar controlling factors. Therefore, this research presented and assessed a hybrid model for rockfall source identification based on the stacking ensemble model of random forest (RF), artificial neural network, Naive Bayes (NB), and logistic regression in addition to Gaussian mixture model (GMM) using high-resolution airborne laser scanning data (LiDAR). GMM was adopted to automatically compute the thresholds of slope angle for various landslide types. Chi square was utilised to rank and select the conditioning factors for each landslide type. The best fit ensemble model (RF–NB) was then used to produce probability maps, which were used to conduct rockfall source identification in combination with the reclassified slope raster based on the thresholds obtained by the GMM. Next, landslide potential area was structured to reduce the sensitivity and the noise of the model to the variations in different conditioning factors for improving its computation performance. The accuracy assessment of the developed model indicates that the model can efficiently identify probable rockfall sources with receiver operating characteristic curve accuracies of 0.945 and 0.923 on validation and training datasets, respectively. In general, the proposed hybrid model is an effective model for rockfall source identification in the presence of other landslide types with a reasonable generalisation performance. Graphic Abstract: [Figure not available: see fulltext.]

    Conditioning factor determination for mapping and prediction of landslide susceptibility using machine learning algorithms

    © 2019 SPIE. Landslides are a type of natural geohazard that interferes with many economic and social activities and causes serious harm to human life. They are ranked as a great disaster, threatening life, property and environment. Therefore, early prediction of landslide-prone areas is vital. A variety of causative factors such as glacier melting, excessive rain, mining, volcanic activity, active faults, earthquakes, logging, erosion, urbanization, construction, and other human activities can trigger landslide occurrence. Identification of the factors that directly influence slide events is therefore in high demand. Some topographical, geological, and hydrological datasets (e.g., slope, aspect, geology, terrain roughness, vegetation index, distance to stream, distance to road, distance to fault, land use, precipitation, profile curvature, plan curvature) are considered to be effective conditioning factors. However, the importance of each factor differs from one study to another. This study investigates the effectiveness of four sets of landslide conditioning variables. Fourteen landslide conditioning variables were considered in this study and were divided into four groups: G1, G2, G3, and G4. Three machine learning algorithms, namely Random Forest (RF), Naive Bayes (NB), and Boosted Logistic Regression (LogitBoost), were constructed based on each dataset in order to determine which set would be more suitable for landslide susceptibility prediction. In total, 227 landslide inventory datasets of the study area were used, of which 70% were used for training and 30% for testing.
To this end, in the present research, the two main objectives were: 1) To investigate the effectiveness of 14 landslide conditioning factors (altitude, slope, aspect, total curvature, profile curvature, plan curvature, Stream Power Index (SPI), Topographic Wetness Index (TWI), Terrain Roughness Index (TRI), distance to fault, distance to road, distance to stream, land use, and geology) by analyzing and determining the most important factors using the variance inflation factor (VIF), Pearson's correlation and Chi-square techniques. Consequently, 4 categories of datasets were defined: the first dataset included all 14 conditioning factors, the second dataset included Digital Elevation Model (DEM) derivatives (morphometric factors), the third dataset was based only on 5 factors, namely lithology, land use, distance to stream, distance to road, and distance to fault, and the last dataset included 8 factors selected using factor analysis and optimization. 2) To evaluate the sensitivity of each modeling technique (NB, RF and LogitBoost) to different conditioning factors using the area under curve (AUC). Eventually, the RF technique using optimized variables (G4) performed well with an AUC of 0.940, followed by LogitBoost (0.898) and NB (0.864).
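
Pearson's correlation and the variance inflation factor are both used above to screen conditioning factors for redundancy. In general, the VIF of a factor is 1 / (1 − R²), where R² comes from regressing that factor on all the others. The sketch below covers only the simplest case of two factors, where R² equals the squared Pearson correlation. The function names and sample values are illustrative assumptions, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def vif_two_factors(x, y):
    """VIF when only two factors are present: 1 / (1 - r^2).

    Assumes the factors are not perfectly correlated (|r| < 1).
    """
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


# Made-up values for two hypothetical factors sampled at four cells.
factor_a = [1, 2, 3, 4]
factor_b = [2, 1, 4, 3]
print(pearson_r(factor_a, factor_b))
print(vif_two_factors(factor_a, factor_b))
```

A VIF near 1 indicates little collinearity. The cut-off for dropping a factor varies between studies, and the abstract does not state which threshold was applied.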