1,016 research outputs found

    GIS-Based landslide susceptibility modeling: a comparison between best-first decision tree and its two ensembles (BagBFT and RFBFT)

    Get PDF
    This study aimed to explore and compare the application of current state-of-the-art machine learning techniques, including bagging (Bag) and rotation forest (RF), to assess landslide susceptibility with the base classifier best-first decision tree (BFT). The proposed two novel ensemble frameworks, BagBFT and RFBFT, and the base model BFT, were used to model landslide susceptibility in Zhashui County (China), which suffers from landslides. Firstly, we identified 169 landslides through field surveys and image interpretation. Then, a landslide inventory map was built. These 169 historical landslides were randomly classified into two groups: 70% for training data and 30% for validation data. Then, 15 landslide conditioning factors were considered for mapping landslide susceptibility. The three ensemble outputs were estimated with a receiver operating characteristic (ROC) curve and statistical tests, as well as a new approach, the improved frequency ratio accuracy. The areas under the ROC curve (AUCs) for the training data (success rate) of the three algorithms were 0.722 for BFT, 0.869 for BagBFT, and 0.895 for RFBFT. The AUCs for the validating groups (prediction rates) were 0.718, 0.834, and 0.872, respectively. The frequency ratio accuracy of the three models was 0.76163 for the BFT model, 0.92220 for the BagBFT model, and 0.92224 for the RFBFT model. Both BagBFT and RFBFT ensembles can improve the accuracy of the BFT base model, and RFBFT was relatively better. Therefore, the RFBFT model is the most effective approach for the accurate modeling of landslide susceptibility mapping (LSM). All three models can improve the identification of landslide-prone areas, enhance risk management ability, and afford more detailed information for land-use planning and policy setting.National Natural Science Foundation of China | Ref. 41977228Key Research Program of Shaanxi | Ref. 2022SF-33

    Model identification and accuracy for estimation of suspended sediment load

    Get PDF
    In the present study, three widely used modeling approaches: (1) sediment rating curve (SRC) and optimized OSRC, (2) machine learning models (ML) (random forest (RF) and Dagging-RF (DA-RF)) and (3) the semi-physically based soil and water assessment tool (SWAT) are applied to predict suspended sediment load (Qs) at the Talar watershed in Iran. Various graphical and quantitative methods were used to evaluate the goodness of fit. Results indicated that the RF model had the best prediction power in the training phase, while the dagging-RF hybrid algorithm outperformed all other models in the validation phase. The OSRC, RF and DA-RF had ‘very good’ performances based on the NSE in the validation phase, SRC showed ‘good’ performance, while the predicted values using SWAT were ‘satisfactory’. Our results suggest that the OSRC and ML models are more suitable for prediction of Qs in study catchments with poor data availability.</p

    Sensitivity analysis and parameter estimation for distributed hydrological modeling: potential of variational methods

    Get PDF
    Variational methods are widely used for the analysis and control of computationally intensive spatially distributed systems. In particular, the adjoint state method enables a very efficient calculation of the derivatives of an objective function (response function to be analysed or cost function to be optimised) with respect to model inputs. In this contribution, it is shown that the potential of variational methods for distributed catchment scale hydrology should be considered. A distributed flash flood model, coupling kinematic wave overland flow and Green Ampt infiltration, is applied to a small catchment of the Thoré basin and used as a relatively simple (synthetic observations) but didactic application case. It is shown that forward and adjoint sensitivity analysis provide a local but extensive insight on the relation between the assigned model parameters and the simulated hydrological response. Spatially distributed parameter sensitivities can be obtained for a very modest calculation effort (~6 times the computing time of a single model run) and the singular value decomposition (SVD) of the Jacobian matrix provides an interesting perspective for the analysis of the rainfall-runoff relation. For the estimation of model parameters, adjoint-based derivatives were found exceedingly efficient in driving a bound-constrained quasi-Newton algorithm. The reference parameter set is retrieved independently from the optimization initial condition when the very common dimension reduction strategy (i.e. scalar multipliers) is adopted. Furthermore, the sensitivity analysis results suggest that most of the variability in this high-dimensional parameter space can be captured with a few orthogonal directions. A parametrization based on the SVD leading singular vectors was found very promising but should be combined with another regularization strategy in order to prevent overfitting

    Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification

    Full text link
    Semi-supervised learning (SSL) is a common approach to learning predictive models using not only labeled examples, but also unlabeled examples. While SSL for the simple tasks of classification and regression has received a lot of attention from the research community, this is not properly investigated for complex prediction tasks with structurally dependent variables. This is the case of multi-label classification and hierarchical multi-label classification tasks, which may require additional information, possibly coming from the underlying distribution in the descriptive space provided by unlabeled examples, to better face the challenging task of predicting simultaneously multiple class labels. In this paper, we investigate this aspect and propose a (hierarchical) multi-label classification method based on semi-supervised learning of predictive clustering trees. We also extend the method towards ensemble learning and propose a method based on the random forest approach. Extensive experimental evaluation conducted on 23 datasets shows significant advantages of the proposed method and its extension with respect to their supervised counterparts. Moreover, the method preserves interpretability and reduces the time complexity of classical tree-based models
    corecore