GIS-Based landslide susceptibility modeling: a comparison between best-first decision tree and its two ensembles (BagBFT and RFBFT)
This study explored and compared the application of state-of-the-art machine learning techniques, namely bagging (Bag) and rotation forest (RF), to landslide susceptibility assessment, using the best-first decision tree (BFT) as the base classifier. The two proposed ensemble frameworks, BagBFT and RFBFT, and the base model BFT were used to model landslide susceptibility in Zhashui County (China), which suffers from landslides. First, we identified 169 landslides through field surveys and image interpretation and built a landslide inventory map. These 169 historical landslides were randomly split into two groups: 70% for training and 30% for validation. Then, 15 landslide conditioning factors were considered for mapping landslide susceptibility. The outputs of the three models were evaluated with the receiver operating characteristic (ROC) curve and statistical tests, as well as a new approach, the improved frequency ratio accuracy. The areas under the ROC curve (AUCs) for the training data (success rates) were 0.722 for BFT, 0.869 for BagBFT, and 0.895 for RFBFT. The AUCs for the validation data (prediction rates) were 0.718, 0.834, and 0.872, respectively. The frequency ratio accuracy was 0.76163 for BFT, 0.92220 for BagBFT, and 0.92224 for RFBFT. Both ensembles improved on the accuracy of the BFT base model, with RFBFT performing slightly better. The RFBFT model is therefore the most effective of the three for landslide susceptibility mapping (LSM). All three models can improve the identification of landslide-prone areas, enhance risk management ability, and afford more detailed information for land-use planning and policy setting.
Funding: National Natural Science Foundation of China (Ref. 41977228); Key Research Program of Shaanxi (Ref. 2022SF-33).
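The evaluation protocol in this abstract (a random 70/30 split of the inventory, then an AUC per model) can be illustrated with a minimal Python sketch. The function names, the fixed seed, and the toy scores are assumptions made for illustration; the abstract does not say how the authors implemented the split or the ROC analysis.

```python
import random


def split_inventory(items, train_fraction=0.7, seed=0):
    """Randomly split an inventory into training and validation groups."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive scores higher than a randomly chosen negative
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# 169 landslide identifiers, split 70% / 30% as in the abstract.
train, valid = split_inventory(range(169))

# Toy susceptibility scores (hypothetical): 1 = landslide, 0 = non-landslide.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Applied to the training group, the AUC corresponds to what the abstract calls the success rate; applied to the validation group, it corresponds to the prediction rate.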
Model identification and accuracy for estimation of suspended sediment load
In the present study, three widely used modeling approaches are applied to predict suspended sediment load (Qs) at the Talar watershed in Iran: (1) the sediment rating curve (SRC) and its optimized version (OSRC), (2) machine learning (ML) models (random forest (RF) and Dagging-RF (DA-RF)), and (3) the semi-physically based Soil and Water Assessment Tool (SWAT). Various graphical and quantitative methods were used to evaluate the goodness of fit. Results indicated that the RF model had the best predictive power in the training phase, while the DA-RF hybrid algorithm outperformed all other models in the validation phase. Based on the NSE in the validation phase, the OSRC, RF and DA-RF had ‘very good’ performance, SRC showed ‘good’ performance, and the values predicted by SWAT were ‘satisfactory’. Our results suggest that the OSRC and ML models are more suitable for predicting Qs in catchments with poor data availability.
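Two of the ingredients named in this abstract can be sketched in a few lines of Python: a sediment rating curve and the Nash-Sutcliffe efficiency (NSE). The sketch assumes the common power-law form Qs = a * Q^b fitted by least squares in log-log space, which the abstract does not state explicitly, and the discharge and load values are synthetic.

```python
import math


def fit_rating_curve(discharge, sediment_load):
    """Fit Qs = a * Q**b by ordinary least squares on log-transformed data.
    The power-law form is an assumption, not taken from the abstract."""
    x = [math.log(q) for q in discharge]
    y = [math.log(qs) for qs in sediment_load]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than
    predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


# Synthetic data (hypothetical) generated from Qs = 2 * Q**1.5.
discharge = [1.0, 2.0, 4.0, 8.0]
observed_load = [2.0 * q ** 1.5 for q in discharge]

a, b = fit_rating_curve(discharge, observed_load)
predicted_load = [a * q ** b for q in discharge]
score = nse(observed_load, predicted_load)
```

The same `nse` function can score any of the models compared in the abstract, since it only needs paired observed and simulated values.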
Sensitivity analysis and parameter estimation for distributed hydrological modeling: potential of variational methods
Variational methods are widely used for the analysis and control of computationally intensive, spatially distributed systems. In particular, the adjoint state method enables a very efficient calculation of the derivatives of an objective function (a response function to be analysed or a cost function to be optimised) with respect to model inputs. This contribution shows that variational methods deserve consideration for distributed catchment-scale hydrology. A distributed flash flood model, coupling kinematic wave overland flow and Green-Ampt infiltration, is applied to a small catchment of the Thoré basin and used as a relatively simple (synthetic observations) but didactic application case. It is shown that forward and adjoint sensitivity analyses provide a local but extensive insight into the relation between the assigned model parameters and the simulated hydrological response. Spatially distributed parameter sensitivities can be obtained for a very modest calculation effort (~6 times the computing time of a single model run), and the singular value decomposition (SVD) of the Jacobian matrix provides an interesting perspective for the analysis of the rainfall-runoff relation. For the estimation of model parameters, adjoint-based derivatives were found to be highly efficient in driving a bound-constrained quasi-Newton algorithm. The reference parameter set is retrieved regardless of the initial condition of the optimization when the very common dimension reduction strategy (i.e. scalar multipliers) is adopted. Furthermore, the sensitivity analysis results suggest that most of the variability in this high-dimensional parameter space can be captured with a few orthogonal directions. A parametrization based on the leading singular vectors of the SVD was found to be very promising but should be combined with another regularization strategy in order to prevent overfitting.
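The efficiency argument for the adjoint state method can be illustrated on a toy problem. The Python sketch below uses a single linear reservoir, s[t+1] = (1 - k) * s[t] + rain[t], with a least-squares cost on the simulated discharge k * s[t]. This model, its parameter values, and all names are assumptions chosen for brevity; it is not the kinematic wave and Green-Ampt model of the paper. One backward sweep returns the derivative of the cost with respect to k and to every rainfall input.

```python
def run_model(k, rain):
    """Forward run of a toy linear reservoir; returns states s[0..T]."""
    s = 0.0
    states = [s]
    for r in rain:
        s = (1.0 - k) * s + r
        states.append(s)
    return states


def cost(k, rain, obs):
    """Half the sum of squared errors between discharge k*s[t] and obs."""
    states = run_model(k, rain)
    return 0.5 * sum((k * s - o) ** 2 for s, o in zip(states[1:], obs))


def adjoint_gradient(k, rain, obs):
    """Gradient of the cost w.r.t. k and each rain[t] from one backward sweep."""
    states = run_model(k, rain)
    steps = len(rain)
    lam = 0.0  # adjoint variable beyond the final step
    grad_k = 0.0
    grad_rain = [0.0] * steps
    for t in range(steps, 0, -1):
        err = k * states[t] - obs[t - 1]
        # Adjoint recursion: total sensitivity of the cost to state s[t].
        lam = k * err + (1.0 - k) * lam
        # Direct dependence of discharge on k, plus the effect of k on s[t].
        grad_k += err * states[t] - lam * states[t - 1]
        # rain[t-1] enters s[t] with unit coefficient.
        grad_rain[t - 1] = lam
    return grad_k, grad_rain


# Hypothetical inputs and synthetic observations.
k = 0.3
rain = [1.0, 0.5, 0.0, 2.0]
obs = [0.2, 0.3, 0.1, 0.5]
grad_k, grad_rain = adjoint_gradient(k, rain, obs)
```

A finite-difference check would need one extra model run per input, whereas the backward sweep costs roughly one additional run regardless of the number of inputs, which is the property the abstract relies on for spatially distributed parameters.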
Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification
Semi-supervised learning (SSL) is a common approach to learning predictive
models using not only labeled examples, but also unlabeled examples. While SSL
for the simple tasks of classification and regression has received a lot of
attention from the research community, it has not been properly investigated for
complex prediction tasks with structurally dependent variables. This is the
case for multi-label classification and hierarchical multi-label classification
tasks, which may require additional information, possibly coming from the
underlying distribution in the descriptive space provided by unlabeled
examples, to better address the challenging task of simultaneously predicting
multiple class labels.
In this paper, we investigate this aspect and propose a (hierarchical)
multi-label classification method based on semi-supervised learning of
predictive clustering trees. We also extend the method towards ensemble
learning and propose a method based on the random forest approach. Extensive
experimental evaluation conducted on 23 datasets shows significant advantages
of the proposed method and its extension with respect to their supervised
counterparts. Moreover, the method preserves interpretability and reduces the
time complexity of classical tree-based models.
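One way unlabeled examples can inform a tree split is sketched below in Python. It scores a candidate split by the reduction of a mixed impurity: label variance over the labeled rows plus feature variance over all rows, weighted by a parameter `w`. The weighting scheme, the function names, and the toy data are assumptions made for illustration and are not claimed to match the paper's heuristic in detail.

```python
def variance(values):
    """Population variance; zero for an empty list."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def impurity(rows, w):
    """Mixed impurity of rows given as (features, labels_or_None).
    w weights the label space, 1 - w the descriptive (feature) space."""
    labeled = [y for _, y in rows if y is not None]
    target = 0.0
    if labeled:
        n_labels = len(labeled[0])
        target = sum(variance([y[j] for y in labeled])
                     for j in range(n_labels)) / n_labels
    n_features = len(rows[0][0])
    descriptive = sum(variance([x[j] for x, _ in rows])
                      for j in range(n_features)) / n_features
    return w * target + (1.0 - w) * descriptive


def split_score(rows, feature, threshold, w=0.5):
    """Impurity reduction obtained by splitting on feature <= threshold."""
    left = [r for r in rows if r[0][feature] <= threshold]
    right = [r for r in rows if r[0][feature] > threshold]
    if not left or not right:
        return 0.0
    n = len(rows)
    return (impurity(rows, w)
            - len(left) / n * impurity(left, w)
            - len(right) / n * impurity(right, w))


# Toy data (hypothetical): two clusters, one labeled example in each.
rows = [
    ([0.0], (1, 0)), ([0.1], None), ([0.2], None),
    ([1.0], (0, 1)), ([1.1], None), ([1.2], None),
]
between_clusters = split_score(rows, 0, 0.5)
inside_cluster = split_score(rows, 0, 0.05)
```

With `w=1.0` only the two labeled rows count and both thresholds score the same, while with `w=0.5` the unlabeled rows favor the threshold that separates the two clusters.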