    evtree: Evolutionary Learning of Globally Optimal Classification and Regression Trees in R

    Commonly used classification and regression tree methods like the CART algorithm are recursive partitioning methods that build the model in a forward stepwise search. Although this approach is known to be an efficient heuristic, the results of recursive tree methods are only locally optimal, as splits are chosen to maximize homogeneity at the next step only. An alternative way to search over the parameter space of trees is to use global optimization methods like evolutionary algorithms. This paper describes the "evtree" package, which implements an evolutionary algorithm for learning globally optimal classification and regression trees in R. Computationally intensive tasks are fully computed in C++ while the "partykit" (Hothorn and Zeileis 2011) package is leveraged for representing the resulting trees in R, providing unified infrastructure for summaries, visualizations, and predictions. "evtree" is compared to "rpart" (Therneau and Atkinson 1997), the open-source CART implementation, and conditional inference trees ("ctree", Hothorn, Hornik, and Zeileis 2006). The usefulness of "evtree" is illustrated in a textbook customer classification task and a benchmark study of predictive accuracy in which "evtree" achieved at least similar and most of the time better results compared to the recursive algorithms "rpart" and "ctree".

    Keywords: machine learning, classification trees, regression trees, evolutionary algorithms, R
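
The difference between a locally optimal and a globally optimal tree can be illustrated with a small sketch. The data set below is made up for this illustration, and the global search is a brute-force enumeration of all depth-2 trees, used here as a simple stand-in; it is not an evolutionary algorithm and not the code of "evtree", "rpart", or "ctree". All function names are hypothetical.

```python
from itertools import product

# Hypothetical toy data: the label is x1 XOR x2, while x3 merely agrees
# with the label in 6 of the 8 rows (a tempting but misleading feature).
X = [(0, 0, 0), (0, 0, 0), (0, 1, 1), (0, 1, 1),
     (1, 0, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
y = [0, 0, 1, 1, 1, 1, 0, 0]
N_FEATURES = 3


def leaf_errors(labels):
    """Misclassifications when a leaf predicts its majority class."""
    if not labels:
        return 0
    return len(labels) - max(labels.count(0), labels.count(1))


def split(rows, feature):
    """Partition row indices by the value (0 or 1) of a binary feature."""
    left = [i for i in rows if X[i][feature] == 0]
    right = [i for i in rows if X[i][feature] == 1]
    return left, right


def stump_errors(rows, feature):
    """Errors of a single split on `feature` applied to `rows`."""
    left, right = split(rows, feature)
    return (leaf_errors([y[i] for i in left])
            + leaf_errors([y[i] for i in right]))


def best_feature(rows):
    """Greedy choice: the feature with the fewest errors one step ahead."""
    return min(range(N_FEATURES), key=lambda f: stump_errors(rows, f))


def tree_errors(root, left_feature, right_feature):
    """Training errors of a depth-2 tree given its three split features."""
    left, right = split(range(len(X)), root)
    return stump_errors(left, left_feature) + stump_errors(right, right_feature)


def greedy_depth2():
    """Forward stepwise search: fix the root first, then each child."""
    rows = list(range(len(X)))
    root = best_feature(rows)
    left, right = split(rows, root)
    tree = (root, best_feature(left), best_feature(right))
    return tree, tree_errors(*tree)


def global_depth2():
    """Brute-force search over every depth-2 tree (stand-in for a global search)."""
    tree = min(product(range(N_FEATURES), repeat=3),
               key=lambda t: tree_errors(*t))
    return tree, tree_errors(*tree)


if __name__ == "__main__":
    print("greedy:", greedy_depth2())
    print("global:", global_depth2())
```

On this toy data the greedy search splits on x3 at the root and is left with 2 training errors, whereas the enumeration finds a tree that splits on x1 and then x2 with 0 errors. Enumeration is only feasible for tiny problems, which is the motivation for heuristic global optimizers such as the evolutionary algorithm the abstract describes.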

    Incremental Measurement of Structural Entropy for Dynamic Graphs

    Structural entropy is a metric that measures the amount of information embedded in graph-structured data under a hierarchical abstraction strategy. To measure the structural entropy of a dynamic graph, we need to decode the optimal encoding tree corresponding to the optimal hierarchical community partitioning of the graph. However, current structural entropy methods do not support efficient incremental updating of encoding trees. To address this issue, we propose Incre-2dSE, a novel incremental measurement framework that dynamically adjusts the community partitioning and efficiently computes the updated structural entropy for each snapshot of a dynamic graph. Incre-2dSE consists of an online module and an offline module. The online module includes dynamic measurement algorithms based on two dynamic adjustment strategies for two-dimensional encoding trees, i.e., the naive adjustment strategy and the node-shifting adjustment strategy, which support theoretical analysis of the updated structural entropy and incrementally adjust the community partitioning towards a lower structural entropy. In contrast, the offline module globally constructs the encoding tree for the updated graph using static community detection methods and calculates the structural entropy by definition. We conduct experiments on an artificial dynamic graph dataset generated by a Hawkes process and on three real-world datasets. Experimental results confirm that our dynamic measurement algorithms effectively capture the dynamic evolution of the communities, reduce time consumption, and provide great interpretability.
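
As a rough illustration of computing structural entropy "by definition" for a fixed community partitioning, the sketch below evaluates a two-dimensional structural entropy on a small undirected, unweighted graph. The formula is an assumption, since the abstract does not state it: each node contributes -(d_i / 2m) log2(d_i / V_j) and each community contributes -(g_j / 2m) log2(V_j / 2m), where d_i is the node degree, V_j the community volume, g_j the number of edges leaving the community, and 2m the total degree. This is not the Incre-2dSE algorithm, and the function names and example graph are hypothetical.

```python
import math
from collections import defaultdict


def structural_entropy_2d(edges, community):
    """Two-dimensional structural entropy of a partition (assumed formula).

    edges:     iterable of (u, v) pairs of an undirected, unweighted graph
    community: dict mapping each node to its community label
    """
    degree = defaultdict(int)   # d_i
    cut = defaultdict(int)      # g_j: edges with exactly one endpoint in j
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] != community[v]:
            cut[community[u]] += 1
            cut[community[v]] += 1

    volume = defaultdict(int)   # V_j: sum of degrees inside community j
    for node, d in degree.items():
        volume[community[node]] += d

    total = sum(degree.values())  # 2m
    entropy = 0.0
    # Node-level term: locating a node inside its community.
    for node, d in degree.items():
        entropy -= d / total * math.log2(d / volume[community[node]])
    # Community-level term: locating the community, weighted by its cut.
    for label, vol in volume.items():
        entropy -= cut[label] / total * math.log2(vol / total)
    return entropy


def structural_entropy_1d(edges):
    """One-dimensional structural entropy: entropy of the degree distribution."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total = sum(degree.values())
    return -sum(d / total * math.log2(d / total) for d in degree.values())


if __name__ == "__main__":
    # Hypothetical example: two triangles joined by one bridge edge.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    one_block = {n: "all" for n in range(6)}
    print(structural_entropy_2d(edges, by_triangle))
    print(structural_entropy_2d(edges, one_block))
```

Under this formula, the partition that follows the two triangles yields a lower value than putting every node into a single community, and the single-community value coincides with the one-dimensional entropy. An incremental method as described in the abstract would aim to update such a value after edge changes without recomputing every term.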