evtree: Evolutionary Learning of Globally Optimal Classification and Regression Trees in R
Commonly used classification and regression tree methods like the CART algorithm are recursive partitioning methods that build the model in a forward stepwise search. Although this approach is known to be an efficient heuristic, the results of recursive tree methods are only locally optimal, as splits are chosen to maximize homogeneity at the next step only. An alternative way to search over the parameter space of trees is to use global optimization methods like evolutionary algorithms. This paper describes the "evtree" package, which implements an evolutionary algorithm for learning globally optimal classification and regression trees in R. Computationally intensive tasks are fully computed in C++ while the "partykit" (Hothorn and Zeileis 2011) package is leveraged for representing the resulting trees in R, providing unified infrastructure for summaries, visualizations, and predictions. "evtree" is compared to "rpart" (Therneau and Atkinson 1997), the open-source CART implementation, and conditional inference trees ("ctree", Hothorn, Hornik, and Zeileis 2006). The usefulness of "evtree" is illustrated in a textbook customer classification task and a benchmark study of predictive accuracy in which "evtree" achieved at least similar and most of the time better results compared to the recursive algorithms "rpart" and "ctree".
Keywords: machine learning, classification trees, regression trees, evolutionary algorithms, R
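The contrast between stepwise and global search can be illustrated with a minimal sketch. It is written in Python rather than R, and it is not the "evtree" algorithm: it is a simple (1+1) evolutionary search over a fixed depth-2 tree that mutates all three splits jointly and keeps a candidate only if the misclassification count does not get worse. The names `leaf_id`, `errors`, `mutate`, and `evolve`, the XOR-style toy data, and the assumption that features lie in [0, 1] are all hypothetical choices made for illustration.

```python
import random
from collections import Counter

def leaf_id(tree, x):
    """tree = (root, left, right); each node is (feature_index, threshold)."""
    root, left, right = tree
    go_right = x[root[0]] > root[1]
    child = right if go_right else left
    return 2 * int(go_right) + int(x[child[0]] > child[1])

def errors(tree, X, y):
    """Misclassified points when every leaf predicts its majority class."""
    counts = {}
    for xi, yi in zip(X, y):
        counts.setdefault(leaf_id(tree, xi), Counter())[yi] += 1
    return sum(sum(c.values()) - max(c.values()) for c in counts.values())

def random_node(rng, n_features):
    # Assumption: features are scaled to [0, 1], so thresholds are too.
    return (rng.randrange(n_features), rng.random())

def mutate(tree, rng, n_features):
    """Replace one of the three split nodes with a random split."""
    nodes = list(tree)
    nodes[rng.randrange(3)] = random_node(rng, n_features)
    return tuple(nodes)

def evolve(X, y, generations=2000, seed=0):
    """(1+1) evolutionary search: keep a mutant if it is not worse."""
    rng = random.Random(seed)
    n_features = len(X[0])
    best = tuple(random_node(rng, n_features) for _ in range(3))
    best_err = errors(best, X, y)
    history = [best_err]
    for _ in range(generations):
        cand = mutate(best, rng, n_features)
        cand_err = errors(cand, X, y)
        if cand_err <= best_err:
            best, best_err = cand, cand_err
        history.append(best_err)
    return best, best_err, history

# Toy XOR-style data: no single first split reduces impurity much,
# which is the situation where a stepwise search is at a disadvantage.
data_rng = random.Random(1)
X = [(data_rng.random(), data_rng.random()) for _ in range(200)]
y = [int((a > 0.5) != (b > 0.5)) for a, b in X]

tree, err, history = evolve(X, y)
print("errors of initial random tree:", history[0])
print("errors of evolved tree:", err)
```

Because the whole tree is scored at once, a root split that looks uninformative on its own can still survive if it enables good splits below it. A population-based method with several variation operators would replace the single-mutant loop in `evolve`.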
Incremental Measurement of Structural Entropy for Dynamic Graphs
Structural entropy is a metric that measures the amount of information
embedded in graph structure data under a strategy of hierarchical abstracting.
To measure the structural entropy of a dynamic graph, we need to decode the
optimal encoding tree corresponding to the optimal hierarchical community
partitioning of the graph. However, the current structural entropy methods do
not support efficient incremental updating of encoding trees. To address this
issue, we propose Incre-2dSE, a novel incremental measurement framework that
dynamically adjusts the community partitioning and efficiently computes the
updated structural entropy for each snapshot of dynamic graphs. Incre-2dSE
consists of an online module and an offline module. The online module includes
dynamic measurement algorithms based on two dynamic adjustment strategies for
two-dimensional encoding trees, i.e., the naive adjustment strategy and the
node-shifting adjustment strategy, which support theoretical analysis of the
updated structural entropy and incrementally adjust the community partitioning
towards a lower structural entropy. In contrast, the offline module globally
constructs the encoding tree for the updated graph using static community
detection methods and calculates the structural entropy by definition. We
conduct experiments on an artificial dynamic graph dataset generated by a Hawkes
process and three real-world datasets. Experimental results confirm that our
dynamic measurement algorithms effectively capture the dynamic evolution of the
communities, reduce time consumption, and provide great interpretability.
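To make "calculates the structural entropy by definition" concrete, here is a minimal Python sketch of the static computation only, not of Incre-2dSE or its incremental updates. It assumes the usual definition from structural information theory, which the abstract itself does not spell out: for a community partition, each node contributes -(d_i / 2m) log2(d_i / V_j) and each community contributes -(g_j / 2m) log2(V_j / 2m), where d_i is a node degree, V_j a community volume, g_j the number of edges leaving community j, and m the edge count. The function names and the toy graph are hypothetical, and the graph is assumed to be simple, undirected, and unweighted.

```python
import math
from collections import defaultdict

def degrees(edges):
    """Node degrees of an undirected edge list without self-loops."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def entropy_1d(edges):
    """One-dimensional structural entropy (no community structure)."""
    deg = degrees(edges)
    two_m = 2 * len(edges)
    return -sum(d / two_m * math.log2(d / two_m) for d in deg.values())

def entropy_2d(edges, community):
    """Two-dimensional structural entropy for a node -> community map."""
    deg = degrees(edges)
    two_m = 2 * len(edges)
    volume = defaultdict(int)  # V_j: sum of degrees inside community j
    cut = defaultdict(int)     # g_j: edges with one endpoint in community j
    for node, d in deg.items():
        volume[community[node]] += d
    for u, v in edges:
        if community[u] != community[v]:
            cut[community[u]] += 1
            cut[community[v]] += 1
    node_term = -sum(
        d / two_m * math.log2(d / volume[community[n]])
        for n, d in deg.items()
    )
    comm_term = -sum(
        cut[c] / two_m * math.log2(volume[c] / two_m) for c in volume
    )
    return node_term + comm_term

# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

print("1D entropy:", entropy_1d(edges))
print("2D entropy, triangle partition:", entropy_2d(edges, triangles))
```

Recomputing these sums from scratch for every snapshot is the cost an incremental method aims to avoid: when a few edges change, only the degrees, volumes, and cut counts of the affected nodes and communities need updating.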
Survey of partitioning techniques in silicon compilation
In the silicon compilation design process, partitioning is usually the first problem to be investigated because partitioning algorithms form the backbone of many algorithms, including system synthesis, processor synthesis, floorplanning, and placement. This survey examines several partitioning techniques and reviews the partitioning algorithms used by synthesis systems at different design levels.