
    Statistical Sources of Variable Selection Bias in Classification Tree Algorithms Based on the Gini Index

    Evidence for variable selection bias in classification tree algorithms based on the Gini Index is reviewed from the literature and embedded into a broader explanatory scheme: Variable selection bias in classification tree algorithms based on the Gini Index can be caused not only by the statistical effect of multiple comparisons, but also by an increasing estimation bias and variance of the splitting criterion when plug-in estimates of entropy measures like the Gini Index are employed. The relevance of these sources of variable selection bias in the different simulation study designs is examined. Variable selection bias due to the explored sources applies to all classification tree algorithms based on empirical entropy measures like the Gini Index, Deviance and Information Gain, and to both binary and multiway splitting algorithms
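    The estimation bias of a plug-in Gini estimate mentioned in this abstract can be illustrated with a small sketch. It is not taken from the paper: the function names `gini` and `expected_plugin_gini` are hypothetical, and the setting (a single two-class node with n independent observations) is an assumption chosen for simplicity. Under that assumption the expected plug-in estimate equals (n - 1)/n times the true Gini index, so the estimate is biased downward and the bias is larger in small nodes.

```python
from math import comb


def gini(probs):
    """Gini index 1 - sum(p_k^2) of a class distribution."""
    return 1.0 - sum(p * p for p in probs)


def expected_plugin_gini(p, n):
    """Exact expectation of the plug-in Gini estimate for a two-class node.

    Assumption (illustrative only): n i.i.d. observations, true class-1
    probability p. The expectation is computed by summing over all
    possible class-1 counts k with their binomial probabilities.
    """
    total = 0.0
    for k in range(n + 1):
        weight = comb(n, k) * p**k * (1 - p) ** (n - k)
        p_hat = k / n  # plug-in (empirical) class-1 frequency
        total += weight * gini([p_hat, 1 - p_hat])
    return total


if __name__ == "__main__":
    true_gini = gini([0.5, 0.5])
    for n in (5, 20, 100):
        # Compare the true value, the expected estimate, and (n-1)/n * true value.
        print(n, true_gini, expected_plugin_gini(0.5, n), true_gini * (n - 1) / n)
```

    Because the downward bias shrinks with n, splits that produce many small child nodes can look purer than they are, which is one way a criterion built on such estimates may favor some predictors over others.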

    Context Tree Selection: A Unifying View

    The present paper investigates non-asymptotic properties of two popular procedures of context tree (or Variable Length Markov Chains) estimation: Rissanen's algorithm Context and the Penalized Maximum Likelihood criterion. First showing how they are related, we prove finite horizon bounds for the probability of over- and under-estimation. Concerning overestimation, no boundedness or loss-of-memory conditions are required: the proof relies on new deviation inequalities for empirical probabilities of independent interest. The underestimation properties rely on loss-of-memory and separation conditions of the process. These results improve and generalize the bounds obtained previously. Context tree models have been introduced by Rissanen as a parsimonious generalization of Markov models. Since then, they have been widely used in applied probability and statistics

    Using Avida to test the effects of natural selection on phylogenetic reconstruction methods

    Phylogenetic trees group organisms by their ancestral relationships. There are a number of distinct algorithms used to reconstruct these trees from molecular sequence data, but different methods sometimes give conflicting results. Since there are few precisely known phylogenies, simulations are typically used to test the quality of reconstruction algorithms. These simulations randomly evolve strings of symbols to produce a tree, and then the algorithms are run with the tree leaves as inputs. Here we use Avida to test two widely used reconstruction methods, which gives us the chance to observe the effect of natural selection on tree reconstruction. We find that if the organisms undergo natural selection between branch points, the methods will be successful even on very large time scales. However, these algorithms often falter when selection is absent

    Household Tree Planting in Tigrai, Northern Ethiopia: Tree Species, Purposes, and Determinants

    Trees have multiple purposes in rural Ethiopia, providing significant economic and ecological benefits. Planting trees supplies rural households with wood products for their own consumption, as well as for sale, and decreases soil degradation. We used cross-sectional household-level data to analyze the determinants of household tree planting and explored the most important tree attributes or purpose(s) that enhance the propensity to plant trees. We set up a sample selection framework that simultaneously took into account the two decisions of tree growers (whether or not to plant trees and how many) to analyze the determinants of tree planting. We used logistic regression to analyze the most important tree attributes that contribute to households’ tree-planting decisions. We found that land size, age, gender, tenure security, education, exogenous income, and agro-ecology increased both the propensity to plant trees and the amount of tree planting, while increased livestock holding impacted both decisions negatively. Our findings also suggested that households consider a number of attributes in making the decision to plant trees. These results can be used by policymakers to promote tree planting in the study area by strengthening tenure security and considering households’ selection of specific tree species for their attributes. Keywords: tree planting, tree species, tree attributes or purposes, sample selection, Tigrai, Ethiopia

    Forest Garrote

    Variable selection for high-dimensional linear models has received a lot of attention lately, mostly in the context of l1-regularization. Part of the attraction is the variable selection effect: parsimonious models are obtained, which are very suitable for interpretation. In terms of predictive power, however, these regularized linear models are often slightly inferior to machine learning procedures like tree ensembles. Tree ensembles, on the other hand, usually lack a formal way of variable selection and are difficult to visualize. A Garrote-style convex penalty for tree ensembles, in particular Random Forests, is proposed. The penalty selects functional groups of nodes in the trees. These could be as simple as monotone functions of individual predictor variables. This yields a parsimonious function fit, which lends itself easily to visualization and interpretation. The predictive power is maintained at least at the same level as the original tree ensemble. A key feature of the method is that, once a tree ensemble is fitted, no further tuning parameter needs to be selected. The empirical performance is demonstrated on a wide array of datasets. Comment: 16 pages, 3 figures

    Tree Selection

    The Home & Garden Information Center provides research-based information on landscaping, gardening, plant health, household pests, food safety & preservation, and nutrition, physical activity & health. This HGIC fact sheet provides information on tree selection

    Grading of parameters for urban tree inventories by city officials, arborists and academics using the Delphi method

    Tree inventories are expensive to conduct and update, so every inventory carried out must be maximized. However, increasing the number of constituent parameters increases the cost of performing and updating the inventory, illustrating the need for careful parameter selection. This paper reports the results of a systematic expert rating of tree inventories aiming to quantify the relative importance of each parameter. Using the Delphi method, panels comprising city officials, arborists and academics rated a total of 148 parameters. In order of total mean score, the top ranking parameters, which can serve as a guide for decision-making at practical level and for standardization of tree inventories, were: Scientific name of the tree species and genera, Vitality, Coordinates, Hazard class and Identification number. The study also examined whether the different responsibilities and usage of urban tree databases among organizations and people engaged in urban tree inventories affected their prioritization. The results revealed noticeable dissimilarities in the ranking of parameters between the panels, underlining the need for collaboration between the research community and those commissioning, administrating and conducting inventories. Only by applying such a transdisciplinary approach to parameter selection can urban tree inventories be strengthened and made more relevant

    Using longitudinal survival probabilities to test field vigour estimates in sugar maple (Acer saccharum Marsh.)

    Tree mortality is a major force driving forest dynamics. To foresters, however, tree mortality is often considered a loss in productivity. To reduce tree mortality, silvicultural systems, such as selection cuts, aim at removing trees that are more likely to die. In order to identify trees with higher risks of mortality, field classifications are employed that assess vigour based on external characteristics of trees. We used a novel longitudinal approach for estimating survival probabilities based on ring-width measurements, initially developed by Bigler and Bugmann [Bigler, C., Bugmann, H., 2004. Predicting the time of tree death using dendrochronological data. Ecol. Appl. 14 (3), 902-914], to parameterize a survival probability model for sugar maple (Acer saccharum Marsh.) and to test whether field-assessed tree vigour classes are corroborated by survival probabilities determined from radial growth history. Data from 56 dead and 321 live sugar maples were collected in stands in western Quebec (Canada) that had undergone a selection cut ≈10 years prior to sampling. Our results showed that tree vigour established from external defects and pathological symptoms, using the classification of Boulet [Boulet, B., 2005. D