906,289 research outputs found

Statistical Sources of Variable Selection Bias in Classification Tree Algorithms Based on the Gini Index

Author: Strobl Carolin
Publication date: 01/01/2005

Evidence for variable selection bias in classification tree algorithms based on the Gini Index is reviewed from the literature and embedded into a broader explanatory scheme: Variable selection bias in classification tree algorithms based on the Gini Index can be caused not only by the statistical effect of multiple comparisons, but also by an increasing estimation bias and variance of the splitting criterion when plug-in estimates of entropy measures like the Gini Index are employed. The relevance of these sources of variable selection bias in the different simulation study designs is examined. Variable selection bias due to the explored sources applies to all classification tree algorithms based on empirical entropy measures like the Gini Index, Deviance and Information Gain, and to both binary and multiway splitting algorithms
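The estimation bias of a plug-in Gini estimate mentioned in this abstract can be illustrated with a minimal sketch. It assumes a binary response and i.i.d. sampling, and it is not taken from the paper: the function names `gini` and `expected_plugin_gini` are hypothetical. Under these assumptions the expected plug-in estimate equals (n - 1)/n times the true Gini Index, so the downward bias grows as the node sample size n shrinks.

```python
from math import comb


def gini(probs):
    """Gini Index 1 - sum(p_k^2) for a vector of class probabilities."""
    return 1.0 - sum(p * p for p in probs)


def expected_plugin_gini(p, n):
    """Exact expectation of the plug-in Gini estimate for a binary class.

    Assumes n i.i.d. observations with P(class 1) = p (an illustrative
    assumption, not a setting taken from the paper). The expectation is
    computed by summing over all possible class counts k = 0..n.
    """
    total = 0.0
    for k in range(n + 1):
        weight = comb(n, k) * p**k * (1 - p) ** (n - k)  # binomial probability
        p_hat = k / n  # plug-in estimate of the class probability
        total += weight * gini([p_hat, 1.0 - p_hat])
    return total


if __name__ == "__main__":
    true_gini = gini([0.3, 0.7])
    for n in (5, 20, 100):
        # Compare the exact expectation with the closed form (n - 1) / n * G.
        print(n, expected_plugin_gini(0.3, n), (n - 1) / n * true_gini)
```

Because the shrinkage factor depends on the number of observations in each node, splits that produce many small nodes are affected differently from splits that produce a few large ones, which is one way such a bias could enter variable selection.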

Open Access LMU (Ludwig-Maximilians-Univ. München)

EconStor (ZBW Kiel)

Context Tree Selection: A Unifying View

Author: A. Garivier
F. Leonardi
Publication date: 01/01/2011

The present paper investigates non-asymptotic properties of two popular procedures of context tree (or Variable Length Markov Chains) estimation: Rissanen's algorithm Context and the Penalized Maximum Likelihood criterion. First showing how they are related, we prove finite horizon bounds for the probability of over- and under-estimation. Concerning overestimation, no boundedness or loss-of-memory conditions are required: the proof relies on new deviation inequalities for empirical probabilities of independent interest. The underestimation properties rely on loss-of-memory and separation conditions of the process. These results improve and generalize the bounds obtained previously. Context tree models have been introduced by Rissanen as a parsimonious generalization of Markov models. Since then, they have been widely used in applied probability and statistics

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

RCAAP - Repositório Científico de Acesso Aberto de Portugal

HAL-INSA Toulouse

Repositório da Produção USP (Univ. de São Paulo)

Using Avida to test the effects of natural selection on phylogenetic reconstruction methods

Author: Hagstrom George I.
Hang Dehua H.
Ofria Charles
Torng Eric
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2004

Phylogenetic trees group organisms by their ancestral relationships. There are a number of distinct algorithms used to reconstruct these trees from molecular sequence data, but different methods sometimes give conflicting results. Since there are few precisely known phylogenies, simulations are typically used to test the quality of reconstruction algorithms. These simulations randomly evolve strings of symbols to produce a tree, and then the algorithms are run with the tree leaves as inputs. Here we use Avida to test two widely used reconstruction methods, which gives us the chance to observe the effect of natural selection on tree reconstruction. We find that if the organisms undergo natural selection between branch points, the methods will be successful even on very large time scales. However, these algorithms often falter when selection is absent

CiteSeerX

Caltech Authors

Household Tree Planting in Tigrai, Northern Ethiopia: Tree Species, Purposes, and Determinants

Author: Gebreegziabher Zenebe
Kassie Menale
Köhlin Gunnar
Mekonnen Alemu

Trees have multiple purposes in rural Ethiopia, providing significant economic and ecological benefits. Planting trees supplies rural households with wood products for their own consumption, as well as for sale, and decreases soil degradation. We used cross-sectional household-level data to analyze the determinants of household tree planting and explored the most important tree attributes or purpose(s) that enhance the propensity to plant trees. We set up a sample selection framework that simultaneously took into account the two decisions of tree growers (whether or not to plant trees and how many) to analyze the determinants of tree planting. We used logistic regression to analyze the most important tree attributes that contribute to households’ tree-planting decisions. We found that land size, age, gender, tenure security, education, exogenous income, and agro-ecology increased both the propensity to plant trees and the amount of tree planting, while increased livestock holding impacted both decisions negatively. Our findings also suggested that households consider a number of attributes in making the decision to plant trees. These results can be used by policymakers to promote tree planting in the study area by strengthening tenure security and considering households’ selection of specific tree species for their attributes. Keywords: tree planting, tree species, tree attributes or purposes, sample selection, Tigrai, Ethiopia

Research Papers in Economics

Forest Garrote

Author: Meinshausen Nicolai
Publication date: 01/01/2009

Variable selection for high-dimensional linear models has received a lot of attention lately, mostly in the context of l1-regularization. Part of the attraction is the variable selection effect: parsimonious models are obtained, which are very suitable for interpretation. In terms of predictive power, however, these regularized linear models are often slightly inferior to machine learning procedures like tree ensembles. Tree ensembles, on the other hand, usually lack a formal way of variable selection and are difficult to visualize. A Garrote-style convex penalty for tree ensembles, in particular Random Forests, is proposed. The penalty selects functional groups of nodes in the trees. These could be as simple as monotone functions of individual predictor variables. This yields a parsimonious function fit, which lends itself easily to visualization and interpretation. The predictive power is maintained at least at the same level as the original tree ensemble. A key feature of the method is that, once a tree ensemble is fitted, no further tuning parameter needs to be selected. The empirical performance is demonstrated on a wide array of datasets. Comment: 16 pages, 3 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

Oxford University Research Archive

Tree Selection

Author: Clemson University. Cooperative Extension Service
Publication date: 01/01/1999

The Home & Garden Information Center provides research-based information on landscaping, gardening, plant health, household pests, food safety & preservation, and nutrition, physical activity & health. This HGIC fact sheet provides information on tree selection

South Carolina State Documents Depository

Grading of parameters for urban tree inventories by city officials, arborists and academics using the Delphi method

Author: Busse Nielsen Anders
Delshammar Tim
Wiström Björn
Östberg Johan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Tree inventories are expensive to conduct and update, so the value of every inventory carried out must be maximized. However, increasing the number of constituent parameters increases the cost of performing and updating the inventory, illustrating the need for careful parameter selection. This paper reports the results of a systematic expert rating of tree inventories aiming to quantify the relative importance of each parameter. Using the Delphi method, panels comprising city officials, arborists and academics rated a total of 148 parameters. In order of total mean score, the top-ranking parameters, which can serve as a guide for decision-making at a practical level and for standardization of tree inventories, were: Scientific name of the tree species and genera, Vitality, Coordinates, Hazard class and Identification number. The study also examined whether the different responsibilities and usage of urban tree databases among organizations and people engaged in urban tree inventories affected their prioritization. The results revealed noticeable dissimilarities in the ranking of parameters between the panels, underlining the need for collaboration between the research community and those commissioning, administrating and conducting inventories. Only by applying such a transdisciplinary approach to parameter selection can urban tree inventories be strengthened and made more relevant

Epsilon Open Archive

Using longitudinal survival probabilities to test field vigour estimates in sugar maple (Acer saccharum Marsh.)

Author: Christian Messier
Henrik Hartmann
Marilou Beaudet
Publication date: 01/01/2008

Tree mortality is a major force driving forest dynamics. To foresters, however, tree mortality is often considered a loss in productivity. To reduce tree mortality, silvicultural systems, such as selection cuts, aim at removing trees that are more likely to die. In order to identify trees with higher risks of mortality, field classifications are employed that assess vigour based on external characteristics of trees. We used a novel longitudinal approach for estimating survival probabilities based on ring-width measurements, initially developed by Bigler and Bugmann [Bigler, C., Bugmann, H., 2004. Predicting the time of tree death using dendrochronological data. Ecol. Appl. 14 (3), 902-914], to parameterize a survival probability model for sugar maple (Acer saccharum Marsh.) and to test whether field-assessed tree vigour classes are corroborated by survival probabilities determined from radial growth history. Data from 56 dead and 321 live sugar maples were collected in stands in western Quebec (Canada) that had undergone a selection cut ≈10 years prior to sampling. Our results showed that tree vigour established from external defects and pathological symptoms, using the classification of Boulet [Boulet, B., 2005. D

Crossref

Archipel - Université du Québec à Montréal

MPG.PuRe