Statistical Sources of Variable Selection Bias in Classification Tree Algorithms Based on the Gini Index
Evidence for variable selection bias in classification tree algorithms based on the Gini Index is reviewed from the literature and embedded into a broader explanatory scheme: variable selection bias in such algorithms can be caused not only by the statistical effect of multiple comparisons, but also by an increasing estimation bias and variance of the splitting criterion when plug-in estimates of entropy measures like the Gini Index are employed. The relevance of these sources of variable selection bias in the different simulation study designs is examined. Variable selection bias due to the explored sources applies to all classification tree algorithms based on empirical entropy measures like the Gini Index, Deviance and Information Gain, and to both binary and multiway splitting algorithms.
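As an illustration of the estimation bias mentioned in this abstract, the following minimal sketch computes the exact expectation of the plug-in Gini estimate for a binary class variable. It is not taken from the paper: the function names `gini` and `expected_plugin_gini`, the class probability 0.3, and the sample sizes are all assumptions chosen for the example. For a binary outcome, the plug-in estimate 2p̂(1 - p̂) has expectation (n - 1)/n times the true value, so it underestimates impurity more strongly when a node contains fewer observations.

```python
from math import comb


def gini(p):
    """Gini index of a binary class distribution with class-1 probability p."""
    return 2.0 * p * (1.0 - p)


def expected_plugin_gini(p, n):
    """Exact expectation of the plug-in Gini estimate.

    The estimate gini(k / n) is averaged over all possible counts k of
    class-1 observations in a sample of size n, weighted by the
    Binomial(n, p) probability of each count.
    """
    return sum(
        comb(n, k) * p**k * (1.0 - p) ** (n - k) * gini(k / n)
        for k in range(n + 1)
    )


# Hypothetical setting: true class-1 probability 0.3, three sample sizes.
true_value = gini(0.3)
for n in (5, 20, 100):
    expected = expected_plugin_gini(0.3, n)
    # The ratio expected / true_value equals (n - 1) / n.
    print(n, expected, true_value, expected / true_value)
```

The sketch only covers the bias of a single node's impurity estimate; how that bias propagates to the comparison of candidate splitting variables is the subject of the paper itself.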
Context Tree Selection: A Unifying View
The present paper investigates non-asymptotic properties of two popular
procedures of context tree (or Variable Length Markov Chains) estimation:
Rissanen's algorithm Context and the Penalized Maximum Likelihood criterion.
First showing how they are related, we prove finite horizon bounds for the
probability of over- and under-estimation. Concerning overestimation, no
boundedness or loss-of-memory conditions are required: the proof relies on new
deviation inequalities for empirical probabilities of independent interest. The
underestimation properties rely on loss-of-memory and separation conditions of
the process.
These results improve and generalize the bounds obtained previously. Context
tree models have been introduced by Rissanen as a parsimonious generalization
of Markov models. Since then, they have been widely used in applied probability
and statistics.
Using Avida to test the effects of natural selection on phylogenetic reconstruction methods
Phylogenetic trees group organisms by their ancestral relationships. There are a number of distinct algorithms used to reconstruct these trees from molecular sequence data, but different methods sometimes give conflicting results. Since there are few precisely known phylogenies, simulations are typically used to test the quality of reconstruction algorithms. These simulations randomly evolve strings of symbols to produce a tree, and then the algorithms are run with the tree leaves as inputs. Here we use Avida to test two widely used reconstruction methods, which gives us the chance to observe the effect of natural selection on tree reconstruction. We find that if the organisms undergo natural selection between branch points, the methods will be successful even on very large time scales. However, these algorithms often falter when selection is absent.
Household Tree Planting in Tigrai, Northern Ethiopia: Tree Species, Purposes, and Determinants
Trees have multiple purposes in rural Ethiopia, providing significant economic and ecological benefits. Planting trees supplies rural households with wood products for their own consumption, as well as for sale, and decreases soil degradation. We used cross-sectional household-level data to analyze the determinants of household tree planting and explored the most important tree attributes or purpose(s) that enhance the propensity to plant trees. We set up a sample selection framework that simultaneously took into account the two decisions of tree growers (whether or not to plant trees and how many) to analyze the determinants of tree planting. We used logistic regression to analyze the most important tree attributes that contribute to households' tree-planting decisions. We found that land size, age, gender, tenure security, education, exogenous income, and agro-ecology increased both the propensity to plant trees and the amount of tree planting, while increased livestock holding impacted both decisions negatively. Our findings also suggested that households consider a number of attributes in making the decision to plant trees. These results can be used by policymakers to promote tree planting in the study area by strengthening tenure security and considering households' selection of specific tree species for their attributes. Keywords: tree planting, tree species, tree attributes or purposes, sample selection, Tigrai, Ethiopia
Forest Garrote
Variable selection for high-dimensional linear models has received a lot of
attention lately, mostly in the context of l1-regularization. Part of the
attraction is the variable selection effect: parsimonious models are obtained,
which are very suitable for interpretation. In terms of predictive power,
however, these regularized linear models are often slightly inferior to machine
learning procedures like tree ensembles. Tree ensembles, on the other hand,
usually lack a formal way of variable selection and are difficult to visualize.
A Garrote-style convex penalty for tree ensembles, in particular Random
Forests, is proposed. The penalty selects functional groups of nodes in the
trees. These could be as simple as monotone functions of individual predictor
variables. This yields a parsimonious function fit, which lends itself easily
to visualization and interpretation. The predictive power is maintained at
least at the same level as the original tree ensemble. A key feature of the
method is that, once a tree ensemble is fitted, no further tuning parameter
needs to be selected. The empirical performance is demonstrated on a wide array
of datasets. Comment: 16 pages, 3 figures
Tree Selection
The Home & Garden Information Center provides research-based information on landscaping, gardening, plant health, household pests, food safety & preservation, and nutrition, physical activity & health. This HGIC fact sheet provides information on tree selection.
Grading of parameters for urban tree inventories by city officials, arborists and academics using the Delphi method
Tree inventories are expensive to conduct and update, so the value of every inventory carried out must be maximized. However, increasing the number of constituent parameters increases the cost of performing and updating the inventory, illustrating the need for careful parameter selection. This paper reports the results of a systematic expert rating of tree inventories aiming to quantify the relative importance of each parameter. Using the Delphi method, panels comprising city officials, arborists and academics rated a total of 148 parameters. In order of total mean score, the top-ranking parameters, which can serve as a guide for decision-making at a practical level and for standardization of tree inventories, were: Scientific name of the tree species and genera, Vitality, Coordinates, Hazard class and Identification number.
The study also examined whether the different responsibilities and usage of urban tree databases among organizations and people engaged in urban tree inventories affected their prioritization. The results revealed noticeable dissimilarities in the ranking of parameters between the panels, underlining the need for collaboration between the research community and those commissioning, administrating and conducting inventories. Only by applying such a transdisciplinary approach to parameter selection can urban tree inventories be strengthened and made more relevant.
Using longitudinal survival probabilities to test field vigour estimates in sugar maple (Acer saccharum Marsh.)
Tree mortality is a major force driving forest dynamics. To foresters, however, tree mortality is often considered a loss in productivity. To reduce tree mortality, silvicultural systems, such as selection cuts, aim at removing trees that are more likely to die. In order to identify trees with higher risks of mortality, field classifications are employed that assess vigour based on external characteristics of trees. We used a novel longitudinal approach for estimating survival probabilities based on ring-width measurements, initially developed by Bigler and Bugmann [Bigler, C., Bugmann, H., 2004. Predicting the time of tree death using dendrochronological data. Ecol. Appl. 14 (3), 902-914], to parameterize a survival probability model for sugar maple (Acer saccharum Marsh.) and to test whether field-assessed tree vigour classes are corroborated by survival probabilities determined from radial growth history. Data from 56 dead and 321 live sugar maples were collected in stands in western Quebec (Canada) that had undergone a selection cut ~10 years prior to sampling. Our results showed that tree vigour established from external defects and pathological symptoms, using the classification of Boulet [Boulet, B., 2005. D