3,145 research outputs found
Alternating model trees
Model tree induction is a popular method for tackling regression problems requiring interpretable models. Model trees are decision trees with multiple linear regression models at the leaf nodes. In this paper, we propose a method for growing alternating model trees, a form of option tree for regression problems. The motivation is that alternating decision trees achieve high accuracy in classification problems because they represent an ensemble classifier as a single tree structure. As in alternating decision trees for classification, our alternating model trees for regression contain splitter and prediction nodes, but we use simple linear regression functions as opposed to constant predictors at the prediction nodes. Moreover, additive regression using forward stagewise modeling is applied to grow the tree rather than a boosting algorithm. The size of the tree is determined using cross-validation. Our empirical results show that alternating model trees achieve significantly lower squared error than standard model trees on several regression datasets.
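The forward stagewise idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it omits splitter nodes entirely and only shows additive regression in which every stage fits a simple (one-attribute) linear regression to the current residuals. The function names, the shrinkage factor, and the toy data are assumptions made for illustration.

```python
import random

def fit_simple_linear(x, r):
    """Least-squares fit of r ~ intercept + slope * x for one attribute."""
    n = len(x)
    x_mean = sum(x) / n
    r_mean = sum(r) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxr = sum((xi - x_mean) * (ri - r_mean) for xi, ri in zip(x, r))
    slope = sxr / sxx if sxx > 0 else 0.0
    return r_mean - slope * x_mean, slope

def forward_stagewise(columns, y, n_stages=100, shrinkage=0.5):
    """Greedy additive regression; shrinkage is an assumed damping factor."""
    residual = [float(t) for t in y]
    stages = []
    for _ in range(n_stages):
        best = None
        for j, x in enumerate(columns):
            a, b = fit_simple_linear(x, residual)
            sse = sum((ri - (a + b * xi)) ** 2 for xi, ri in zip(x, residual))
            if best is None or sse < best[0]:
                best = (sse, j, a, b)
        _, j, a, b = best
        a, b = shrinkage * a, shrinkage * b
        # Subtract the damped stage prediction from the residuals.
        residual = [ri - (a + b * xi) for xi, ri in zip(columns[j], residual)]
        stages.append((j, a, b))
    return stages

def predict(stages, row):
    """Sum the contributions of all stages for one instance."""
    return sum(a + b * row[j] for j, a, b in stages)

# Hypothetical toy data: y depends linearly on two attributes.
rng = random.Random(0)
columns = [[rng.uniform(-1, 1) for _ in range(200)] for _ in range(2)]
y = [3.0 + 2.0 * x0 - 1.0 * x1 for x0, x1 in zip(*columns)]
stages = forward_stagewise(columns, y)
mse = sum((predict(stages, row) - t) ** 2
          for row, t in zip(zip(*columns), y)) / len(y)
print(f"training MSE after {len(stages)} stages: {mse:.3e}")
```

In an alternating model tree, according to the abstract, such simple linear models sit at prediction nodes below splitter nodes, and the number of stages (the tree size) would be chosen by cross-validation rather than fixed as it is here.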
Random model trees: an effective and scalable regression method
We present and investigate ensembles of randomized model trees as a novel regression method. Such ensembles combine the scalability of tree-based methods with predictive performance rivaling the state of the art in numeric prediction. An extensive empirical investigation shows that Random Model Trees produce predictive performance which is competitive with state-of-the-art methods like Gaussian Processes Regression or Additive Groves of Regression Trees. The training and optimization of Random Model Trees scale better than Gaussian Processes Regression to larger datasets, and enjoy a constant advantage over Additive Groves of the order of one to two orders of magnitude.
Scattering measurements on natural and model trees
The acoustical back scattering from a simple scale model of a tree has been experimentally measured. The model consisted of a trunk and six limbs, each with 4 branches; no foliage or twigs were included. The data from the anechoic chamber measurements were then mathematically combined to construct the effective back scattering from groups of trees. Also, initial measurements have been conducted out-of-doors on a single tree in an open field in order to characterize its acoustic scattering as a function of azimuth angle. These measurements were performed in the spring, prior to leaf development. The data support a statistical model of forest scattering; the scattered signal spectrum is highly irregular but with a remarkable general resemblance to the incident signal spectrum. Also, the scattered signal's spectrum showed little dependence upon scattering angle.
On the variational distance of two trees
A widely studied model for generating sequences is to "evolve" them on a
tree according to a symmetric Markov process. We prove that model trees tend to
be maximally "far apart" in terms of variational distance.
Comment: Published at http://dx.doi.org/10.1214/105051606000000196 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org)
How wind and sun shape trees into fractals?
Trees are self-similar branching structures, hierarchically organized with longer and thicker branches near the roots. With a mechanically-based numerical model, we show how self-similarity can emerge through natural selection. In this model, trees grow into fractal structures to promote efficient photosynthesis in a competing environment. In addition, branch diameters increase in response to wind-induced loads. Remarkably, the virtual tree species emerging from this model have the same self-similar properties as those measured on conifers and angiosperms.
Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
Inferring gene regression networks with model trees
Background: Novel strategies are required in order to handle the huge amount of data produced by microarray
technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between
genes building the so-called gene co-expression networks. They are typically generated using correlation statistics
as pairwise similarity measures. Correlation-based methods are very useful in order to determine whether two
genes have a strong global similarity but do not detect local similarities.
Results: We propose model trees as a method to identify gene interaction networks. While correlation-based
methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the
remaining genes. Finally, a graph from all the relationships among output and input genes is built taking into
account whether the pair of genes is statistically significant. For this reason we apply a statistical procedure to
control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two
well-known data sets: Saccharomyces cerevisiae and E. coli. First, the biological coherence of the results is
tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the
results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at
detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods.
Conclusions: REGNET generates gene association networks from gene expression data, and differs from
correlation-based methods in that the relationship between one gene and others is calculated simultaneously.
Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression
functions. They are very often more precise than linear regression models because they can fit different
linear regressions to separate areas of the search space, favoring the inference of localized similarities over a more global
similarity. Furthermore, experimental results show the good performance of REGNET.
Ministerio de Ciencia e Innovación TIN2011-68084-C02-00; Ministerio de Ciencia e Innovación PCI2006-A7-0575; Junta de Andalucía P07-TIC-02611; Junta de Andalucía TIC-20
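The abstract says that edges are kept only when a statistical procedure controlling the false discovery rate accepts them, but it does not name the procedure. As a hedged illustration, the sketch below uses the Benjamini-Hochberg step-up rule, which is one common choice and an assumption here, not necessarily what REGNET uses. The gene pairs and p-values are invented for the example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected

# Hypothetical candidate edges (target gene, input gene) with made-up p-values.
edges = [("geneA", "geneB"), ("geneA", "geneC"),
         ("geneB", "geneD"), ("geneC", "geneD")]
p_values = [0.001, 0.03, 0.032, 0.5]

keep = benjamini_hochberg(p_values, alpha=0.05)
network = [edge for edge, ok in zip(edges, keep) if ok]
print(network)
```

Note that the second edge is kept even though its p-value exceeds its own threshold, because a larger-ranked p-value passes; that is the step-up behaviour of this particular rule.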
Bayesian Additive Regression Trees with Model Trees
Bayesian Additive Regression Trees (BART) is a tree-based machine learning
method that has been successfully applied to regression and classification
problems. BART assumes regularisation priors on a set of trees that work as
weak learners and is very flexible for predicting in the presence of
non-linearity and high-order interactions. In this paper, we introduce an
extension of BART, called Model Trees BART (MOTR-BART), that considers
piecewise linear functions at node levels instead of piecewise constants. In
MOTR-BART, rather than having a unique value at node level for the prediction,
a linear predictor is estimated considering the covariates that have been used
as the split variables in the corresponding tree. In our approach, local
linearities are captured more efficiently and fewer trees are required to
achieve equal or better performance than BART. Via simulation studies and real
data applications, we compare MOTR-BART to its main competitors. R code for
MOTR-BART implementation is available at https://github.com/ebprado/MOTR-BART
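The difference between piecewise constant and piecewise linear node predictions can be seen in a small non-Bayesian sketch. It is not MOTR-BART: there are no priors, no sampling, and only a single tree with one fixed split; the leaves are fitted by ordinary least squares. All function names, the split threshold, and the toy data are assumptions for illustration.

```python
def leaf_constant(xs, ys):
    """Piecewise-constant leaf: predict the mean response."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

def leaf_linear(xs, ys):
    """Piecewise-linear leaf: least-squares line on the split variable."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: y_mean + slope * (x - x_mean)

def fit_stump(xs, ys, threshold, leaf_fitter):
    """One split at a fixed threshold, with a fitted model in each leaf."""
    left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
    f_left = leaf_fitter(*zip(*left))
    f_right = leaf_fitter(*zip(*right))
    return lambda x: f_left(x) if x <= threshold else f_right(x)

def sse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys))

# Toy data with a kink at zero: y = |x|.
xs = [i / 10 for i in range(-20, 21)]
ys = [abs(x) for x in xs]

sse_constant = sse(fit_stump(xs, ys, 0.0, leaf_constant), xs, ys)
sse_linear = sse(fit_stump(xs, ys, 0.0, leaf_linear), xs, ys)
print(f"constant leaves SSE: {sse_constant:.3f}")
print(f"linear leaves SSE:   {sse_linear:.3e}")
```

On data that is locally linear, a single split with linear leaves captures what constant leaves would need many more splits (or trees) to approximate, which is the intuition behind the abstract's claim that fewer trees are required.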