    Alternating model trees

    Model tree induction is a popular method for tackling regression problems requiring interpretable models. Model trees are decision trees with multiple linear regression models at the leaf nodes. In this paper, we propose a method for growing alternating model trees, a form of option tree for regression problems. The motivation is that alternating decision trees achieve high accuracy in classification problems because they represent an ensemble classifier as a single tree structure. As in alternating decision trees for classification, our alternating model trees for regression contain splitter and prediction nodes, but we use simple linear regression functions as opposed to constant predictors at the prediction nodes. Moreover, additive regression using forward stagewise modeling is applied to grow the tree rather than a boosting algorithm. The size of the tree is determined using cross-validation. Our empirical results show that alternating model trees achieve significantly lower squared error than standard model trees on several regression datasets.
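
    The forward stagewise idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it omits splitter nodes and cross-validation entirely, and the function names, the `shrinkage` parameter, and the toy data are assumptions made for illustration. Each stage fits a simple (one-attribute) linear regression to the current residuals and adds a shrunken copy of it to the model.

```python
from statistics import mean


def fit_simple_linear(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one attribute."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        return y_bar, 0.0  # constant attribute: fall back to the mean
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - slope * x_bar, slope


def forward_stagewise(rows, targets, n_stages=10, shrinkage=0.5):
    """Greedily add simple linear regressions fitted to the current residuals."""
    residuals = list(targets)
    stages = []  # each stage: (attribute index, intercept, slope)
    for _ in range(n_stages):
        best = None
        for j in range(len(rows[0])):
            column = [row[j] for row in rows]
            intercept, slope = fit_simple_linear(column, residuals)
            sse = sum((r - (intercept + slope * x)) ** 2
                      for x, r in zip(column, residuals))
            if best is None or sse < best[0]:
                best = (sse, j, intercept, slope)
        _, j, intercept, slope = best
        # Shrink the chosen fit (hypothetical parameter, not from the abstract).
        intercept, slope = shrinkage * intercept, shrinkage * slope
        stages.append((j, intercept, slope))
        residuals = [r - (intercept + slope * row[j])
                     for row, r in zip(rows, residuals)]
    return stages


def predict(stages, row):
    """Sum the contributions of all stages for one instance."""
    return sum(intercept + slope * row[j] for j, intercept, slope in stages)


# Toy data (assumed): the target depends linearly on the first attribute only.
rows = [[0.0, 3.0], [1.0, 1.0], [2.0, 4.0], [3.0, 1.0], [4.0, 5.0]]
targets = [1.0 + 2.0 * row[0] for row in rows]
stages = forward_stagewise(rows, targets, n_stages=40)
print(predict(stages, [10.0, 2.0]))  # close to 21 for this noiseless toy data
```

    In an alternating model tree, as the abstract describes it, such regression functions would sit at prediction nodes below splitter nodes; the sketch keeps only the additive, residual-fitting part.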

    Random model trees: an effective and scalable regression method

    We present and investigate ensembles of randomized model trees as a novel regression method. Such ensembles combine the scalability of tree-based methods with predictive performance rivaling the state of the art in numeric prediction. An extensive empirical investigation shows that Random Model Trees achieve predictive performance competitive with state-of-the-art methods such as Gaussian Processes Regression or Additive Groves of Regression Trees. The training and optimization of Random Model Trees scale better to larger datasets than Gaussian Processes Regression, and hold a constant-factor advantage of one to two orders of magnitude over Additive Groves.

    Scattering measurements on natural and model trees

    The acoustical back scattering from a simple scale model of a tree has been experimentally measured. The model consisted of a trunk and six limbs, each with 4 branches; no foliage or twigs were included. The data from the anechoic chamber measurements were then mathematically combined to construct the effective back scattering from groups of trees. Also, initial measurements have been conducted outdoors on a single tree in an open field in order to characterize its acoustic scattering as a function of azimuth angle. These measurements were performed in the spring, prior to leaf development. The data support a statistical model of forest scattering; the scattered signal spectrum is highly irregular but bears a remarkable general resemblance to the incident signal spectrum. The scattered signal's spectrum also showed little dependence on scattering angle.

    On the variational distance of two trees

    A widely studied model for generating sequences is to "evolve" them on a tree according to a symmetric Markov process. We prove that model trees tend to be maximally "far apart" in terms of variational distance. Comment: Published at http://dx.doi.org/10.1214/105051606000000196 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org
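
    As a small illustration of the distance referred to in this abstract, the sketch below computes the variational (total variation) distance between two distributions over a finite set of sequences. The abstract does not define the distance; the factor of 1/2, which places the value in [0, 1], is an assumed convention (some authors omit it), and the function name and toy distributions are hypothetical.

```python
def variational_distance(p, q):
    """Total variation distance between two distributions on a finite set.

    p and q map each outcome (for example a sequence such as "ACGT")
    to its probability; missing outcomes are treated as probability 0.
    The 1/2 normalization is an assumed convention.
    """
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in outcomes)


# Toy distributions (assumed): disjoint supports give the maximal distance.
same = variational_distance({"AA": 0.5, "AC": 0.5}, {"AA": 0.5, "AC": 0.5})
apart = variational_distance({"AA": 1.0}, {"CC": 1.0})
print(same, apart)
```

    Under this convention, "maximally far apart" corresponds to a distance approaching 1.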

    How wind and sun shape trees into fractals?

    Trees are self-similar branching structures, hierarchically organized with longer and thicker branches near the roots. With a mechanically-based numerical model, we show how self-similarity can emerge through natural selection. In this model, trees grow into fractal structures to promote efficient photosynthesis in a competing environment. In addition, branch diameters increase in response to wind-induced loads. Remarkably, the virtual tree species emerging from this model have the same self-similar properties as those measured on conifers and angiosperms. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Inferring gene regression networks with model trees

    Background: Novel strategies are required to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes by building so-called gene co-expression networks. These are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are useful for determining whether two genes have a strong global similarity, but they do not detect local similarities. Results: We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, our approach generates a single regression tree for each gene from the remaining genes. Finally, a graph of all the relationships among output and input genes is built, taking into account whether each pair of genes is statistically significant; to this end, we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is tested experimentally on two well-known data sets: Saccharomyces cerevisiae and E. coli. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results with those of a correlation-based method. This experiment shows that REGNET detects true gene associations more accurately than the Pearson and Spearman zeroth- and first-order correlation-based methods. Conclusions: REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and the others is calculated simultaneously. Model trees are useful for estimating the numerical values of the target genes by linear regression functions. They are very often more precise than linear regression models because they can fit different linear regressions to separate areas of the search space, favoring the inference of localized similarities over a more global similarity. Furthermore, experimental results show the good performance of REGNET. Ministerio de Ciencia e Innovación TIN2011-68084-C02-00; Ministerio de Ciencia e Innovación PCI2006-A7-0575; Junta de Andalucía P07-TIC-02611; Junta de Andalucía TIC-20
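
    The abstract mentions a statistical procedure to control the false discovery rate but does not name it. As a sketch only, the following uses the Benjamini-Hochberg step-up procedure, one common choice, which is an assumption here and not necessarily what REGNET uses. The p-values stand for hypothetical per-edge significance tests of gene pairs.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Benjamini-Hochberg step-up procedure: find the largest rank k
    (1-based, by ascending p-value) with p_(k) <= k / m * alpha and
    reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


# Hypothetical p-values for four candidate gene-gene edges.
edge_p_values = [0.01, 0.04, 0.03, 0.9]
keep_edge = benjamini_hochberg(edge_p_values, alpha=0.05)
print(keep_edge)
```

    Only edges flagged True would be kept in the resulting graph under this assumed procedure.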

    Bayesian Additive Regression Trees with Model Trees

    Bayesian Additive Regression Trees (BART) is a tree-based machine learning method that has been successfully applied to regression and classification problems. BART assumes regularisation priors on a set of trees that work as weak learners and is very flexible for predicting in the presence of non-linearity and high-order interactions. In this paper, we introduce an extension of BART, called Model Trees BART (MOTR-BART), that considers piecewise linear functions at node levels instead of piecewise constants. In MOTR-BART, rather than having a unique value at node level for the prediction, a linear predictor is estimated considering the covariates that have been used as the split variables in the corresponding tree. In our approach, local linearities are captured more efficiently and fewer trees are required to achieve equal or better performance than BART. Via simulation studies and real data applications, we compare MOTR-BART to its main competitors. R code for MOTR-BART implementation is available at https://github.com/ebprado/MOTR-BART
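
    The contrast between piecewise-constant and piecewise-linear node predictions can be seen in a deliberately simplified, non-Bayesian sketch. It uses a single tree with one fixed split, no priors and no MCMC, so it is not MOTR-BART or BART; the function names, the threshold, and the toy data are assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares intercept and slope for a single covariate."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        return y_bar, 0.0
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - slope * x_bar, slope


def leaf_errors(xs, ys, threshold):
    """Squared error of constant vs. linear leaf predictors for one split."""
    sse_constant = sse_linear = 0.0
    for in_leaf in (lambda x: x <= threshold, lambda x: x > threshold):
        leaf = [(x, y) for x, y in zip(xs, ys) if in_leaf(x)]
        if not leaf:
            continue
        lx, ly = zip(*leaf)
        leaf_mean = mean(ly)                 # piecewise-constant prediction
        intercept, slope = fit_line(lx, ly)  # piecewise-linear prediction
        sse_constant += sum((y - leaf_mean) ** 2 for y in ly)
        sse_linear += sum((y - (intercept + slope * x)) ** 2
                          for x, y in zip(lx, ly))
    return sse_constant, sse_linear


# Toy data (assumed): a different linear trend on each side of x = 4.
xs = [float(x) for x in range(10)]
ys = [x if x <= 4 else 20.0 - 2.0 * x for x in xs]
sse_constant, sse_linear = leaf_errors(xs, ys, threshold=4.0)
print(sse_constant, sse_linear)
```

    On data with local linear structure, a constant-leaf model needs many more splits (or trees) to reach the error that linear leaves achieve with one split, which is consistent with the abstract's claim that fewer trees are required.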