61 research outputs found

    Bias in random forest variable importance measures: Illustrations, sources and a solution

    BACKGROUND: Variable importance measures for random forests have been receiving increased attention as a means of variable selection in many classification tasks in bioinformatics and related scientific fields, for instance to select a subset of genetic markers relevant for the prediction of a certain disease. We show that random forest variable importance measures are a sensible means for variable selection in many applications, but are not reliable in situations where potential predictor variables vary in their scale of measurement or their number of categories. This is particularly important in genomics and computational biology, where predictors often include variables of different types, for example when predictors include both sequence data and continuous variables such as folding energy, or when amino acid sequence data show different numbers of categories. RESULTS: Simulation studies are presented illustrating that, when random forest variable importance measures are used with data of varying types, the results are misleading because suboptimal predictor variables may be artificially preferred in variable selection. The two mechanisms underlying this deficiency are biased variable selection in the individual classification trees used to build the random forest on the one hand, and effects induced by bootstrap sampling with replacement on the other. CONCLUSION: We propose to employ an alternative implementation of random forests that provides unbiased variable selection in the individual classification trees. When this method is applied using subsampling without replacement, the resulting variable importance measures can be used reliably for variable selection even in situations where the potential predictor variables vary in their scale of measurement or their number of categories.
The usage of both random forest algorithms and their variable importance measures in the R system for statistical computing is illustrated and documented thoroughly in an application re-analyzing data from a study on RNA editing. The suggested method can therefore be applied straightforwardly by scientists in bioinformatics research.
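
The first mechanism named in the abstract above, biased variable selection within individual trees, can be illustrated with a small simulation. The sketch below is not the authors' simulation design or their R code. It is a simplified, standard-library Python illustration that assumes a CART-style split search using Gini impurity decrease, and all function names and parameter values in it are hypothetical. Both predictors are pure noise, so an unbiased criterion should pick each about half of the time. The continuous predictor offers many more candidate cutpoints, so it tends to win far more often.

```python
import random


def gini(pos, n):
    """Gini impurity of a node holding `pos` positives among `n` samples."""
    if n == 0:
        return 0.0
    p = pos / n
    return 2.0 * p * (1.0 - p)


def best_gini_gain(x, y):
    """Largest Gini impurity decrease over all threshold splits of x."""
    n = len(y)
    total_pos = sum(y)
    parent = gini(total_pos, n)
    pairs = sorted(zip(x, y))
    best = 0.0
    left_n = left_pos = 0
    for i in range(n - 1):
        left_n += 1
        left_pos += pairs[i][1]
        if pairs[i][0] == pairs[i + 1][0]:
            continue  # no cutpoint exists between two equal values
        right_n = n - left_n
        right_pos = total_pos - left_pos
        children = (left_n * gini(left_pos, left_n)
                    + right_n * gini(right_pos, right_n)) / n
        best = max(best, parent - children)
    return best


def selection_frequency(n_samples=100, n_trials=500, seed=0):
    """Fraction of trials in which the uninformative continuous predictor
    beats the uninformative binary predictor at the root split."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        # The response is independent of both predictors (null case).
        y = [rng.randint(0, 1) for _ in range(n_samples)]
        x_binary = [rng.randint(0, 1) for _ in range(n_samples)]
        x_continuous = [rng.random() for _ in range(n_samples)]
        if best_gini_gain(x_continuous, y) > best_gini_gain(x_binary, y):
            wins += 1
    return wins / n_trials


if __name__ == "__main__":
    print("continuous predictor selected in",
          selection_frequency(), "of trials")
```

A selection frequency well above 0.5 in this null setting reflects the preference for predictors with more possible cutpoints. It does not reproduce the second mechanism mentioned in the abstract, the effect of bootstrap sampling with replacement.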

    Multiple Determinants of Whole and Regional Brain Volume among Terrestrial Carnivorans

    Mammalian brain volumes vary considerably, even after controlling for body size. Although several hypotheses have been proposed to explain this variation, most research in mammals on the evolution of encephalization has focused on primates, leaving the generality of these explanations uncertain. Furthermore, much research still addresses only one hypothesis at a time, despite the demonstrated importance of considering multiple factors simultaneously. We used phylogenetic comparative methods to investigate simultaneously the importance of several factors previously hypothesized to be important in neural evolution among mammalian carnivores, including social complexity, forelimb use, home range size, diet, life history, phylogeny, and recent evolutionary changes in body size. We also tested hypotheses suggesting roles for these variables in determining the relative volume of four brain regions measured using computed tomography. Our data suggest that, in contrast to brain size in primates, carnivoran brain size may lag behind body size over evolutionary time. Moreover, carnivore species that primarily consume vertebrates have the largest brains. Although we found no support for a role of social complexity in overall encephalization, relative cerebrum volume correlated positively with sociality. Finally, our results support negative relationships among different brain regions after accounting for overall endocranial volume, suggesting that increased size of one brain region is often accompanied by reduced size in other regions rather than overall brain expansion.

    Designing an Online Dungeons & Dragons Experience for Primary School Children

    In this work, we present the results of a role-playing game experience carried out with a group of 9- to 12-year-old children during the COVID-19 emergency. The ‘harmony in education’ approach was used to adapt the game design to the constraints imposed by the online context and the young age of the students involved. The results show the effectiveness of the approach in training 21st-century skills, with particular evidence for perspective-taking.