3,953 research outputs found
An Introduction to Recursive Partitioning: Rationale, Application and Characteristics of Classification and Regression Trees, Bagging and Random Forests
Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, that can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine and bioinformatics within the past few years.
High dimensional problems are common not only in genetics, but also in some areas of psychological research, where only few subjects can be measured due to time or cost constraints, yet a large amount of data is generated for each subject. Random forests have been shown to achieve a high prediction accuracy in such applications, and provide descriptive variable importance measures reflecting the impact of each variable in both main effects and interactions.
The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low and high dimensional data exploration, but also to point out limitations of the methods and potential pitfalls in their practical application.
Application of the methods is illustrated using freely available implementations in the R system for statistical computing
Active Classification for POMDPs: a Kalman-like State Estimator
The problem of state tracking with active observation control is considered
for a system modeled by a discrete-time, finite-state Markov chain observed
through conditionally Gaussian measurement vectors. The measurement model
statistics are shaped by the underlying state and an exogenous control input,
which influence the observations' quality. Exploiting an innovations approach,
an approximate minimum mean-squared error (MMSE) filter is derived to estimate
the Markov chain system state. To optimize the control strategy, the associated
mean-squared error is used as an optimization criterion in a partially
observable Markov decision process formulation. A stochastic dynamic
programming algorithm is proposed to solve for the optimal solution. To enhance
the quality of system state estimates, approximate MMSE smoothing estimators
are also derived. Finally, the performance of the proposed framework is
illustrated on the problem of physical activity detection in wireless body
sensing networks. The power of the proposed framework lies within its ability
to accommodate a broad spectrum of active classification applications including
sensor management for object classification and tracking, estimation of sparse
signals and radar scheduling.Comment: 38 pages, 6 figure
- …