Univariate decision tree induction using maximum margin classification
In many pattern recognition applications, decision trees are tried first because they are simple and easy to interpret. In this paper, we propose a new decision tree learning algorithm, the univariate margin tree, in which the best split for each continuous attribute is found using convex optimization. Our simulation results on 47 data sets show that the margin tree classifier performs at least as well as C4.5 and the linear discriminant tree (LDT), with similar time complexity. For two-class data sets, it generates significantly smaller trees than C4.5 and LDT without sacrificing accuracy; for multiclass data sets, using the one-vs-rest methodology, it generates significantly more accurate trees than C4.5 and LDT.
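As a rough illustration of a maximum-margin split on a single continuous attribute, the sketch below handles only the separable two-class case, where the widest-margin threshold is the midpoint between the closest values of the two classes. It is not the algorithm from the paper, which solves a convex optimization problem and can therefore also deal with overlapping classes. The function names `max_margin_split` and `best_attribute`, and the rule of choosing the attribute with the widest margin, are assumptions made for this example.

```python
from typing import List, Optional, Sequence, Tuple


def max_margin_split(values: Sequence[float],
                     labels: Sequence[int]) -> Optional[Tuple[float, float]]:
    """Return (threshold, margin) for one attribute, or None if no single
    threshold separates the two classes. Labels must be -1 or +1."""
    neg = [v for v, y in zip(values, labels) if y == -1]
    pos = [v for v, y in zip(values, labels) if y == +1]
    if not neg or not pos:
        return None
    if max(neg) < min(pos):        # negatives lie entirely below positives
        lo, hi = max(neg), min(pos)
    elif max(pos) < min(neg):      # positives lie entirely below negatives
        lo, hi = max(pos), min(neg)
    else:
        return None                # overlap: a soft-margin formulation is needed
    # The midpoint maximizes the distance to the nearest point of either class.
    return (lo + hi) / 2.0, (hi - lo) / 2.0


def best_attribute(rows: List[Sequence[float]],
                   labels: Sequence[int]) -> Optional[Tuple[int, float, float]]:
    """Return (attribute index, threshold, margin) with the widest margin."""
    best = None
    for j in range(len(rows[0])):
        split = max_margin_split([r[j] for r in rows], labels)
        if split is not None and (best is None or split[1] > best[2]):
            best = (j, split[0], split[1])
    return best
```

Margins are only comparable across attributes that share a scale, so in practice the attributes would need to be normalized before `best_attribute` is meaningful.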
Copula Calibration
We propose notions of calibration for probabilistic forecasts of general
multivariate quantities. Probabilistic copula calibration is a natural analogue
of probabilistic calibration in the univariate setting. It can be assessed
empirically by checking for the uniformity of the copula probability integral
transform (CopPIT), which is invariant under coordinate permutations and
coordinatewise strictly monotone transformations of the predictive distribution
and the outcome. The CopPIT histogram can be interpreted as a generalization
and variant of the multivariate rank histogram, which has been used to check
the calibration of ensemble forecasts. Climatological copula calibration is an
analogue of marginal calibration in the univariate setting. Methods and tools
are illustrated in a simulation study and applied to compare raw numerical
model and statistically postprocessed ensemble forecasts of bivariate wind
vectors.
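The following is a minimal sketch of how a CopPIT value might be computed when the predictive distribution is available only as a finite sample, such as an ensemble. It assumes the CopPIT of an outcome y under a predictive CDF H is K_H(H(y)), where K_H is the distribution function of H(X) for X drawn from H. The abstract does not spell this definition out, so treat it as an assumption. Both H and K_H are replaced by empirical versions, ties are not randomized, and the function names are hypothetical.

```python
from typing import List, Sequence


def empirical_cdf(sample: List[Sequence[float]], point: Sequence[float]) -> float:
    """Fraction of sample vectors that are <= point in every coordinate."""
    hits = sum(all(x <= p for x, p in zip(vec, point)) for vec in sample)
    return hits / len(sample)


def coppit(sample: List[Sequence[float]], outcome: Sequence[float]) -> float:
    """Empirical CopPIT: where H(outcome) falls among the values H(x_i)."""
    h_y = empirical_cdf(sample, outcome)
    # Empirical Kendall distribution: the values H(x_i) over the sample itself.
    h_x = [empirical_cdf(sample, vec) for vec in sample]
    return sum(h <= h_y for h in h_x) / len(h_x)
```

Because the computation uses only coordinatewise comparisons, the result is unchanged by permuting the coordinates or applying strictly increasing transformations to each coordinate of both the sample and the outcome. To assess calibration, one would collect `coppit` values over many forecast cases and inspect their histogram for uniformity. The cost is quadratic in the sample size, which is acceptable for typical ensemble sizes.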
copulaedas: An R Package for Estimation of Distribution Algorithms Based on Copulas
The use of copula-based models in EDAs (estimation of distribution
algorithms) is currently an active area of research. In this context, the
copulaedas package for R provides a platform where EDAs based on copulas can be
implemented and studied. The package offers complete implementations of various
EDAs based on copulas and vines, a group of well-known optimization problems,
and utility functions to study the performance of the algorithms. Newly
developed EDAs can be easily integrated into the package by extending an S4
class with generic functions for their main components. This paper presents
copulaedas by providing an overview of EDAs based on copulas, a description of
the implementation of the package, and an illustration of its use through
examples. The examples include running the EDAs defined in the package,
implementing new algorithms, and performing an empirical study to compare the
behavior of different algorithms on benchmark functions and a real-world
problem.
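To make the idea of a copula-based EDA concrete, here is a small Python sketch of one common variant: truncation selection, normal marginals, and a Gaussian copula. It does not use or reproduce the copulaedas API, which is an R package built on S4 classes. Every name and default value below is a hypothetical choice made for illustration.

```python
import random
from statistics import NormalDist, fmean, pstdev

PHI = NormalDist()  # standard normal, used for the copula transform


def cholesky(a):
    """Lower-triangular factor of a symmetric positive (semi)definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = max(a[i][i] - s, 1e-12) ** 0.5
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def clamp(u, eps=1e-12):
    """Keep a probability strictly inside (0, 1) so inv_cdf is defined."""
    return min(max(u, eps), 1.0 - eps)


def gaussian_copula_eda(fitness, dim, pop_size=100, keep=0.3,
                        generations=30, bounds=(-5.0, 5.0), seed=0):
    """Minimize `fitness` over R^dim; returns the best point of the last population."""
    rng = random.Random(seed)
    pop = [[rng.uniform(*bounds) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # 1. Selection: keep the best fraction of the population.
        selected = sorted(pop, key=fitness)[:max(dim + 1, int(keep * pop_size))]
        # 2. Marginals: one normal distribution per variable.
        margins = []
        for j in range(dim):
            col = [x[j] for x in selected]
            margins.append(NormalDist(fmean(col), max(pstdev(col), 1e-9)))
        # 3. Copula: map to uniforms, then to normal scores, then correlate.
        scores = [[PHI.inv_cdf(clamp(margins[j].cdf(x[j]))) for j in range(dim)]
                  for x in selected]
        corr = [[fmean([s[i] * s[j] for s in scores]) for j in range(dim)]
                for i in range(dim)]
        low = cholesky(corr)
        # 4. Sampling: correlated normals -> uniforms -> marginal quantiles.
        pop = []
        for _ in range(pop_size):
            e = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            z = [sum(low[i][k] * e[k] for k in range(i + 1)) for i in range(dim)]
            pop.append([margins[j].inv_cdf(clamp(PHI.cdf(z[j]))) for j in range(dim)])
    return min(pop, key=fitness)
```

For example, `gaussian_copula_eda(lambda x: sum(v * v for v in x), dim=3)` searches for the minimum of the sphere function. With normal marginals the model is equivalent to a multivariate normal. The explicit transform steps are kept so that other marginals or copulas could be substituted in steps 2 to 4.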
Pair-copula constructions of multiple dependence
Building on the work of Bedford, Cooke and Joe, we show how multivariate data that exhibit complex patterns of dependence in the tails can be modelled using a cascade of pair-copulae, each acting on two variables at a time. We use the pair-copula decomposition of a general multivariate distribution and propose a method to perform inference. The model construction is hierarchical in nature, with the various levels corresponding to the incorporation of more variables in the conditioning sets, using pair-copulae as simple building blocks. Pair-copula decomposed models also represent a very flexible way to construct higher-dimensional copulae. We apply the methodology to a financial data set. Our approach represents the first step towards developing an unsupervised algorithm that explores the space of possible pair-copula models and that can also be applied to huge data sets automatically.
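For three variables, one standard form of the pair-copula decomposition can be written as follows. The notation is an assumption for this illustration: f_i and F_i are the marginal densities and distribution functions, c_{12} and c_{23} are bivariate copula densities, and c_{13|2} is the copula density of the pair (X_1, X_3) conditional on X_2.

```latex
\begin{aligned}
f(x_1, x_2, x_3) ={}& f_1(x_1)\, f_2(x_2)\, f_3(x_3) \\
&\times c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)\,
        c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr) \\
&\times c_{13|2}\bigl(F(x_1 \mid x_2), F(x_3 \mid x_2)\bigr)
\end{aligned}
```

The first two pair-copulae form the first level of the hierarchy, and the conditional pair-copula forms the second. In general c_{13|2} may depend on the value of x_2; fitted models often assume that it does not. Choosing a different conditioning variable gives a different but equally valid decomposition.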