Using parametric classification trees for model selection with applications to financial risk management
We describe two parametric classification tree methods, which allow formal selection of a member of a class of generalised distributions. In the paper we consider generalised Beta distributions for non-negative random variables and the generalised skew-Student distribution for random variables distributed on the real line. We introduce a class of symmetric generalised multivariate Student distributions, members of which may also be selected using the classification trees. We present two versions of the parametric classification tree: specific to general and general to specific. We apply the classification methods to daily returns on stocks from a selection of 15 major, mid-cap and emerging markets. The results show that the majority of return distributions follow Student's t, but that a non-negligible minority follow a symmetric generalised Student distribution. We confirm a well-known stylised fact about skewness: it tends not to be persistent. By contrast, kurtosis is persistent. Using the symmetric generalised multivariate Student distribution, we present a risk management study based on efficient portfolios constructed from UK FTSE 250 stocks and specifically concerned with the computation of value at risk (VaR). The case study demonstrates that the model selection procedures based on the classification trees lead to more accurate computation of VaR than those based on the normal distribution or on non-parametric approaches. The study also shows that the normal distribution may be used for VaR computations for larger portfolios when the holding period is longer.
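The comparison this abstract draws, parametric VaR under a heavy-tailed Student's t model versus a normal model, can be illustrated with a minimal sketch. The function names and parameter choices below are my own for illustration and are not taken from the paper; only standard SciPy quantile functions are used:

```python
from scipy.stats import norm, t

def var_normal(mu, sigma, alpha=0.01):
    """Parametric one-period VaR (reported as a positive loss)
    under a normal return model with mean mu and std dev sigma."""
    return -(mu + sigma * norm.ppf(alpha))

def var_student_t(mu, sigma, nu, alpha=0.01):
    """Parametric VaR under a Student's t return model with nu degrees
    of freedom.  The t quantile is rescaled so that sigma is the return
    standard deviation (finite only for nu > 2)."""
    scale = sigma * ((nu - 2.0) / nu) ** 0.5  # std dev -> t scale parameter
    return -(mu + scale * t.ppf(alpha, df=nu))

# With heavy tails (small nu), the 1% VaR exceeds the normal-model VaR,
# which is one way a misspecified normal model understates tail risk.
v_normal = var_normal(0.0, 0.02)
v_t = var_student_t(0.0, 0.02, nu=4)
```

The gap between the two figures shrinks as `nu` grows, consistent with the abstract's observation that the normal distribution can suffice in some settings.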
Alternating model trees
Model tree induction is a popular method for tackling regression problems requiring interpretable models. Model trees are decision trees with multiple linear regression models at the leaf nodes. In this paper, we propose a method for growing alternating model trees, a form of option tree for regression problems. The motivation is that alternating decision trees achieve high accuracy in classification problems because they represent an ensemble classifier as a single tree structure. As in alternating decision trees for classification, our alternating model trees for regression contain splitter and prediction nodes, but we use simple linear regression functions as opposed to constant predictors at the prediction nodes. Moreover, additive regression using forward stagewise modeling is applied to grow the tree rather than a boosting algorithm. The size of the tree is determined using cross-validation. Our empirical results show that alternating model trees achieve significantly lower squared error than standard model trees on several regression datasets.
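The growth strategy the abstract names (additive regression via forward stagewise modeling, with simple one-feature linear regressors fitted to the current residuals) can be sketched as below. This is an illustrative reconstruction of the general technique under my own assumptions, not the paper's tree-growing algorithm:

```python
import numpy as np

def forward_stagewise(X, y, n_stages=50, shrinkage=0.5):
    """Additive regression: at each stage, fit a simple (single-feature)
    linear regressor to the residuals, add a shrunken copy of it to the
    model, and update the residuals.  Returns (feature, slope, intercept)
    terms.  Illustrative sketch, not the paper's algorithm."""
    residual = y.astype(float).copy()
    terms = []
    for _ in range(n_stages):
        best = None
        for j in range(X.shape[1]):
            # least-squares fit of the residual on feature j alone
            slope, intercept = np.polyfit(X[:, j], residual, 1)
            pred = slope * X[:, j] + intercept
            sse = np.sum((residual - pred) ** 2)
            if best is None or sse < best[0]:
                best = (sse, j, slope, intercept, pred)
        _, j, slope, intercept, pred = best
        terms.append((j, shrinkage * slope, shrinkage * intercept))
        residual -= shrinkage * pred
    return terms

def predict(terms, X):
    """Sum the contributions of all stagewise terms."""
    out = np.zeros(X.shape[0])
    for j, slope, intercept in terms:
        out += slope * X[:, j] + intercept
    return out
```

The shrinkage factor plays the same regularising role here as in other stagewise additive schemes: each stage corrects only part of the remaining error.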
Classification under Streaming Emerging New Classes: A Solution using Completely Random Trees
This paper investigates an important problem in stream mining: classification under streaming emerging new classes (SENC). The common approach is to treat it as a classification problem and solve it using either a supervised learner or a semi-supervised learner. We propose an alternative approach that uses unsupervised learning as the basis for solving this problem. The SENC problem can be decomposed into three sub-problems: detecting emerging new classes, classifying known classes, and updating models to enable classification of instances of the new class and detection of further emerging new classes. The proposed method employs completely random trees, which have been shown to work well in unsupervised learning and supervised learning independently in the literature. To the best of our knowledge, this is the first time that completely random trees are used as a single common core to solve all three sub-problems: unsupervised learning, supervised learning and model update in data streams. We show that the proposed unsupervised-learning-focused method often achieves significantly better outcomes than existing classification-focused methods.
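A minimal sketch of the completely-random-tree idea: tree growth ignores the labels entirely (the split feature and cut point are random), while leaves retain class counts, so a single structure can both classify known classes and flag sparse regions as a possible emerging new class. The `min_leaf_size` thresholding rule below is my own assumption for illustration, not the paper's detection criterion:

```python
import random

def build_tree(X, y, depth=0, max_depth=8, rng=random):
    """Completely random tree: splits are chosen without looking at the
    labels; leaves store the class counts of the training instances that
    reach them.  Illustrative sketch only."""
    if depth >= max_depth or len(X) <= 1:
        return _leaf(y)
    f = rng.randrange(len(X[0]))
    lo = min(row[f] for row in X)
    hi = max(row[f] for row in X)
    if lo == hi:                       # feature constant here: stop
        return _leaf(y)
    cut = rng.uniform(lo, hi)
    left = [i for i in range(len(X)) if X[i][f] < cut]
    right = [i for i in range(len(X)) if X[i][f] >= cut]
    return {"feature": f, "cut": cut,
            "left": build_tree([X[i] for i in left], [y[i] for i in left],
                               depth + 1, max_depth, rng),
            "right": build_tree([X[i] for i in right], [y[i] for i in right],
                                depth + 1, max_depth, rng)}

def _leaf(y):
    counts = {}
    for label in y:
        counts[label] = counts.get(label, 0) + 1
    return {"counts": counts, "size": len(y)}

def classify(node, x, min_leaf_size=2):
    """Descend to a leaf; a sparsely populated leaf signals a possible
    emerging new class, otherwise return the majority known class."""
    while "feature" in node:
        node = node["left"] if x[node["feature"]] < node["cut"] else node["right"]
    if node["size"] < min_leaf_size:
        return "new-class?"
    return max(node["counts"], key=node["counts"].get)
```

In the paper's setting the model would also be updated once instances of the new class are confirmed; that third sub-problem is omitted from this sketch.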
A Nonparametric Ensemble Binary Classifier and its Statistical Properties
In this work, we propose an ensemble of classification trees (CT) and artificial neural networks (ANN). Several statistical properties, including universal consistency and an upper bound on an important parameter of the proposed classifier, are shown. Numerical evidence is also provided using various real-life data sets to assess the performance of the model. Our proposed nonparametric ensemble classifier does not suffer from the 'curse of dimensionality' and can be used in a wide variety of combined feature selection and classification problems. The performance of the proposed model compares favourably with many other state-of-the-art models used for similar situations.
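One simple way to combine heterogeneous base learners such as a classification tree and a neural network is to average their class-probability outputs and take the argmax. The sketch below illustrates that generic combination rule only; it is not the specific ensemble construction of the paper:

```python
def ensemble_predict(classifiers, x):
    """Average the class-probability outputs of the base learners and
    return the class with the highest average probability.
    classifiers: callables mapping x -> {class: probability}.
    Generic probability-averaging sketch, not the paper's method."""
    totals = {}
    for clf in classifiers:
        for cls, p in clf(x).items():
            totals[cls] = totals.get(cls, 0.0) + p
    return max(totals, key=totals.get)
```

Averaging probabilities rather than hard votes lets a confident base learner outweigh an uncertain one.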
Riparian vegetation classification from airborne laser scanning data with an emphasis on cottonwood trees
The high point density of airborne laser mapping systems enables achieving a detailed description of geographic objects and the terrain. Growing experience indicates, however, that extracting useful information directly from the data can be difficult. In this study, small-footprint lidar data were used to differentiate between young, mature, and old cottonwood trees in the San Pedro River Basin near Benson, Arizona, USA. The lidar data were acquired in June 2003, using the Optech Incorporated ALTM 1233 (Optech Incorporated, Toronto, Ont.), during flyovers conducted at an altitude of 750 m. The lidar data were preprocessed to create a two-band image of the study site: a high-accuracy canopy altitude model band, and a near-infrared intensity band. These lidar-derived images provided the basis for supervised classification of cottonwood age categories, using a maximum likelihood algorithm. The results of classification illustrate the potential of airborne lidar data to differentiate age classes of cottonwood trees in riparian areas quickly and accurately. © 2006, Taylor & Francis Group, LLC. All rights reserved.
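The supervised maximum likelihood step can be sketched as fitting a Gaussian per class to the two lidar-derived bands (canopy height and near-infrared intensity) and assigning each pixel to the class with the highest log-likelihood. All names and the sample feature values below are illustrative, not from the study:

```python
import numpy as np

def fit_ml_classifier(samples):
    """Estimate per-class mean and covariance from training pixels.
    samples: {class_name: (n, d) array of band vectors}.
    Illustrative Gaussian maximum likelihood classifier."""
    params = {}
    for cls, pixels in samples.items():
        X = np.asarray(pixels, dtype=float)
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        # store the inverse and log-determinant needed by the log-likelihood
        params[cls] = (mu, np.linalg.inv(cov), np.linalg.slogdet(cov)[1])
    return params

def classify_pixel(params, x):
    """Assign pixel x to the class maximizing the Gaussian log-likelihood
    (constant terms dropped, as they are equal across classes)."""
    x = np.asarray(x, dtype=float)
    best_cls, best_ll = None, -np.inf
    for cls, (mu, cov_inv, logdet) in params.items():
        d = x - mu
        ll = -0.5 * (logdet + d @ cov_inv @ d)
        if ll > best_ll:
            best_cls, best_ll = cls, ll
    return best_cls
```

Because covariances differ per class, the decision boundaries are quadratic rather than linear, which suits bands with class-dependent spread.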
Investigating Evaluation Measures in Ant Colony Algorithms for Learning Decision Tree Classifiers
Ant-Tree-Miner is a decision tree induction algorithm based on the Ant Colony Optimization (ACO) metaheuristic. Ant-Tree-Miner-M is a recently introduced extension of Ant-Tree-Miner that learns multi-tree classification models. A multi-tree model consists of multiple decision trees, one for each class value, where each class-based decision tree is responsible for discriminating between its class value and all other values present in the class domain (one-vs-all). In this paper, we investigate the use of 10 different classification quality evaluation measures in Ant-Tree-Miner-M, which are used for both candidate model evaluation and model pruning. Our experimental results, using 40 popular benchmark datasets, identify several quality functions that substantially improve on the simple Accuracy quality function that was previously used in Ant-Tree-Miner-M.
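The one-vs-all combination a multi-tree model uses at prediction time can be sketched generically: each per-class tree scores how confident it is that the instance belongs to its own class, and the most confident tree's class wins. Reducing each tree to a callable returning a confidence, and the depth-1 stub trees below, are assumptions for illustration only:

```python
def multi_tree_predict(class_trees, instance):
    """One-vs-all resolution: the class whose dedicated tree assigns the
    instance the highest confidence wins.
    class_trees: {class_value: callable(instance) -> confidence in [0, 1]}.
    Generic sketch of multi-tree prediction, not Ant-Tree-Miner-M itself."""
    return max(class_trees, key=lambda cls: class_trees[cls](instance))

def stump(feature, threshold, p_below, p_above):
    """A one-split stand-in for a per-class decision tree, returning the
    confidence that the instance belongs to the tree's class."""
    def tree(x):
        return p_below if x[feature] <= threshold else p_above
    return tree
```

A quality measure plugged into the induction loop would change how each per-class tree is built and pruned, but not this final argmax step.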