
    VC-dimension of univariate decision trees

    PubMed ID: 25594983
    In this paper, we give and prove lower bounds on the Vapnik-Chervonenkis (VC) dimension of the univariate decision tree hypothesis class. The VC-dimension of a univariate decision tree depends on the VC-dimension values of its subtrees and on the number of inputs. Via a search algorithm that calculates the VC-dimension of univariate decision trees exhaustively, we show that our VC-dimension bounds are tight for simple trees. To verify that the VC-dimension bounds are useful, we also use them to obtain VC-generalization bounds for complexity control via structural risk minimization (SRM) in decision trees, i.e., pruning. Our simulation results show that SRM pruning using the VC-dimension bounds finds trees that are as accurate as those pruned using cross-validation.
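    A minimal sketch of how a VC-based generalization bound can drive pruning, assuming the classical Vapnik-style bound form; the paper's exact bound and constants may differ, and `vc_dim` here stands in for the subtree VC-dimension estimates the paper derives. All names are illustrative.

```python
import math

def vc_generalization_bound(train_error, vc_dim, n, confidence=0.05):
    """Classical Vapnik-style upper bound on true risk (illustrative form).

    train_error : empirical error of the (sub)tree
    vc_dim      : estimated VC-dimension of the subtree hypothesis class
    n           : number of training examples reaching this node
    """
    # Penalty grows with VC-dimension and shrinks with sample size.
    epsilon = math.sqrt(
        (vc_dim * (math.log(2.0 * n / vc_dim) + 1.0)
         - math.log(confidence / 4.0)) / n
    )
    return train_error + epsilon

def srm_prune(subtree_error, subtree_vc, leaf_error, leaf_vc, n):
    """Replace a subtree by a leaf only if the leaf's risk bound is no worse."""
    keep_bound = vc_generalization_bound(subtree_error, subtree_vc, n)
    prune_bound = vc_generalization_bound(leaf_error, leaf_vc, n)
    return prune_bound <= keep_bound  # True -> prune subtree to a leaf
```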

    Model selection in omnivariate decision trees using Structural Risk Minimization

    As opposed to trees that use a single type of decision node, an omnivariate decision tree contains nodes of different types. We propose to use Structural Risk Minimization (SRM) to choose between node types in omnivariate decision tree construction, matching the complexity of a node to the complexity of the data reaching that node. In order to apply SRM for model selection, one needs the VC-dimension of the candidate models. In this paper, we first derive the VC-dimension of the univariate model, and estimate the VC-dimensions of all three models (univariate, linear multivariate, and quadratic multivariate) experimentally. Second, we compare SRM with other model selection techniques, including Akaike's Information Criterion (AIC), the Bayesian Information Criterion (BIC), and cross-validation (CV), on standard datasets from the UCI and Delve repositories. We see that SRM induces omnivariate trees that have a small percentage of multivariate nodes close to the root and that generalize at least as accurately as those constructed using other model selection techniques. The authors thank the three anonymous referees and the editor for their constructive comments, pointers to related literature, and pertinent questions, which allowed us to better situate our work as well as organize the manuscript and improve the presentation. This work has been supported by the Turkish Scientific and Technical Research Council, TUBITAK EEEAG 107E127.
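    As a brief sketch of the per-node model-selection step, the code below scores hypothetical candidate node models by AIC and BIC from their log-likelihoods and parameter counts; the SRM alternative would substitute a VC-dimension-based bound. The candidate list and parameter counts are illustrative placeholders, not the paper's values.

```python
import math

def aic(log_likelihood, k):
    """Akaike's Information Criterion: 2k - 2 ln L, lower is better."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L, penalizes k more as n grows."""
    return k * math.log(n) - 2 * log_likelihood

def select_node_model(candidates, n):
    """candidates: list of (name, log_likelihood, num_params); pick min BIC."""
    return min(candidates, key=lambda c: bic(c[1], c[2], n))[0]

# Hypothetical candidates at a node with n = 200 samples and d = 5 features:
# univariate split, linear multivariate (d + 1 coefficients), and quadratic
# multivariate (extra pairwise terms). Log-likelihoods here are made up.
candidates = [("univariate", -120.0, 2),
              ("linear", -105.0, 6),
              ("quadratic", -100.0, 21)]
print(select_node_model(candidates, n=200))
```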

    Conditional Sum-Product Networks: Imposing Structure on Deep Probabilistic Architectures

    Probabilistic graphical models are a central tool in AI; however, they are generally not as expressive as deep neural models, and inference is notoriously hard and slow. In contrast, deep probabilistic models such as sum-product networks (SPNs) capture joint distributions in a tractable fashion, but still lack the expressive power of intractable models based on deep neural networks. Therefore, we introduce conditional SPNs (CSPNs), conditional density estimators for multivariate and potentially hybrid domains which allow harnessing the expressive power of neural networks while still maintaining tractability guarantees. One way to implement CSPNs is to use an existing SPN structure and condition its parameters on the input, e.g., via a deep neural network. This approach, however, might misrepresent the conditional independence structure present in data. Consequently, we also develop a structure-learning approach that derives both the structure and parameters of CSPNs from data. Our experimental evidence demonstrates that CSPNs are competitive with other probabilistic models and yield superior performance on multilabel image classification compared to mean field and mixture density networks. Furthermore, they can successfully be employed as building blocks for structured probabilistic models, such as autoregressive image models.
    Comment: 13 pages, 6 figures
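    A toy numpy sketch of the first construction mentioned above: a tiny fixed SPN (one sum node over two product nodes over Gaussian leaves) whose mixing weights and leaf parameters are produced by a function of the conditioning input x. The one-layer "gating network" here is an assumption for illustration, not the authors' architecture.

```python
import numpy as np

def gaussian_logpdf(y, mu, sigma):
    """Elementwise log-density of independent Gaussian leaves."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - (y - mu)**2 / (2 * sigma**2)

def gating_net(x, W, b):
    """Hypothetical parameter network: maps input x to all SPN parameters."""
    h = np.tanh(W @ x + b)
    w_logits = h[:2]                          # sum-node mixing weights
    mus = h[2:6].reshape(2, 2)                # leaf means: 2 products x 2 dims
    sigmas = np.exp(h[6:10]).reshape(2, 2)    # leaf std-devs, kept positive
    return w_logits, mus, sigmas

def cspn_log_density(y, x, W, b):
    """log p(y | x) for a 2-component conditional sum-product network."""
    w_logits, mus, sigmas = gating_net(x, W, b)
    log_w = w_logits - np.logaddexp(*w_logits)        # softmax in log-space
    # Product nodes: independence across dims -> sum of leaf log-densities.
    log_prods = gaussian_logpdf(y, mus, sigmas).sum(axis=1)
    # Sum node: weighted mixture in log-space; evaluation stays tractable.
    return np.logaddexp(log_w[0] + log_prods[0], log_w[1] + log_prods[1])

rng = np.random.default_rng(0)
W, b = rng.normal(size=(10, 3)), rng.normal(size=10)
print(cspn_log_density(np.zeros(2), np.ones(3), W, b))
```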

    Bregman Voronoi Diagrams: Properties, Algorithms and Applications

    The Voronoi diagram of a finite set of objects is a fundamental geometric structure that subdivides the embedding space into regions, each region consisting of the points that are closer to a given object than to the others. We may define many variants of Voronoi diagrams depending on the class of objects, the distance functions, and the embedding space. In this paper, we investigate a framework for defining and building Voronoi diagrams for a broad class of distance functions called Bregman divergences. Bregman divergences include not only the traditional (squared) Euclidean distance but also various divergence measures based on entropic functions. Accordingly, Bregman Voronoi diagrams allow one to define information-theoretic Voronoi diagrams in statistical parametric spaces based on the relative entropy of distributions. We define several types of Bregman diagrams, establish correspondences between those diagrams (using the Legendre transformation), and show how to compute them efficiently. We also introduce extensions of these diagrams, e.g., k-order and k-bag Bregman Voronoi diagrams, and introduce Bregman triangulations of a set of points and their connection with Bregman Voronoi diagrams. We show that these triangulations capture many of the properties of the celebrated Delaunay triangulation. Finally, we give some applications of Bregman Voronoi diagrams which are of interest in the context of computational geometry and machine learning.
    Comment: Extends the proceedings abstract of SODA 2007 (46 pages, 15 figures)
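    For concreteness, the Bregman divergence generated by a strictly convex, differentiable F is D_F(x, y) = F(x) - F(y) - <x - y, grad F(y)>. The sketch below instantiates it for the squared Euclidean and Kullback-Leibler cases and assigns a query point to its (first-type) Bregman Voronoi cell by nearest divergence; function names are illustrative.

```python
import numpy as np

def bregman(x, y, F, gradF):
    """Bregman divergence D_F(x, y) = F(x) - F(y) - <x - y, gradF(y)>."""
    return F(x) - F(y) - np.dot(x - y, gradF(y))

# Squared Euclidean: F(x) = ||x||^2 yields D_F(x, y) = ||x - y||^2.
sq_euclid = dict(F=lambda x: np.dot(x, x), gradF=lambda x: 2 * x)

# Negative entropy F(x) = sum x_i log x_i yields the Kullback-Leibler
# divergence sum x_i log(x_i / y_i) on normalized distributions.
kl = dict(F=lambda x: np.sum(x * np.log(x)), gradF=lambda x: np.log(x) + 1)

def voronoi_cell(query, sites, **div):
    """Index of the site whose first-type Bregman cell contains the query."""
    return int(np.argmin([bregman(query, s, **div) for s in sites]))

sites = [np.array([0.2, 0.8]), np.array([0.5, 0.5]), np.array([0.9, 0.1])]
q = np.array([0.3, 0.7])
print(voronoi_cell(q, sites, **sq_euclid), voronoi_cell(q, sites, **kl))
```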

    Predicting Bankruptcy with Support Vector Machines

    The purpose of this work is to introduce one of the most promising among recently developed statistical techniques, the support vector machine (SVM), to corporate bankruptcy analysis. An SVM is implemented for analysing predictors such as financial ratios. A method of adapting it to default probability estimation is proposed, and a survey of practically applied methods is given. This work shows that support vector machines are capable of extracting useful information from financial data, although extensive data sets are required in order to fully utilize their classification power.
    Keywords: support vector machine, classification method, statistical learning theory, electric load prediction, optical character recognition, predicting bankruptcy, risk classification
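    A minimal scikit-learn sketch of the setup the abstract describes: an SVM classifier trained on financial ratios to separate defaulting from solvent firms. The synthetic data, the ratio names, and the RBF-kernel choice are assumptions for illustration, not the paper's configuration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n = 400
# Hypothetical financial ratios, e.g. leverage, profitability, liquidity.
X = rng.normal(size=(n, 3))
# Synthetic default labels loosely driven by high leverage / low profitability.
y = (X[:, 0] - X[:, 1] + 0.5 * rng.normal(size=n) > 0.8).astype(int)

# Feature scaling matters for SVMs; an RBF kernel permits non-linear boundaries.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```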

    Learning Machines Supporting Bankruptcy Prediction

    In many economic applications it is desirable to make predictions about the future financial status of a company, mainly whether the company will default or not. A support vector machine (SVM) is one learning method which uses historical data to establish a classification rule called a score. Companies with scores above zero belong to one group and the rest to another. Probability of default (PD) values can be estimated from the scores provided by an SVM. The transformation used in this paper is a combination of weighting ranks and of smoothing the results using the PAV algorithm; the resulting conversion is monotone. This discussion paper is based on the Creditreform database from 1997 to 2002. The indicator variables were converted to financial ratios; it transpired that eight of the 25 were useful for training the SVM. The results showed that those ratios relate to activity, profitability, liquidity, and leverage. Finally, we conclude that SVMs are capable of extracting the necessary information from financial balance sheets and of predicting the future solvency or insolvency of a company. Banks in particular will benefit from these results, which allow them to be more aware of their risk when lending money.
    Keywords: Support Vector Machine, Bankruptcy, Default Probabilities Prediction, Profitability
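    The score-to-PD smoothing step can be sketched with scikit-learn's IsotonicRegression, which implements the PAV algorithm, fitting a monotone non-decreasing map from SVM scores to observed default frequencies. The data below is synthetic, and the paper's rank-weighting step is omitted for brevity.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(7)
n = 500
scores = rng.normal(size=n)                       # hypothetical SVM scores
# Synthetic defaults: higher scores drawn to default more often.
defaults = (rng.random(n) < 1 / (1 + np.exp(-2 * scores))).astype(float)

# PAV via isotonic regression: a monotone map score -> estimated PD in [0, 1].
pav = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
pav.fit(scores, defaults)

new_scores = np.array([-2.0, 0.0, 2.0])
print("estimated PDs:", pav.predict(new_scores))
```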