655 research outputs found

    Evaluation of the Performance of the Markov Blanket Bayesian Classifier Algorithm

    The Markov Blanket Bayesian Classifier (MBBC) is a recently proposed algorithm for constructing probabilistic classifiers. This paper presents an empirical comparison of the MBBC algorithm with three other Bayesian classifiers: Naive Bayes, Tree-Augmented Naive Bayes and a general Bayesian network. All of these are implemented using the K2 framework of Cooper and Herskovits. The classifiers are compared in terms of their performance (using simple accuracy measures and ROC curves) and speed, on a range of standard benchmark data sets. It is concluded that MBBC is competitive in terms of speed and accuracy with the other algorithms considered. Comment: 9 pages; Technical Report No. NUIG-IT-011002, Department of Information Technology, National University of Ireland, Galway (2002).

    Context-specific independence in graphical models

    The theme of this thesis is context-specific independence in graphical models. In a system of stochastic variables, the variables are often dependent on each other. This can, for instance, be seen by measuring the covariance between a pair of variables. Graphical models make it possible to visualize the dependence structure found in a set of stochastic variables. With ordinary graphical models, such as Markov networks, Bayesian networks, and Gaussian graphical models, the dependencies that can be modeled are limited to marginal and conditional (in)dependencies. The models introduced in this thesis enable the graphical representation of context-specific independencies, i.e. conditional independencies that hold only in a subset of the outcome space of the conditioning variables. The articles included in this thesis introduce several types of graphical models that can represent context-specific independencies. Models for both discrete and continuous variables are considered. A wide range of properties is examined for the introduced models, including identifiability, robustness, scoring, and optimization. One article introduces a predictive classifier that utilizes context-specific independence models. This classifier clearly demonstrates the potential benefits of the introduced models. The purpose of the material included in the thesis prior to the articles is to provide the basic theory needed to understand the articles.

    Summary (translated from Swedish): The theme of the thesis is context-specific independence in graphical models. In probability theory and statistics, a stochastic variable is a variable that is affected by chance. Unlike an ordinary mathematical variable, a stochastic variable takes a given value with a certain probability. For a set of stochastic variables it is generally the case that the variables depend on each other. The degree of dependence can, for example, be measured by the covariance between two variables. Graphical models make it possible to visualize the dependence structure of a system of stochastic variables. Traditional graphical models such as Markov networks, Bayesian networks and Gaussian graphical models make it possible to visualize marginal and conditional independence. The models introduced in this thesis enable a graphical representation of context-specific independence, i.e. conditional independence that holds only in a subset of the outcome space of the conditioning variables. The articles included in the thesis introduce several types of graphical models that can represent context-specific independencies. Both discrete and continuous systems are treated. Many properties of these models are examined, including identifiability, stability, model comparison and optimization. One article introduces a predictive classifier that exploits context-specific independence in graphical models. This classifier clearly shows how the use of context-specific independencies can lead to improved results in practical applications.

    TPDA2 Algorithm for Learning BN Structure From Missing Value and Outliers in Data Mining

    The Three-Phase Dependency Analysis (TPDA) algorithm was shown to be the most efficient algorithm, requiring at most O(N^4) conditional independence (CI) tests. By integrating TPDA with a node topological sort algorithm, it can be used to learn Bayesian network (BN) structure from data with missing values (the TPDA1 algorithm). Outliers can then be reduced by applying an outlier detection and removal algorithm as a pre-processing step for TPDA1. The proposed TPDA2 algorithm combines these ideas: outlier detection and removal, TPDA, and node topological sort.

    Supervised machine learning algorithms for the estimation of the probability of default in corporate credit risk

    This thesis investigates the application of non-linear supervised machine learning algorithms for estimating the Probability of Default (PD) of corporate clients. To achieve this, the thesis is separated into three experiments: 1. The first experiment investigates a wrapper feature selection method and its application to support vector machines (SVMs) and logistic regression (LR). The logistic regression model is the most popular approach for estimating PD in a rich default portfolio, but other approaches to PD estimation are available. The SVM method is compared to the logistic regression model using the proposed feature selection method. 2. The second experiment investigates the application of artificial neural networks (ANNs) for estimating PD of corporate clients. In particular, ANNs are regularized and trained with both classical and Bayesian approaches. Furthermore, different network architectures are explored, and Bayesian estimation and regularization are compared to classical estimation and regularization. 3. The third experiment investigates the k-Nearest Neighbours algorithm (KNNs). This algorithm is trained using both Bayesian and classical methods. KNNs could be efficiently applied to estimating PD. In addition, other supervised machine learning algorithms such as Decision trees (DTs), Linear discriminant analysis (LDA) and Naive Bayes (NB) were applied, and their performance was summarized and compared to that of the SVMs, ANNs, KNNs and logistic regression. The contribution of this thesis to science is to provide efficient and at the same time applicable methods for estimating PD of corporate clients. This thesis contributes to the existing literature in a number of ways. 1. First, this research proposes an innovative feature selection method for SVMs. 2. Second, this research proposes innovative Bayesian estimation methods to regularize ANNs. 3. Third, this research proposes innovative Bayesian approaches to the estimation of KNNs. The objective of the research is to promote the use of Bayesian non-linear supervised machine learning methods, which are currently not heavily applied in the industry for PD estimation of corporate clients.

    Quantum computing for finance

    Quantum computers are expected to surpass the computational capabilities of classical computers and have a transformative impact on numerous industry sectors. We present a comprehensive summary of the state of the art of quantum computing for financial applications, with particular emphasis on stochastic modeling, optimization, and machine learning. This Review is aimed at physicists, so it outlines the classical techniques used by the financial industry and discusses the potential advantages and limitations of quantum techniques. Finally, we look at the challenges that physicists could help tackle