4 research outputs found

    A Comparison of Decision Tree Classifiers for Automatic Diagnosis of Speech Recognition Errors

    Current speech recognition systems are becoming more complex owing to technological advances, optimizations, and special requirements such as small computation and memory footprints. Proper handling of system failures can be seen as a kind of fault diagnosis. Motivated by the success of decision tree diagnosis in other scientific fields and by the successful application of decision trees in speech recognition over the last decade, we contribute to the topic mainly by comparing different types of decision trees. Five styles are examined: CART (tested with three different splitting criteria), C4.5, Minimum Message Length (MML), strict MML, and Bayesian-style decision trees. We apply these techniques to data from a computer speech recognition system fed by intrinsically variable speech. We conclude that, for this task, the CART technique outperforms C4.5 at classifying failures of automatic speech recognition (ASR).
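The abstract does not name the three splitting criteria tested with CART. As a rough illustration of what a splitting criterion is, the sketch below scores one candidate split with two widely used impurity measures, Gini impurity and Shannon entropy. The choice of these two measures, the function names, and the toy labels are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def split_gain(parent, left, right, impurity):
    """Impurity decrease obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# Hypothetical toy data: recognition outcomes before and after a candidate split.
parent = ["ok"] * 4 + ["error"] * 4
left = ["ok"] * 3 + ["error"]
right = ["ok"] + ["error"] * 3

gini_gain = split_gain(parent, left, right, gini)
entropy_gain = split_gain(parent, left, right, entropy)
print(gini_gain, entropy_gain)
```

A tree learner evaluates many candidate splits this way and keeps the one with the largest gain. Different criteria can rank the same candidates differently, which is one reason tree styles may disagree on the same data.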

    New Graphical Model for Computing Optimistic Decisions in Possibility Theory Framework

    This paper first proposes a new graphical model for decision making under uncertainty based on min-based possibilistic networks. A decision problem under uncertainty is described by two distinct min-based possibilistic networks: the first expresses the agent's knowledge, while the second encodes the agent's preferences as a qualitative utility. We then propose an efficient algorithm for computing optimistic optimal decisions with this model. We show that computing optimal decisions comes down to computing a normalization degree of the junction tree associated with the graph that results from fusing the agent's beliefs and preferences. The paper also proposes an alternative way of computing optimal optimistic decisions. The idea is to transform the two possibilistic networks into two equivalent possibilistic logic knowledge bases, one representing the agent's knowledge and the other representing the agent's preferences. We show that computing an optimal optimistic decision comes down to computing the inconsistency degree of the union of the two possibilistic bases augmented with a given decision.
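For orientation, the optimistic criterion of qualitative possibilistic decision theory can be written as u*(d) = max over states of min(possibility of the state under d, utility of the state). The sketch below evaluates that formula by brute-force enumeration over a tiny, explicitly listed state space. It is not the junction-tree or logic-based algorithm described in the abstract, and all names and numbers are hypothetical.

```python
def optimistic_utility(possibility, utility):
    """Optimistic qualitative utility: max over states of min(possibility, utility)."""
    return max(min(possibility[state], utility[state]) for state in possibility)


def best_optimistic_decision(knowledge, utility):
    """Return the decision with the highest optimistic utility, plus all scores.

    `knowledge` maps each decision to a possibility distribution over states
    (values in [0, 1]); `utility` maps each state to a preference degree in [0, 1].
    """
    scores = {decision: optimistic_utility(pi, utility)
              for decision, pi in knowledge.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical example: two decisions, three outcome states.
knowledge = {
    "d1": {"good": 0.3, "fair": 1.0, "bad": 0.5},
    "d2": {"good": 1.0, "fair": 0.4, "bad": 0.2},
}
utility = {"good": 1.0, "fair": 0.6, "bad": 0.0}

best, scores = best_optimistic_decision(knowledge, utility)
print(best, scores)
```

Enumeration like this grows exponentially with the number of variables, which is the cost that graphical or logic-based computations of the kind the abstract describes aim to avoid.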

    Decision tree classifiers for incident call data sets

    Information technology (IT) has become one of the key technologies for economic and social development in any organization. The management of IT incidents, and in particular resolving them quickly, is therefore a concern for IT managers. Delays can result when incorrect subjects are assigned to IT incident calls, because the person sent to remedy the problem has the wrong expertise or has not brought the software or hardware needed to help the user. In the case study used for this work, there are no management checks in place to verify the subjects assigned to incident descriptions. This research aims to develop a method that tackles the problem of wrongly assigned subjects for incident descriptions. In particular, the study uses the IT incident calls database of an oil and gas company as a case study. The approach was to explore the IT incident descriptions and their assigned subjects; the correctly assigned records were then used to train decision tree classification algorithms with the Waikato Environment for Knowledge Analysis (WEKA) software. Finally, the records to which human operators had assigned an incorrect subject were used for testing. The J48 algorithm gave the best performance and accuracy, and correctly assigned subjects to 81% of the records wrongly classified by human operators.

    Statistical Methods to Enhance Clinical Prediction with High-Dimensional Data and Ordinal Response

    Technological progress now makes it possible to study the molecular configuration of single cells or whole tissue samples. Such high-dimensional omics data from molecular biology, produced in large quantities, can be generated at ever lower cost and are therefore used increasingly often in clinical settings. Personalized diagnosis and the prediction of treatment success on the basis of such high-throughput data are a modern application of machine learning techniques. In practice, clinical parameters such as health status or the side effects of a therapy are often recorded on an ordinal scale (for example good, normal, bad). It is common to treat classification problems with an ordinal endpoint as general multi-class problems and thus to ignore the information contained in the ordering of the classes. However, neglecting this information can reduce classification performance or even produce an unfavorable unordered classification. Classical approaches that model an ordinal endpoint directly, such as a cumulative link model, typically cannot be applied to high-dimensional data. In this work we present hierarchical twoing (hi2), an algorithm for classifying high-dimensional data into ordinally scaled categories. hi2 exploits the power of well-understood binary classification to classify into ordinal categories as well. An open-source implementation of hi2 is available online. In a comparison study on the classification of real and simulated data with an ordinal endpoint, established methods designed specifically for ordered categories do not generally produce better results than state-of-the-art non-ordinal classifiers. An algorithm's ability to handle high-dimensional data dominates classification performance. We show that our algorithm hi2 achieves consistently good results and in many cases outperforms the other methods.