A Comparison of Decision Tree Classifiers for Automatic Diagnosis of Speech Recognition Errors
Current speech recognition systems are growing more complex owing to technological advances, optimizations, and special requirements such as small computational and memory footprints. Proper handling of system failures can be seen as a kind of fault diagnosis. Motivated by the success of decision tree diagnosis in other scientific fields and by the successful application of decision trees in speech recognition over the last decade, we contribute to the topic mainly by comparing different types of decision trees. Five styles are examined: CART (testing three different splitting criteria), C4.5, Minimum Message Length (MML), strict MML, and Bayesian-style decision trees. We apply these techniques to data from automatic speech recognition (ASR) fed with intrinsically variable speech. We conclude that, for this task, the CART technique outperforms C4.5, giving better classification of ASR failures.
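The splitting criteria mentioned above can be illustrated with a minimal sketch. The abstract does not name the three criteria tested with CART, so the Gini index and entropy used here are assumptions (two common choices), and the labels are hypothetical toy data rather than the paper's ASR data.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def impurity_decrease(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted

# Hypothetical labels for recognized utterances (not taken from the paper).
parent = ["ok", "ok", "ok", "error", "error", "error"]
left = ["ok", "ok", "ok", "error"]   # utterances sent to the left child
right = ["error", "error"]           # utterances sent to the right child

gini_gain = impurity_decrease(parent, left, right, gini)
entropy_gain = impurity_decrease(parent, left, right, entropy)
print(f"Gini decrease: {gini_gain:.3f}, entropy decrease: {entropy_gain:.3f}")
```

A tree learner would evaluate such a score for every candidate split and keep the one with the largest decrease; different criteria can rank the same candidate splits differently, which is one reason to compare them.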
New Graphical Model for Computing Optimistic Decisions in Possibility Theory Framework
This paper first proposes a new graphical model for decision making under uncertainty based on min-based possibilistic networks. A decision problem under uncertainty is described by means of two distinct min-based possibilistic networks: the first expresses the agent's knowledge, while the second encodes the agent's preferences, representing a qualitative utility. We then propose an efficient algorithm for computing optimistic optimal decisions using our new model. We show that computing optimal decisions comes down to computing a normalization degree of the junction tree associated with the graph resulting from the fusion of the agent's beliefs and preferences. This paper also proposes an alternative way of computing optimal optimistic decisions. The idea is to transform the two possibilistic networks into two equivalent possibilistic logic knowledge bases, one representing the agent's knowledge and the other representing the agent's preferences. We show that computing an optimal optimistic decision comes down to computing the inconsistency degree of the union of the two possibilistic bases augmented with a given decision.
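The optimistic criterion can be made concrete with a small brute-force sketch. It assumes the standard qualitative optimistic utility, the maximum over states of min(possibility, utility), and enumerates explicit distributions; it is not the junction-tree or possibilistic-logic algorithm the paper proposes. All names and numbers below are hypothetical.

```python
def optimistic_utility(pi, mu):
    """Max over states of min(pi(state), mu(state)).

    This equals the normalization degree of the min-based combination
    of the knowledge distribution pi and the preference distribution mu.
    """
    return max(min(pi[s], mu[s]) for s in pi)

def best_optimistic_decision(pi, mu_by_decision):
    """Score every decision and return the best one with all scores."""
    scores = {d: optimistic_utility(pi, mu) for d, mu in mu_by_decision.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical knowledge: possibility degree of each state of the world.
pi = {"rain": 1.0, "dry": 0.4}

# Hypothetical preferences: qualitative utility of each state per decision.
mu_by_decision = {
    "umbrella": {"rain": 0.8, "dry": 0.5},
    "no_umbrella": {"rain": 0.0, "dry": 1.0},
}

best, scores = best_optimistic_decision(pi, mu_by_decision)
print(best, scores)
```

Enumeration is exponential in the number of variables, which is why compact representations such as networks and junction trees matter for realistic problems.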
Decision tree classifiers for incident call data sets
Information technology (IT) has become one of the key technologies for economic and social development in any organization. The management of IT incidents, and in particular their rapid resolution, is therefore a concern for IT managers. Delays can result when incorrect subjects are assigned to IT incident calls, because the person sent to remedy the problem has the wrong expertise or has not brought the software or hardware needed to help the user. In the case study used for this work, there are no management checks in place to verify the subjects assigned to incident descriptions. This research aims to develop a method that tackles the problem of wrongly assigned subjects for incident descriptions. In particular, this study explores the IT incident calls database of an oil and gas company as a case study. The approach was to explore the IT incident descriptions and their assigned subjects; the correctly assigned records were then used to train decision tree classification algorithms using the Waikato Environment for Knowledge Analysis (WEKA) software. Finally, the records incorrectly assigned a subject by human operators were used for testing. The J48 algorithm gave the best performance and accuracy, and was able to correctly assign subjects to 81% of the records wrongly classified by human operators.
Statistical Methods to Enhance Clinical Prediction with High-Dimensional Data and Ordinal Response
Technological progress now makes it possible to study the molecular configuration of individual cells or whole tissue samples. Such high-dimensional omics data from molecular biology, produced in large quantities, can be generated at ever lower cost and are therefore used increasingly often in clinical settings. Personalized diagnosis, or the prediction of treatment success based on such high-throughput data, is a modern application of machine learning techniques. In practice, clinical parameters such as health status or the side effects of a therapy are often recorded on an ordinal scale (for example good, normal, poor). It is common to treat classification problems with an ordinal endpoint as general multiclass problems, thereby ignoring the information contained in the ordering of the classes. However, neglecting this information can reduce classification performance or even produce an unfavorable unordered classification. Classical approaches that model an ordinal endpoint directly, such as the cumulative link model, typically cannot be applied to high-dimensional data. In this work we present hierarchical twoing (hi2), an algorithm for classifying high-dimensional data into ordinally scaled categories. hi2 draws on the power of well-understood binary classification to classify into ordinal categories as well. An open-source implementation of hi2 is available online. In a comparative study on the classification of both real and simulated data with an ordinal endpoint, established methods designed specifically for ordered categories do not generally produce better results than state-of-the-art non-ordinal classifiers. An algorithm's ability to handle high-dimensional data dominates classification performance. We show that our algorithm hi2 achieves consistently good results and in many cases outperforms the other methods.
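To illustrate how binary classifiers can be combined to predict ordered categories, here is a generic sketch that recursively splits the ordered class list into a lower and an upper group and trains one binary classifier per split. The abstract gives no algorithmic details of hi2, so this is not the hi2 implementation: the midpoint split, the scalar nearest-centroid classifier, and the toy data are all assumptions made for illustration.

```python
def train_centroid(xs, ys):
    """Hypothetical stand-in binary classifier: nearest centroid on a scalar feature."""
    means = {}
    for label in (0, 1):
        vals = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(vals) / len(vals)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))

def train_ordinal(xs, ys, classes):
    """Build a tree of binary classifiers over the ordered list `classes`."""
    if len(classes) == 1:
        return classes[0]                      # leaf: a single category remains
    mid = len(classes) // 2                    # assumed split point (midpoint)
    lower, upper = classes[:mid], classes[mid:]
    keep = [(x, y) for x, y in zip(xs, ys) if y in classes]
    node_xs = [x for x, _ in keep]
    node_ys = [y for _, y in keep]
    binary = [0 if y in lower else 1 for y in node_ys]
    clf = train_centroid(node_xs, binary)
    return (clf,
            train_ordinal(node_xs, node_ys, lower),
            train_ordinal(node_xs, node_ys, upper))

def predict_ordinal(model, x):
    """Walk down the tree until a single category is reached."""
    while isinstance(model, tuple):
        clf, low, high = model
        model = high if clf(x) == 1 else low
    return model

# Hypothetical one-dimensional data with an ordinal endpoint.
classes = ["poor", "normal", "good"]           # ordered from worst to best
xs = [0.1, 0.2, 0.9, 1.1, 2.0, 2.2]
ys = ["poor", "poor", "normal", "normal", "good", "good"]

model = train_ordinal(xs, ys, classes)
print([predict_ordinal(model, x) for x in (0.15, 1.0, 2.5)])
```

Because every split separates a contiguous lower group from a contiguous upper group, the class ordering is respected by construction, and any binary learner suited to high-dimensional data could replace the toy classifier.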