182 research outputs found

    Analysing success criteria for ICT projects

    Using Sensitivity as a Method for Ranking the Test Cases Classified by Binary Decision Trees

    Data mining projects that classify test cases with decision trees usually rank the classified cases by the probabilities those trees provide. These probabilities are not always accurate or reliable enough, so a better method is needed for ranking test cases that a binary decision tree has already classified. One reason is that the probability estimates computed by existing decision tree algorithms are the same for every case in a given leaf of the tree. This is only one reason why these estimates cannot be used as an accurate means of deciding whether a test case has been classified correctly. Isabelle Alvarez has proposed a new method that could be used to rank test cases classified by a binary decision tree [Alvarez, 2004]. This paper reports a comparison of ranking methods based on the probability estimate, the sensitivity of a particular case, or both.
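
    The tie problem described in this abstract can be shown with a minimal sketch. The one-split tree, its leaf counts, and the test values are all hypothetical. The distance-to-threshold tie-breaker is an assumption made for illustration only and is not presented here as Alvarez's sensitivity measure.

```python
from collections import namedtuple

# A leaf stores how many training cases it holds and how many were positive.
Leaf = namedtuple("Leaf", ["positives", "total"])

# Hypothetical one-split tree on a single feature x:
# x <= THRESHOLD goes to the left leaf, anything else to the right leaf.
THRESHOLD = 5.0
LEFT = Leaf(positives=1, total=10)
RIGHT = Leaf(positives=8, total=10)


def leaf_probability(x):
    """Positive-class frequency of the leaf that x falls into."""
    leaf = LEFT if x <= THRESHOLD else RIGHT
    return leaf.positives / leaf.total


def distance_to_split(x):
    """Illustrative tie-breaker (an assumption): distance from the split."""
    return abs(x - THRESHOLD)


test_cases = [5.1, 9.0, 7.5, 2.0]

# Every case in the right leaf receives the same estimate, so the
# probability alone cannot order 5.1, 9.0 and 7.5.
probabilities = {x: leaf_probability(x) for x in test_cases}

# Combining the estimate with a per-case quantity breaks ties inside a leaf.
ranked = sorted(
    test_cases,
    key=lambda x: (leaf_probability(x), distance_to_split(x)),
    reverse=True,
)
```

    Here the case at 5.1 sits almost on the split yet shares its estimate with the case at 9.0, which is the kind of indistinguishability the abstract points to.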

    Defining Interestingness for Association Rules

    Interestingness in association rules has been a major topic of research in the past decade. The reason is that the strength of association rules, i.e. their ability to discover ALL patterns given some thresholds on support and confidence, is also their weakness. Indeed, a typical association rule analysis on real data often yields hundreds or thousands of patterns, creating a data mining problem of the second order. In other words, it is not straightforward to determine which of those rules are interesting for the end user. This paper provides an overview of some existing measures of interestingness and comments on their properties. In general, interestingness measures can be divided into objective and subjective measures. Objective measures express interestingness by means of statistical or mathematical criteria, whereas subjective measures aim to capture more practical criteria, such as the unexpectedness or actionability of rules. This paper focuses only on objective measures of interestingness.
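
    As a small illustration of objective measures, the sketch below computes support and confidence, the two thresholds named in the abstract, together with lift. Lift is a commonly used objective measure, but the abstract does not say which measures the paper covers, so its inclusion is an assumption. The transactions are made up.

```python
from fractions import Fraction

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    count = sum(1 for t in transactions if itemset <= t)
    return Fraction(count, len(transactions))


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent)."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence relative to the baseline frequency of the consequent."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


rule_confidence = confidence({"bread"}, {"milk"}, transactions)
rule_lift = lift({"bread"}, {"milk"}, transactions)
```

    In this toy data the rule bread → milk has a confidence of 3/4 but a lift of 15/16, slightly below 1, so a rule that passes a confidence threshold is not necessarily interesting.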

    A framework for internal fraud risk reduction at IT integrating business processes: the IFR² framework

    Fraud is a million-dollar business and it is increasing every year. Both internal and external fraud present a substantial cost to our economy worldwide. A review of the academic literature shows that the academic community addresses only external fraud and how to detect it. To our knowledge, little or no effort has been put into investigating how to prevent and detect internal fraud, which we call ‘internal fraud risk reduction’. Given the urgent need for research on internal fraud and its absence from the academic literature, research to reduce internal fraud risk is pivotal. This topic can be investigated further only once there is a framework in which to carry out empirical research. In this paper we present the IFR² framework, deduced from both the academic literature and current business practices, the core of which suggests using a data mining approach.

    Classifier PGN: Classification with High Confidence Rules

    ACM Computing Classification System (1998): H.2.8, H.3.3.
    Associative classifiers use a set of class association rules, generated from a given training set, to classify new instances. Typically, these techniques set a minimal support to make a first selection of appropriate rules and subsequently discriminate between high and low quality rules by means of a quality measure such as confidence. As a result, the final set of class association rules has a support equal to or greater than a predefined threshold, but many of the rules have confidence levels below 100%. PGN is a novel associative classifier that turns the traditional approach around: it uses a confidence level of 100% as the first selection criterion, prior to maximizing the support. This article introduces PGN and evaluates its strengths and limitations empirically. The results are promising and show that PGN is competitive with other well-known classifiers.
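
    The selection order described in this abstract, 100% confidence first and support second, can be illustrated with a brute-force sketch. This is not the PGN algorithm, whose details the abstract does not give. The training set, the function names, and the exhaustive enumeration are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy training set: (attribute-value items, class label).
training = [
    ({"outlook=sunny", "windy=no"}, "play"),
    ({"outlook=sunny", "windy=yes"}, "stay"),
    ({"outlook=rainy", "windy=no"}, "play"),
    ({"outlook=rainy", "windy=yes"}, "stay"),
    ({"outlook=sunny", "windy=no"}, "play"),
]


def candidate_rules(training, max_len=2):
    """Enumerate rules as (antecedent, label, hits, covered) tuples."""
    covered = Counter()  # instances matching the antecedent
    hits = Counter()     # instances matching antecedent and label
    for items, label in training:
        for k in range(1, max_len + 1):
            for antecedent in combinations(sorted(items), k):
                covered[antecedent] += 1
                hits[(antecedent, label)] += 1
    return [(a, label, hits[(a, label)], covered[a]) for (a, label) in hits]


def exact_rules(training):
    """Keep only rules with 100% confidence, then order by support count."""
    rules = [
        (a, label, h)
        for a, label, h, c in candidate_rules(training)
        if h == c  # every covered instance has this label
    ]
    return sorted(rules, key=lambda r: (-r[2], r[0]))


rules = exact_rules(training)
```

    With this data, the rule windy=no → play comes first: it never misclassifies a training instance and covers three of the five instances.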

    SEMANTIC AND ABSTRACTION CONTENT OF ART IMAGES

    In this paper the semantic and abstraction content of art images is studied. Different techniques for searching art image repositories are analyzed and new ones are proposed. The content-based retrieval process integrates the search over different components, linked in XML structures. Experiments on 200 paintings by six contemporary Israeli artists are conducted and analyzed.

    Measuring Implicit Bias Using SHAP Feature Importance and Fuzzy Cognitive Maps

    In this paper, we integrate the concepts of feature importance with implicit bias in the context of pattern classification. This is done by means of a three-step methodology that involves (i) building a classifier and tuning its hyperparameters, (ii) building a Fuzzy Cognitive Map model able to quantify implicit bias, and (iii) using the SHAP feature importance to activate the neural concepts when performing simulations. The results of a real case study concerning fairness research support our two-fold hypothesis. On the one hand, they illustrate the risks of using a feature importance method as an absolute tool to measure implicit bias. On the other hand, they indicate that the amount of bias towards protected features might differ depending on whether the features are numerically or categorically encoded.
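
    Step (iii) relies on simulating a Fuzzy Cognitive Map from a set of initial activations. The sketch below shows a generic sigmoid-based FCM update and is not the authors' model. The weight matrix, the number of concepts, and the initial activations (standing in for normalized importance scores) are hypothetical.

```python
import math


def sigmoid(x):
    """Squash a raw activation into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fcm_step(activations, weights):
    """One update: concept i receives the weighted sum of all concepts j,
    where weights[j][i] is the causal weight from concept j to concept i."""
    n = len(activations)
    return [
        sigmoid(sum(weights[j][i] * activations[j] for j in range(n)))
        for i in range(n)
    ]


def simulate(initial, weights, steps=20):
    """Iterate the update rule for a fixed number of steps."""
    state = list(initial)
    for _ in range(steps):
        state = fcm_step(state, weights)
    return state


# Hypothetical 3-concept map.
weights = [
    [0.0, 0.6, 0.0],
    [0.0, 0.0, -0.4],
    [0.3, 0.0, 0.0],
]

# Hypothetical initial activations, e.g. normalized importance scores.
initial = [0.7, 0.2, 0.1]

final = simulate(initial, weights)
```

    The final activations depend on both the initial values and the causal weights, which is why the choice of activation source matters for any quantity read off the simulation.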

    Online learning of windmill time series using Long Short-term Cognitive Networks

    Forecasting windmill time series is often the basis of other processes such as anomaly detection, health monitoring, or maintenance scheduling. The amount of data generated on windmill farms makes online learning the most viable strategy to follow. Such settings require retraining the model each time a new batch of data is available. However, updating the model with the new information is often very expensive with traditional Recurrent Neural Networks (RNNs). In this paper, we use Long Short-term Cognitive Networks (LSTCNs) to forecast windmill time series in online settings. These recently introduced neural systems consist of chained Short-term Cognitive Network blocks, each processing a temporal data chunk. The learning algorithm of these blocks is based on a very fast, deterministic learning rule that makes LSTCNs suitable for online learning tasks. Numerical simulations using a case study with four windmills showed that our approach reported the lowest forecasting errors compared with a simple RNN, a Long Short-term Memory, a Gated Recurrent Unit, and a Hidden Markov Model. Perhaps more importantly, the LSTCN approach is significantly faster than these state-of-the-art models.