Data Mining Techniques for Fraud Detection
The paper presents the application of data mining techniques to fraud analysis. We present several classification and prediction techniques that we consider important for fraud detection. Of the many data mining algorithms available, we cover statistics-based, decision tree-based, and rule-based algorithms. We present a Bayesian classification model to detect fraud in automobile insurance. Naïve Bayesian visualization is selected to analyze and interpret the classifier predictions. We illustrate how ROC curves can be used for model assessment to provide a more intuitive analysis of the models.
Keywords: Data Mining, Decision Tree, Bayesian Network, ROC Curve, Confusion Matrix
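As an illustration of the ROC-based model assessment this abstract mentions, the following is a minimal sketch and not the paper's own implementation. It assumes binary labels (1 = fraud, 0 = legitimate) and classifier scores such as Naïve Bayes posterior probabilities; the function names `roc_points` and `auc` and the sample data are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per score threshold.

    Assumes labels are 0/1 and that both classes are present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Sweep the decision threshold from the highest score downwards.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Handle tied scores together so they yield a single ROC point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores for six insurance claims (illustrative data only).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve)
print(auc(curve))
```

An area close to 1 means the scores rank fraudulent claims above legitimate ones almost everywhere, while an area near 0.5 is no better than random ordering.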
Data Mining Techniques in Fraud Detection
The paper presents the application of data mining techniques to fraud analysis. We present several classification and prediction techniques that we consider important for fraud detection. Of the many data mining algorithms available, we cover statistics-based, decision tree-based, and rule-based algorithms. We present a Bayesian classification model to detect fraud in automobile insurance. Naïve Bayesian visualization is selected to analyze and interpret the classifier predictions. We illustrate how ROC curves can be used for model assessment to provide a more intuitive analysis of the models.
Computational strategies for dissecting the high-dimensional complexity of adaptive immune repertoires
The adaptive immune system recognizes antigens via an immense array of
antigen-binding antibodies and T-cell receptors, the immune repertoire. The
interrogation of immune repertoires is of high relevance for understanding the
adaptive immune response in disease and infection (e.g., autoimmunity, cancer,
HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the
quantitative and molecular-level profiling of immune repertoires thereby
revealing the high-dimensional complexity of the immune receptor sequence
landscape. Several methods for the computational and statistical analysis of
large-scale AIRR-seq data have been developed to resolve immune repertoire
complexity in order to understand the dynamics of adaptive immunity. Here, we
review the current research on (i) diversity, (ii) clustering and network,
(iii) phylogenetic and (iv) machine learning methods applied to dissect,
quantify and compare the architecture, evolution, and specificity of immune
repertoires. We summarize outstanding questions in computational immunology and
propose future directions for systems immunology towards coupling AIRR-seq with
the computational discovery of immunotherapeutics, vaccines, and
immunodiagnostics. Comment: 27 pages, 2 figures
Algoritmos Evolutivos para Descubrimiento de Reglas de Predicción en la Mejora de Sistemas Educativos Adaptativos basados en Web
In this paper we show the use of evolutionary algorithms for discovering prediction rules to
improve web-based adaptive hypermedia courses. We have developed a specific data mining tool to
discover relationships among the usage data collected while different students work through a course. This
information can be very useful to the teacher or courseware author when deciding which
modifications are most appropriate for improving student learning. To discover prediction
rules we have used multi-objective grammar-based genetic programming and compared it
with classic rule discovery algorithms.
A Comprehensive Survey of Data Mining-based Fraud Detection Research
This survey paper categorises, compares, and summarises almost all
published technical and review articles in automated fraud detection within the
last 10 years. It defines the professional fraudster, formalises the main types
and subtypes of known fraud, and presents the nature of data evidence collected
within affected industries. Within the business context of mining the data to
achieve higher cost savings, this research presents methods and techniques
together with their problems. Compared to all related reviews on fraud
detection, this survey covers many more technical articles and is the only one,
to the best of our knowledge, which proposes alternative data and solutions
from related domains. Comment: 14 pages
Semantically-guided evolutionary knowledge discovery from texts
This thesis proposes a new approach for structured knowledge discovery from texts
which considers the mining process itself, the evaluation of this knowledge by the
model, and the human assessment of the quality of the outcome. This is achieved by integrating Natural-Language technology and Genetic Algorithms to produce explanatory novel hypotheses. Natural-Language techniques are
specifically used to extract genre-based information from text documents. Additional
semantic and rhetorical information for generating training data and for feeding a semi-structured Latent Semantic Analysis process is also captured. The discovery process is modeled by a semantically-guided Genetic Algorithm
which uses training data to guide the search and optimization process. A number of
novel criteria to evaluate the quality of the new knowledge are proposed. Consequently,
new genetic operations suitable for text mining are designed, and techniques for Evolutionary Multi-Objective Optimization are adapted for the model to trade off between
different criteria in the hypotheses. Domain experts were used in an experiment to assess the quality of the hypotheses
produced by the model so as to establish their effectiveness in terms of novel and
interesting knowledge. The assessment showed encouraging results for the discovered
knowledge and for the correlation between the model and the human opinions.