Cost-based Modeling for Fraud and Intrusion Detection: Results from the JAM Project
We describe the results achieved using the JAM distributed data mining system for the real-world problem of fraud detection in financial information systems. For this domain we provide clear evidence that state-of-the-art commercial fraud detection systems can be substantially improved in stopping losses due to fraud by combining multiple models of fraudulent transactions shared among banks. We demonstrate that the traditional statistical metrics used to train and evaluate the performance of learning systems (i.e., statistical accuracy or ROC analysis) are misleading and perhaps inappropriate for this application. Cost-based metrics are more relevant in certain domains, and defining such metrics poses significant and interesting research questions both in evaluating systems and alternative models, and in formalizing the problems to which one may wish to apply data mining technologies. This paper also demonstrates how the techniques developed for fraud detection can be generalized and applied to the important area of intrusion detection in networked information systems. We report the outcome of recent evaluations of our system applied to tcpdump network intrusion data, specifically with respect to statistical accuracy. This work involved building additional components of JAM that we have come to call MADAM ID (Mining Audit Data for Automated Models for Intrusion Detection). However, taking the next step to define cost-based models for intrusion detection poses interesting new research questions. We describe our initial ideas about how to evaluate intrusion detection systems using cost models learned during our work on fraud detection.
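The gap between statistical accuracy and a cost-based metric can be illustrated with a minimal sketch. The cost model here is an assumption made for illustration, not the one defined in the paper: every flagged transaction incurs a fixed investigation overhead, and every missed fraud loses the full transaction amount. The names `Txn`, `total_cost`, and `OVERHEAD`, and all the amounts, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    amount: float    # transaction amount
    is_fraud: bool   # ground-truth label
    flagged: bool    # detector's decision


def accuracy(txns):
    """Fraction of transactions whose flag matches the true label."""
    return sum(t.is_fraud == t.flagged for t in txns) / len(txns)


def total_cost(txns, overhead):
    """Assumed cost model: each alarm costs a fixed overhead to investigate,
    and each missed fraud costs the full transaction amount."""
    cost = 0.0
    for t in txns:
        if t.flagged:
            cost += overhead
        elif t.is_fraud:
            cost += t.amount
    return cost


OVERHEAD = 20.0  # hypothetical cost of investigating one alarm


def label(flags):
    """Attach one detector's decisions to the same four toy transactions."""
    truth = [(1000.0, True), (10.0, True), (50.0, False), (50.0, False)]
    return [Txn(amount, fraud, flag) for (amount, fraud), flag in zip(truth, flags)]


# Detector A catches only the small fraud; detector B catches only the large one.
detector_a = label([False, True, False, False])
detector_b = label([True, False, False, False])

print(accuracy(detector_a), total_cost(detector_a, OVERHEAD))
print(accuracy(detector_b), total_cost(detector_b, OVERHEAD))
```

Both toy detectors classify three of four transactions correctly, yet under the assumed cost model one loses far more money than the other, which is the kind of difference an accuracy figure alone would hide.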
Intelligent and Distributed Data Warehouse for Student's Academic Performance Analysis
In the academic world, a large amount of data is handled each day, ranging from students' assessments to their socio-economic data. In order to analyze this historical information, an interesting alternative is to implement a Data Warehouse. However, Data Warehouses are not able to perform predictive analysis by themselves, so machine intelligence techniques can be used for sorting, grouping, and predicting based on historical information to improve the analysis quality. This work describes a Data Warehouse architecture to carry out an academic performance analysis of students.
The contribution of dance on children's health
Introduction: Dance is a kind of art therapy involving the psychotherapeutic use of expressive movement through which children can engage creatively in the process of personal development. Purpose: To highlight the contribution of dance to children's psychophysical development and their self-expression of personality. Materials and methods: The research method consisted of reviewing articles addressing dance's role in children's psychophysical development and self-expression of personality, found mostly via Medline, the Hellenic Academic Libraries Link and Google Scholar. A search of classic scientific literature and studies in libraries was also conducted. All articles had to be written in either Greek or English and refer to dance. Results: Dance is a treatment procedure commonly used at schools as an educational means. It is an important and effective tool for children who suffer from emotional disorders and learning disabilities, and it aims to increase children's self-esteem, emotional expression, ability to complete tasks, relaxation, social interaction and the coherence of the group in which they participate. Dance also helps children both to manage emotions that impede learning and to improve their adaptability in school. Conclusions: Dance develops children's expressive ability and helps them to express themselves not only verbally but also bodily.
Effective and Efficient Pruning of Meta-Classifiers in a Distributed Data Mining System
Distributed data mining systems aim to discover and combine useful information that is distributed across multiple databases. One of the main challenges is the design of effective and efficient methods to combine multiple models computed over multiple distributed sources that scale well over many large distributed databases. We describe in detail several methods that evaluate, prune and combine large collections of imported models computed at remote sites into efficient and scalable meta-classifiers. We demonstrate and evaluate the pruning methods by detailing many experiments performed on actual credit card data sets supplied by collaborating financial institutions, where the target learning task is fraud detection. We show that pruned meta-classifiers can sustain or even improve predictive performance at a substantially higher throughput, compared to the unpruned meta-classifiers
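The evaluate, prune and combine sequence described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: base classifiers are ranked by accuracy on a validation set, only the best `keep` are retained, and a simple majority vote stands in for the learned meta-classifier. The helpers `threshold`, `prune`, and `majority_vote`, and the toy data, are hypothetical.

```python
from collections import Counter


def threshold(t):
    """Hypothetical base classifier: predict 1 when the input exceeds t."""
    return lambda x: int(x > t)


def accuracy(clf, xs, ys):
    """Fraction of validation examples the classifier labels correctly."""
    return sum(clf(x) == y for x, y in zip(xs, ys)) / len(xs)


def prune(classifiers, xs, ys, keep):
    """Keep the `keep` classifiers with the highest validation accuracy."""
    ranked = sorted(classifiers, key=lambda c: accuracy(c, xs, ys), reverse=True)
    return ranked[:keep]


def majority_vote(classifiers, x):
    """Combine predictions by majority vote (a stand-in for meta-learning)."""
    votes = Counter(c(x) for c in classifiers)
    return votes.most_common(1)[0][0]


# Toy validation set: the true label is 1 when x > 3.5.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]

# Five imported "remote" models: three useful thresholds and two constant guessers.
imported = [threshold(3.5), threshold(2.5), threshold(4.5),
            threshold(-1.0), threshold(100.0)]

pruned = prune(imported, xs, ys, keep=3)
full_acc = sum(majority_vote(imported, x) == y for x, y in zip(xs, ys)) / len(xs)
pruned_acc = sum(majority_vote(pruned, x) == y for x, y in zip(xs, ys)) / len(xs)
print(len(imported), full_acc, len(pruned), pruned_acc)
```

On this toy data the pruned ensemble matches the unpruned one while evaluating three models per prediction instead of five, a small-scale analogue of sustaining predictive performance at higher throughput.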
Pruning Classifiers in a Distributed Meta-Learning System
JAM is a powerful and portable agent-based distributed data mining system that employs meta-learning techniques to integrate a number of independent classifiers (concepts) derived in parallel from independent and (possibly) inherently distributed databases. Although meta-learning promotes scalability and accuracy in a simple and straightforward manner, brute-force meta-learning techniques can result in large, inefficient and sometimes inaccurate meta-classifier hierarchies. In this paper we explore several techniques for evaluating classifiers and we demonstrate that meta-learning combined with certain pruning methods can achieve similar or even better performance results in a much more cost-effective manner. Keywords: classifier evaluation, pruning, metrics, distributed mining, meta-learning. This research is supported by the Intrusion Detection Program (BAA9603) from DARPA (F30602-96-1-0311), NSF (IRI-96-32225 and CDA-96-25374) and NYSSTF (423115-445). Supported in part by IBM.