Search CORE

7,926 research outputs found

Intelligent Financial Fraud Detection Practices: An Investigation

Author: B Bai
B Hoogs
C Holton
C Whitrow
CA Paasch
D Sánchez
D Zhang
E Duman
E Kirkos
E Ngai
FH Glancy
HC Koh
I Yeh
J Pinquet
JE Sohl
JT Quah
L Bermúdez
M Cecchini
M Jans
P Ravisankar
S Bhattacharyya
S Panigrahi
S Viaene
SL Humpherys
V Vatsa
W Zhou
W-S Yang
Publication venue
Publication date: 24/10/2015
Field of study

Financial fraud is an issue with far reaching consequences in the finance industry, government, corporate sectors, and for ordinary consumers. Increasing dependence on new technologies such as cloud and mobile computing in recent years has compounded the problem. Traditional methods of detection involve extensive use of auditing, where a trained individual manually observes reports or transactions in an attempt to discover fraudulent behaviour. This method is not only time consuming, expensive and inaccurate, but in the age of big data it is also impractical. Not surprisingly, financial institutions have turned to automated processes using statistical and computational methods. This paper presents a comprehensive investigation on financial fraud detection practices using such data mining methods, with a particular focus on computational intelligence-based techniques. Classification of the practices based on key aspects such as detection algorithm used, fraud type investigated, and success rate have been covered. Issues and challenges associated with the current practices and potential future direction of research have also been identified.Comment: Proceedings of the 10th International Conference on Security and Privacy in Communication Networks (SecureComm 2014

arXiv.org e-Print Archive

Crossref

A Survey of Parallel Data Mining

Author: Freitas Alex A.
Publication venue
Publication date
Field of study

With the fast, continuous increase in the number and size of databases, parallel data mining is a natural and cost-effective approach to tackle the problem of scalability in data mining. Recently there has been a considerable research on parallel data mining. However, most projects focus on the parallelization of a single kind of data mining algorithm/paradigm. This paper surveys parallel data mining with a broader perspective. More precisely, we discuss the parallelization of data mining algorithms of four knowledge discovery paradigms, namely rule induction, instance-based learning, genetic algorithms and neural networks. Using the lessons learned from this discussion, we also derive a set of heuristic principles for designing efficient parallel data mining algorithms

Kent Academic Repository

Data Mining by Soft Computing Methods for The Coronary Heart Disease Database

Author: Hara Akira
Ichimura Takumi
Publication venue: IEEE SMC Hiroshima Chapter
Publication date: 01/12/2008
Field of study

For improvement of data mining technology, the advantages and disadvantages on respective data mining methods should be discussed by comparison under the same condition. For this purpose, the Coronary Heart Disease database (CHD DB) was developed in 2004, and the data mining competition was held in the International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES). In the competition, two methods based on soft computing were presented. In this paper, we report the overview of the CHD DB and the soft computing methods, and discuss the features of respective methods by comparison of the experimental results

Hiroshima University Institutional Repository

Okayama University Scientific Achievement Repository

A survey on utilization of data mining approaches for dermatological (skin) diseases prediction

Author: Adibi N
Ahmadzadeh MR
Barati E
Mohammadi A
Saraee MH
Publication venue: Cyber Journals
Publication date: 01/03/2011
Field of study

Due to recent technology advances, large volumes of medical data is obtained. These data contain valuable information. Therefore data mining techniques can be used to extract useful patterns. This paper is intended to introduce data mining and its various techniques and a survey of the available literature on medical data mining. We emphasize mainly on the application of data mining on skin diseases. A categorization has been provided based on the different data mining techniques. The utility of the various data mining methodologies is highlighted. Generally association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we have summarized the different uses of classification in dermatology. It is one of the most important methods for diagnosis of erythemato-squamous diseases. There are different methods like Neural Networks, Genetic Algorithms and fuzzy classifiaction in this topic. Clustering is a useful method in medical images mining. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we have investigated some challenges which exist in mining skin data

University of Salford Institutional Repository

A Process to Implement an Artificial Neural Network and Association Rules Techniques to Improve Asset Performance and Energy Efficiency

Author: Antomarioni Sara
Crespo Márquez Adolfo
Fuente Antonio de la
Publication venue: 'MDPI AG'
Publication date: 01/09/2019
Field of study

In this paper, we address the problem of asset performance monitoring, with the intention of both detecting any potential reliability problem and predicting any loss of energy consumption e ciency. This is an important concern for many industries and utilities with very intensive capitalization in very long-lasting assets. To overcome this problem, in this paper we propose an approach to combine an Artificial Neural Network (ANN) with Data Mining (DM) tools, specifically with Association Rule (AR) Mining. The combination of these two techniques can now be done using software which can handle large volumes of data (big data), but the process still needs to ensure that the required amount of data will be available during the assets’ life cycle and that its quality is acceptable. The combination of these two techniques in the proposed sequence di ers from previous works found in the literature, giving researchers new options to face the problem. Practical implementation of the proposed approach may lead to novel predictive maintenance models (emerging predictive analytics) that may detect with unprecedented precision any asset’s lack of performance and help manage assets’ O&M accordingly. The approach is illustrated using specific examples where asset performance monitoring is rather complex under normal operational conditions.Ministerio de Economía y Competitividad DPI2015-70842-

Multidisciplinary Digital Publishing Institute

idUS. Depósito de Investigación Universidad de Sevilla

Prediction of fly-rock using gene expression programming and teaching– learning-based optimization algorithm

Author: Amini Mohammad Saeed
Bascompta Massanes Marc
Dehghani Hesam
Entezam Shima
Shamsi Reza
Shokri Behshad Jodeiri
Publication venue
Publication date: 01/04/2022
Field of study

Peer ReviewedObjectius de Desenvolupament Sostenible::12 - Producció i Consum ResponsablesObjectius de Desenvolupament Sostenible::12 - Producció i Consum Responsables::12.2 - Per a 2030, assolir la gestió sostenible i l’ús eficient dels recursos naturalsPostprint (published version

UPCommons. Portal del coneixement obert de la UPC

Recommended from our members

State-of-the-art on research and applications of machine learning in the building life cycle

Author: Hong T
Luo X
Wang Z
Zhang W
Publication venue: eScholarship, University of California
Publication date: 01/04/2020
Field of study

Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research for the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimate, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies were adopted broadly by the building industry, due to common challenges including (1) lack of large scale labeled data to train and validate the model, (2) lack of model transferability, which limits a model trained with one data-rich building to be used in another building with limited data, (3) lack of strong justification of costs and benefits of deploying machine learning, and (4) the performance might not be reliable and robust for the stated goals, as the method might work for some buildings but could not be generalized to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as to inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science

eScholarship - University of California

Application of Computational Intelligence Techniques to Process Industry Problems

Author: Gabrys Bogdan
Kadlec Petr
Publication venue: EXIT Publishing House
Publication date: 01/01/2008
Field of study

In the last two decades there has been a large progress in the computational intelligence research field. The fruits of the effort spent on the research in the discussed field are powerful techniques for pattern recognition, data mining, data modelling, etc. These techniques achieve high performance on traditional data sets like the UCI machine learning database. Unfortunately, this kind of data sources usually represent clean data without any problems like data outliers, missing values, feature co-linearity, etc. common to real-life industrial data. The presence of faulty data samples can have very harmful effects on the models, for example if presented during the training of the models, it can either cause sub-optimal performance of the trained model or in the worst case destroy the so far learnt knowledge of the model. For these reasons the application of present modelling techniques to industrial problems has developed into a research field on its own. Based on the discussion of the properties and issues of the data and the state-of-the-art modelling techniques in the process industry, in this paper a novel unified approach to the development of predictive models in the process industry is presented

Bournemouth University Research Online

Evolutionary Computation and QSAR Research

Author: Aguiar-Pulido Vanessa
Cruz-Monteagudo Maykel
Dorado Julián
Gestal M.
Munteanu Cristian-Robert
Rabuñal Juan R.
Publication venue: 'Bentham Science Publishers Ltd.'
Publication date: 01/01/2013
Field of study

[Abstract] The successful high throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. The virtual molecular filtering and screening relies greatly on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect and poor pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial intelligence. QSAR modeling relies on three main steps: molecular structure codification into molecular descriptors, selection of relevant variables in the context of the analyzed activity, and search of the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. Thus, this review explains the basic of the genetic algorithms and genetic programming as evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build QSAR models, the current evolutionary feature selection methods and applications in QSAR and the future trend on the joint or multi-task feature selection methods.Instituto de Salud Carlos III, PIO52048Instituto de Salud Carlos III, RD07/0067/0005Ministerio de Industria, Comercio y Turismo; TSI-020110-2009-53)Galicia. Consellería de Economía e Industria; 10SIN105004P

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas