Learning Interpretable Rules for Multi-label Classification
Multi-label classification (MLC) is a supervised learning problem in which,
contrary to standard multiclass classification, an instance can be associated
with several class labels simultaneously. In this chapter, we advocate a
rule-based approach to multi-label classification. Rule learning algorithms are
often employed when one is not only interested in accurate predictions, but
also requires an interpretable theory that can be understood, analyzed, and
qualitatively evaluated by domain experts. Ideally, by revealing patterns and
regularities contained in the data, a rule-based theory yields new insights in
the application domain. Recently, several authors have started to investigate
how rule-based models can be used for modeling multi-label data. Discussing
this task in detail, we highlight some of the problems that make rule learning
considerably more challenging for MLC than for conventional classification.
While mainly focusing on our own previous work, we also provide a short
overview of related work in this area.
Comment: Preprint version. To appear in: Explainable and Interpretable Models
in Computer Vision and Machine Learning. The Springer Series on Challenges in
Machine Learning. Springer (2018). See
http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further
information.
Discovering Higher-order SNP Interactions in High-dimensional Genomic Data
In this thesis, an associative classification method based on multifactor dimensionality reduction is employed to identify higher-order SNP interactions, with the aim of improving the understanding of the genetic architecture of complex diseases. The thesis also explores the application of deep learning techniques, which provide new insights into interaction analysis. The performance of the deep learning method is maximized by combining deep neural networks with a random forest to obtain reliable interactions in the presence of noise.
Data mining in manufacturing: a review based on the kind of knowledge
In modern manufacturing environments, vast amounts of data are collected in database management systems and data warehouses from all involved areas, including product and process design, assembly, materials planning, quality control, scheduling, maintenance, and fault detection. Data mining has emerged as an important tool for knowledge acquisition from manufacturing databases. This paper reviews the literature on knowledge discovery and data mining applications in the broad domain of manufacturing, with a special emphasis on the type of functions to be performed on the data. The major data mining functions to be performed include characterization and description, association, classification, prediction, clustering and evolution analysis. The papers reviewed have therefore been categorized in these five categories. The review shows rapid growth in the application of data mining to manufacturing processes and enterprises in the last 3 years. It also reveals the progressive applications and existing gaps in data mining in manufacturing. A novel text mining approach has also been applied to the abstracts and keywords of 150 papers to identify the research gaps and find the linkages between knowledge area, knowledge type and the applied data mining tools and techniques.
Explainable Artificial Intelligence and Causal Inference based ATM Fraud Detection
Gaining the trust of customers and showing them empathy are critical
in the financial domain. The frequent occurrence of fraudulent activities
undermines both. Hence, financial organizations and banks must take utmost
care to mitigate such activities. Among them, fraudulent ATM transactions are a
common problem faced by banks. Fraud datasets pose several critical challenges:
the dataset is highly imbalanced, the fraud pattern changes over time,
etc. Owing to the rarity of fraudulent activities, fraud detection
can be formulated as either a binary classification problem or a one-class
classification (OCC) problem. In this study, we applied both formulations to an ATM
transactions dataset collected from India. In binary classification, we
investigated the effectiveness of various oversampling techniques, such as the
Synthetic Minority Oversampling Technique (SMOTE) and its variants and Generative
Adversarial Networks (GANs). Further, we employed
various machine learning techniques, viz., Naive Bayes (NB), Logistic Regression
(LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF),
Gradient Boosting Tree (GBT), and Multi-layer Perceptron (MLP). GBT outperformed
the rest of the models by achieving 0.963 AUC, and DT came second with 0.958
AUC. DT is the winner if the complexity and interpretability aspects are
considered. Among all the oversampling approaches, SMOTE and its variants were
observed to perform better. In OCC, IForest attained 0.959 CR, and OCSVM
secured second place with 0.947 CR. Further, we incorporated explainable
artificial intelligence (XAI) and causal inference (CI) in the fraud detection
framework and studied it through various analyses.
Comment: 34 pages; 21 Figures; 8 Tables
Applications of Artificial Intelligence in Power Systems
Artificial intelligence (AI) tools, which are fast, robust, and adaptive, can overcome the drawbacks of traditional solutions to several power systems problems. In this work, applications of AI techniques are studied for solving two important problems in power systems.
The first problem is static security evaluation (SSE). The objective of SSE is to identify the contingencies in planning and operations of power systems. Conventional numerical solutions are time-consuming and computationally expensive, and are not suitable for online applications. SSE may be treated as a binary classification, multi-class classification, or regression problem. In this work, a multi-support vector machine is combined with several evolutionary computation algorithms, including particle swarm optimization (PSO), differential evolution, ant colony optimization for the continuous domain, and harmony search, to solve the SSE. Moreover, support vector regression is combined with a modified PSO, in which a modification of the inertia weight is proposed, to solve the SSE.
The classification accuracy, the training speed, and the final cost of using power equipment also depend heavily on the selected input features. In this dissertation, multi-objective PSO is used to select these features. Furthermore, a multi-classifier voting scheme is proposed to obtain the final test output. The classifiers participating in the voting scheme include multi-SVM with different types of kernels and random forests with an adaptive number of trees. In short, the development and performance of different machine learning tools combined with evolutionary computation techniques are studied to solve the online SSE. The performance of the proposed techniques is tested on several benchmark systems, namely the IEEE 9-bus, 14-bus, 39-bus, 57-bus, 118-bus, and 300-bus power systems.
The second problem is the non-convex, nonlinear, and non-differentiable economic dispatch (ED) problem. The purpose of solving the ED is to improve the cost-effectiveness of power generation. To solve ED with multi-fuel options, prohibited operating zones, the valve-point effect, and transmission line losses, methods based on genetic algorithm (GA) variants, such as breeder GA, fast navigating GA, twin removal GA, kite GA, and United GA, are used. IEEE systems with 6, 10, and 15 units are used to study the efficiency of the algorithms.