23,446 research outputs found

    Towards a Comprehensible and Accurate Credit Management Model: Application of four Computational Intelligence Methodologies

    Get PDF
    The paper presents methods for classification of applicants into different categories of credit risk using four different computational intelligence techniques. The selected methodologies involved in the rule-based categorization task are (1) feedforward neural networks trained with second order methods (2) inductive machine learning, (3) hierarchical decision trees produced by grammar-guided genetic programming and (4) fuzzy rule based systems produced by grammar-guided genetic programming. The data used are both numerical and linguistic in nature and they represent a real-world problem, that of deciding whether a loan should be granted or not, in respect to financial details of customers applying for that loan, to a specific private EU bank. We examine the proposed classification models with a sample of enterprises that applied for a loan, each of which is described by financial decision variables (ratios), and classified to one of the four predetermined classes. Attention is given to the comprehensibility and the ease of use for the acquired decision models. Results show that the application of the proposed methods can make the classification task easier and - in some cases - may minimize significantly the amount of required credit data. We consider that these methodologies may also give the chance for the extraction of a comprehensible credit management model or even the incorporation of a related decision support system in bankin

    Temporal Feature Selection with Symbolic Regression

    Get PDF
    Building and discovering useful features when constructing machine learning models is the central task for the machine learning practitioner. Good features are useful not only in increasing the predictive power of a model but also in illuminating the underlying drivers of a target variable. In this research we propose a novel feature learning technique in which Symbolic regression is endowed with a ``Range Terminal\u27\u27 that allows it to explore functions of the aggregate of variables over time. We test the Range Terminal on a synthetic data set and a real world data in which we predict seasonal greenness using satellite derived temperature and snow data over a portion of the Arctic. On the synthetic data set we find Symbolic regression with the Range Terminal outperforms standard Symbolic regression and Lasso regression. On the Arctic data set we find it outperforms standard Symbolic regression, fails to beat the Lasso regression, but finds useful features describing the interaction between Land Surface Temperature, Snow, and seasonal vegetative growth in the Arctic

    A survey on utilization of data mining approaches for dermatological (skin) diseases prediction

    Get PDF
    Due to recent technology advances, large volumes of medical data is obtained. These data contain valuable information. Therefore data mining techniques can be used to extract useful patterns. This paper is intended to introduce data mining and its various techniques and a survey of the available literature on medical data mining. We emphasize mainly on the application of data mining on skin diseases. A categorization has been provided based on the different data mining techniques. The utility of the various data mining methodologies is highlighted. Generally association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we have summarized the different uses of classification in dermatology. It is one of the most important methods for diagnosis of erythemato-squamous diseases. There are different methods like Neural Networks, Genetic Algorithms and fuzzy classifiaction in this topic. Clustering is a useful method in medical images mining. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we have investigated some challenges which exist in mining skin data

    Investment Opportunities Forecasting: Extending the Grammar of a GP-based Tool

    Get PDF
    In this paper we present a new version of a GP financial forecasting tool, called EDDIE 8. The novelty of this version is that it allows the GP to search in the space of indicators, instead of using pre-specified ones. We compare EDDIE 8 with its predecessor, EDDIE 7, and find that new and improved solutions can be found. Analysis also shows that, on average, EDDIE 8's best tree performs better than the one of EDDIE 7. The above allows us to characterize EDDIE 8 as a valuable forecasting tool

    Grammar-Guided Genetic Programming For Fuzzy Rule-Based Classification in Credit Management

    Get PDF

    A new approach for discovering business process models from event logs.

    Get PDF
    Process mining is the automated acquisition of process models from the event logs of information systems. Although process mining has many useful applications, not all inherent difficulties have been sufficiently solved. A first difficulty is that process mining is often limited to a setting of non-supervised learnings since negative information is often not available. Moreover, state transitions in processes are often dependent on the traversed path, which limits the appropriateness of search techniques based on local information in the event log. Another difficulty is that case data and resource properties that can also influence state transitions are time-varying properties, such that they cannot be considered ascross-sectional.This article investigates the use of first-order, ILP classification learners for process mining and describes techniques for dealing with each of the above mentioned difficulties. To make process mining a supervised learning task, we propose to include negative events in the event log. When event logs contain no negative information, a technique is described to add artificial negative examples to a process log. To capture history-dependent behavior the article proposes to take advantage of the multi-relational nature of ILP classification learners. Multi-relational process mining allows to search for patterns among multiple event rows in the event log, effectively basing its search on global information. To deal with time-varying case data and resource properties, a closed-world version of the Event Calculus has to be added as background knowledge, transforming the event log effectively in a temporal database. First experiments on synthetic event logs show that first-order classification learners are capable of predicting the behavior with high accuracy, even under conditions of noise.Credit; Credit scoring; Models; Model; Applications; Performance; Space; Decision; Yield; Real life; Risk; Evaluation; Rules; Neural networks; Networks; Classification; Research; Business; Processes; Event; Information; Information systems; Systems; Learning; Data; Behavior; Patterns; IT; Event calculus; Knowledge; Database; Noise;

    21st Century Simulation: Exploiting High Performance Computing and Data Analysis

    Get PDF
    This paper identifies, defines, and analyzes the limitations imposed on Modeling and Simulation by outmoded paradigms in computer utilization and data analysis. The authors then discuss two emerging capabilities to overcome these limitations: High Performance Parallel Computing and Advanced Data Analysis. First, parallel computing, in supercomputers and Linux clusters, has proven effective by providing users an advantage in computing power. This has been characterized as a ten-year lead over the use of single-processor computers. Second, advanced data analysis techniques are both necessitated and enabled by this leap in computing power. JFCOM's JESPP project is one of the few simulation initiatives to effectively embrace these concepts. The challenges facing the defense analyst today have grown to include the need to consider operations among non-combatant populations, to focus on impacts to civilian infrastructure, to differentiate combatants from non-combatants, and to understand non-linear, asymmetric warfare. These requirements stretch both current computational techniques and data analysis methodologies. In this paper, documented examples and potential solutions will be advanced. The authors discuss the paths to successful implementation based on their experience. Reviewed technologies include parallel computing, cluster computing, grid computing, data logging, OpsResearch, database advances, data mining, evolutionary computing, genetic algorithms, and Monte Carlo sensitivity analyses. The modeling and simulation community has significant potential to provide more opportunities for training and analysis. Simulations must include increasingly sophisticated environments, better emulations of foes, and more realistic civilian populations. Overcoming the implementation challenges will produce dramatically better insights, for trainees and analysts. High Performance Parallel Computing and Advanced Data Analysis promise increased understanding of future vulnerabilities to help avoid unneeded mission failures and unacceptable personnel losses. The authors set forth road maps for rapid prototyping and adoption of advanced capabilities. They discuss the beneficial impact of embracing these technologies, as well as risk mitigation required to ensure success
    corecore