1,087 research outputs found

    Credit Risk Evaluation in Peer-to-peer Lending With Linguistic Data Transformation and Supervised Learning

    Get PDF
    The widespread availability of various peer-to-peer lending solutions is rapidly changing the landscape of ï¬nancial services. Beside the natural advantages over traditional services,a relevant problem in the domain is to correctly assess the risk associated with borrowers. In contrast to traditional ï¬nancial services industries, in peer-to-peer lending the unsecured nature of loans as well as the relative novelty of the platforms make the assessment of risk a difï¬cult problem. In this article we propose to use traditional machine learning methods enhanced with fuzzy set theory based transformation of data to improve the quality of identifying loans with high likelihood of default. We assess the proposed approach on a real-life dataset from one of the largest peer-to-peer platforms in Europe. The results demonstrate that (i) traditional classiï¬cation algorithms show good performance in classifying borrowers, and (ii) their performance can be improved using linguistic data transformatio

    Predictive modeling of die filling of the pharmaceutical granules using the flexible neural tree

    Get PDF
    In this work, a computational intelligence (CI) technique named flexible neural tree (FNT) was developed to predict die filling performance of pharmaceutical granules and to identify significant die filling process variables. FNT resembles feedforward neural network, which creates a tree-like structure by using genetic programming. To improve accuracy, FNT parameters were optimized by using differential evolution algorithm. The performance of the FNT-based CI model was evaluated and compared with other CI techniques: multilayer perceptron, Gaussian process regression, and reduced error pruning tree. The accuracy of the CI model was evaluated experimentally using die filling as a case study. The die filling experiments were performed using a model shoe system and three different grades of microcrystalline cellulose (MCC) powders (MCC PH 101, MCC PH 102, and MCC DG). The feed powders were roll-compacted and milled into granules. The granules were then sieved into samples of various size classes. The mass of granules deposited into the die at different shoe speeds was measured. From these experiments, a dataset consisting true density, mean diameter (d50), granule size, and shoe speed as the inputs and the deposited mass as the output was generated. Cross-validation (CV) methods such as 10FCV and 5x2FCV were applied to develop and to validate the predictive models. It was found that the FNT-based CI model (for both CV methods) performed much better than other CI models. Additionally, it was observed that process variables such as the granule size and the shoe speed had a higher impact on the predictability than that of the powder property such as d50. Furthermore, validation of model prediction with experimental data showed that the die filling behavior of coarse granules could be better predicted than that of fine granules

    Sistemas granulares evolutivos

    Get PDF
    Orientador: Fernando Antonio Campos GomideTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: Recentemente tem-se observado um crescente interesse em abordagens de modelagem computacional para lidar com fluxos de dados do mundo real. Métodos e algoritmos têm sido propostos para obtenção de conhecimento a partir de conjuntos de dados muito grandes e, a princípio, sem valor aparente. Este trabalho apresenta uma plataforma computacional para modelagem granular evolutiva de fluxos de dados incertos. Sistemas granulares evolutivos abrangem uma variedade de abordagens para modelagem on-line inspiradas na forma com que os humanos lidam com a complexidade. Esses sistemas exploram o fluxo de informação em ambiente dinâmico e extrai disso modelos que podem ser linguisticamente entendidos. Particularmente, a granulação da informação é uma técnica natural para dispensar atenção a detalhes desnecessários e enfatizar transparência, interpretabilidade e escalabilidade de sistemas de informação. Dados incertos (granulares) surgem a partir de percepções ou descrições imprecisas do valor de uma variável. De maneira geral, vários fatores podem afetar a escolha da representação dos dados tal que o objeto representativo reflita o significado do conceito que ele está sendo usado para representar. Neste trabalho são considerados dados numéricos, intervalares e fuzzy; e modelos intervalares, fuzzy e neuro-fuzzy. A aprendizagem de sistemas granulares é baseada em algoritmos incrementais que constroem a estrutura do modelo sem conhecimento anterior sobre o processo e adapta os parâmetros do modelo sempre que necessário. Este paradigma de aprendizagem é particularmente importante uma vez que ele evita a reconstrução e o retreinamento do modelo quando o ambiente muda. Exemplos de aplicação em classificação, aproximação de função, predição de séries temporais e controle usando dados sintéticos e reais ilustram a utilidade das abordagens de modelagem granular propostas. O comportamento de fluxos de dados não-estacionários com mudanças graduais e abruptas de regime é também analisado dentro do paradigma de computação granular evolutiva. Realçamos o papel da computação intervalar, fuzzy e neuro-fuzzy em processar dados incertos e prover soluções aproximadas de alta qualidade e sumário de regras de conjuntos de dados de entrada e saída. As abordagens e o paradigma introduzidos constituem uma extensão natural de sistemas inteligentes evolutivos para processamento de dados numéricos a sistemas granulares evolutivos para processamento de dados granularesAbstract: In recent years there has been increasing interest in computational modeling approaches to deal with real-world data streams. Methods and algorithms have been proposed to uncover meaningful knowledge from very large (often unbounded) data sets in principle with no apparent value. This thesis introduces a framework for evolving granular modeling of uncertain data streams. Evolving granular systems comprise an array of online modeling approaches inspired by the way in which humans deal with complexity. These systems explore the information flow in dynamic environments and derive from it models that can be linguistically understood. Particularly, information granulation is a natural technique to dispense unnecessary details and emphasize transparency, interpretability and scalability of information systems. Uncertain (granular) data arise from imprecise perception or description of the value of a variable. Broadly stated, various factors can affect one's choice of data representation such that the representing object conveys the meaning of the concept it is being used to represent. Of particular concern to this work are numerical, interval, and fuzzy types of granular data; and interval, fuzzy, and neurofuzzy modeling frameworks. Learning in evolving granular systems is based on incremental algorithms that build model structure from scratch on a per-sample basis and adapt model parameters whenever necessary. This learning paradigm is meaningful once it avoids redesigning and retraining models all along if the system changes. Application examples in classification, function approximation, time-series prediction and control using real and synthetic data illustrate the usefulness of the granular approaches and framework proposed. The behavior of nonstationary data streams with gradual and abrupt regime shifts is also analyzed in the realm of evolving granular computing. We shed light upon the role of interval, fuzzy, and neurofuzzy computing in processing uncertain data and providing high-quality approximate solutions and rule summary of input-output data sets. The approaches and framework introduced constitute a natural extension of evolving intelligent systems over numeric data streams to evolving granular systems over granular data streamsDoutoradoAutomaçãoDoutor em Engenharia Elétric

    Evolutionary Granular Kernel Machines

    Get PDF
    Kernel machines such as Support Vector Machines (SVMs) have been widely used in various data mining applications with good generalization properties. Performance of SVMs for solving nonlinear problems is highly affected by kernel functions. The complexity of SVMs training is mainly related to the size of a training dataset. How to design a powerful kernel, how to speed up SVMs training and how to train SVMs with millions of examples are still challenging problems in the SVMs research. For these important problems, powerful and flexible kernel trees called Evolutionary Granular Kernel Trees (EGKTs) are designed to incorporate prior domain knowledge. Granular Kernel Tree Structure Evolving System (GKTSES) is developed to evolve the structures of Granular Kernel Trees (GKTs) without prior knowledge. A voting scheme is also proposed to reduce the prediction deviation of GKTSES. To speed up EGKTs optimization, a master-slave parallel model is implemented. To help SVMs challenge large-scale data mining, a Minimum Enclosing Ball (MEB) based data reduction method is presented, and a new MEB-SVM algorithm is designed. All these kernel methods are designed based on Granular Computing (GrC). In general, Evolutionary Granular Kernel Machines (EGKMs) are investigated to optimize kernels effectively, speed up training greatly and mine huge amounts of data efficiently

    A survey on utilization of data mining approaches for dermatological (skin) diseases prediction

    Get PDF
    Due to recent technology advances, large volumes of medical data is obtained. These data contain valuable information. Therefore data mining techniques can be used to extract useful patterns. This paper is intended to introduce data mining and its various techniques and a survey of the available literature on medical data mining. We emphasize mainly on the application of data mining on skin diseases. A categorization has been provided based on the different data mining techniques. The utility of the various data mining methodologies is highlighted. Generally association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we have summarized the different uses of classification in dermatology. It is one of the most important methods for diagnosis of erythemato-squamous diseases. There are different methods like Neural Networks, Genetic Algorithms and fuzzy classifiaction in this topic. Clustering is a useful method in medical images mining. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we have investigated some challenges which exist in mining skin data
    corecore