55 research outputs found
A literature review on the application of evolutionary computing to credit scoring
Recent years have seen the development of many credit scoring models for assessing the creditworthiness of loan applicants. Traditional credit scoring methodology has involved the use of statistical and mathematical programming techniques such as discriminant analysis, linear and logistic regression, linear and quadratic programming, or decision trees. However, the importance of credit grant decisions for financial institutions has caused growing interest in using a variety of computational intelligence techniques. This paper concentrates on evolutionary computing, which is viewed as one of the most promising paradigms of computational intelligence. Taking into account the synergistic relationship between the communities of Economics and Computer Science, the aim of this paper is to summarize the most recent developments in the application of evolutionary algorithms to credit scoring by means of a thorough review of scientific articles published during the period 2000–2012. This work has been partially supported by the Spanish Ministry of Education and Science under grant TIN2009-14205 and the Generalitat Valenciana under grant PROMETEO/2010/028.
Quality-by-design approach for the development of lipid-based nanosystems for anti-mycobacterial therapy
In this work, we rationally developed a lipid-based nanotechnological platform for hydrophobic anti-mycobacterial drugs. For this purpose, Artificial Intelligence tools were employed to assist formulation development, from the initial design to its conversion into a solid dosage form. Reproducible nanocarriers exhibiting suitable properties were achieved through a simple and robust procedure. Furthermore, the analysis of their in vitro performance revealed promising results in terms of permeability, cell uptake and selective intracellular release, demonstrating the potential of these nanosystems to treat intestinal intracellular infections, which are increasingly associated with the development of Crohn's disease.
Adaptive Neural Fuzzy Inference System for Hydrogen Adsorption Prediction
This report discusses the basic concept and implementation of an Adaptive Neural Fuzzy Inference System (ANFIS) for predicting the hydrogen adsorption isotherm. The objective of this project is to create an ANFIS able to predict the hydrogen adsorption isotherm; the challenge is to do so with the highest possible accuracy.
The ANFIS is developed using MATLAB R2008a. This mathematical power tool is able to develop the ANFIS because it provides the Fuzzy Logic Toolbox, the basic requirement for building the system.
The basic system receives two inputs from users, temperature and pressure, and gives one output, the hydrogen adsorption value. Three membership functions are provided for each input and are used in determining the output of the system.
Multiple training data are given to the basic system in order to mature it. Upon completion, the system is tested with test data and its output is analysed. Calculations of the output percentage error are carried out. From here, the ANFIS membership functions for each input are fine-tuned to reduce the output percentage error and thereby increase the prediction accuracy of the system.
As a result, the ANFIS is able to give predictions with an error of less than 5%, which is desirable in this project.
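The inference scheme the report describes, two inputs with three membership functions each combined into a single output, can be sketched outside MATLAB as a first-order Takagi-Sugeno fuzzy system. All parameter values below (membership-function centres and widths, rule consequents) are illustrative placeholders, not the report's trained values:

```python
import numpy as np

def gauss_mf(x, c, s):
    """Gaussian membership value of x for a fuzzy set centred at c with width s."""
    return np.exp(-((x - c) ** 2) / (2 * s ** 2))

def anfis_predict(T, P, mf_T, mf_P, consequents):
    """First-order Takagi-Sugeno inference: one rule per pair of input fuzzy sets.
    mf_T, mf_P: lists of (centre, width) tuples (three per input, as in the report).
    consequents: one (p, q, r) per rule, so that the rule output is p*T + q*P + r."""
    w, f = [], []
    for i, (cT, sT) in enumerate(mf_T):
        for j, (cP, sP) in enumerate(mf_P):
            w.append(gauss_mf(T, cT, sT) * gauss_mf(P, cP, sP))  # rule firing strength
            p, q, r = consequents[i * len(mf_P) + j]
            f.append(p * T + q * P + r)                          # rule output
    w, f = np.array(w), np.array(f)
    return float(np.sum(w * f) / np.sum(w))                      # weighted average

# Illustrative (untrained) parameters: 3 membership functions per input -> 9 rules.
mf_T = [(77.0, 30.0), (150.0, 50.0), (298.0, 60.0)]   # temperature sets (K)
mf_P = [(1.0, 2.0), (10.0, 5.0), (50.0, 20.0)]        # pressure sets (bar)
consequents = [(0.0, 0.01, 0.5)] * 9                  # placeholder linear consequents
print(anfis_predict(100.0, 5.0, mf_T, mf_P, consequents))
```

In a real ANFIS, the membership-function parameters and the linear consequents would be learned from the training data (hybrid least-squares plus backpropagation), which is what the report's fine-tuning step performs in the Fuzzy Logic Toolbox.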
A systematic review of the applications of Expert Systems (ES) and machine learning (ML) in clinical urology.
Background: Testing a hypothesis for a 'factors-outcome effect' is a common quest, but standard statistical regression analysis tools are rendered ineffective by data contaminated with too many noisy variables. Expert Systems (ES) can provide an alternative methodology for analysing data to identify the variables with the highest correlation to the outcome. By applying their effective machine learning (ML) abilities, significant research time and costs can be saved. The study aims to systematically review the applications of ES in urological research and their methodological models for effective multivariate analysis. Their domains, development and validity will be identified. Methods: The PRISMA methodology was applied to formulate an effective method for data gathering and analysis. The study search included the seven most relevant information sources: WEB OF SCIENCE, EMBASE, BIOSIS CITATION INDEX, SCOPUS, PUBMED, Google Scholar and MEDLINE. Articles were eligible if they applied one of the known ML models to a clear urological research question involving multivariate analysis. Only articles with pertinent research methods in ES models were included. The analysed data included the system model, applications, input/output variables, target user, validation, and outcomes. Both the ML models and the variable analysis were comparatively reported for each system. Results: The search identified n = 1087 articles from all databases, of which n = 712 were eligible for examination against the inclusion criteria. A total of 168 systems were finally included and systematically analysed, demonstrating a recent increase in the uptake of ES in academic urology, in particular artificial neural networks (31 systems). Most of the systems were applied in urological oncology (prostate cancer = 15, bladder cancer = 13), where diagnostic, prognostic and survival predictor markers were investigated. Due to the heterogeneity of the models and their statistical tests, a meta-analysis was not feasible. Conclusion: ES offer effective ML potential, and their applications in research have demonstrated a valid model for multivariate analysis. The complexity of their development can challenge their uptake in urological clinics, whilst the limitations of the statistical tools in this domain have created a gap for further research. The integration of computer scientists in academic units has promoted the use of ES in clinical urological research.
Decision Support Systems for Risk Assessment in Credit Operations Against Collateral
With the global economic crisis, which reached its peak in the second half of 2008, and facing a market shaken by economic instability, financial institutions took steps to protect themselves against default risk, measures that directly affected how credit institutions analyse lending to individuals and to corporate entities. To mitigate risk in credit operations, most banks use a graded scale of customer risk, which determines the provision that banks must make according to the default risk level of each credit transaction. Credit analysis involves the ability to make a credit decision within a scenario of uncertainty, constant change and incomplete transformations. This ability depends on the capacity to logically analyse situations, often complex ones, and to reach a clear conclusion that is practical and feasible to implement.
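The graded risk scale mentioned above, in which each risk grade determines the provision a bank must set aside, can be sketched as a simple lookup. The grade labels and percentages here are illustrative assumptions patterned after the AA-H provisioning scale used by Brazilian banks, not values taken from the thesis:

```python
# Illustrative provision percentages per customer risk grade (assumed values,
# patterned after the Brazilian AA-H scale; check the applicable regulation).
PROVISION = {"AA": 0.00, "A": 0.005, "B": 0.01, "C": 0.03,
             "D": 0.10, "E": 0.30, "F": 0.50, "G": 0.70, "H": 1.00}

def required_provision(exposure, grade):
    """Amount a bank must set aside for a credit exposure at a given risk grade."""
    return exposure * PROVISION[grade]

# A 100,000 exposure graded C requires a 3% provision; a grade-H exposure
# must be fully provisioned.
print(required_provision(100_000.0, "C"))
print(required_provision(50_000.0, "H"))
```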
Credit scoring models are used to predict the probability that a customer applying for credit will default at any given time, based on the personal and financial information that may influence the client's ability to pay the debt. This estimated probability, called the score, is an estimate of a customer's risk of default over a given period. This increased concern has been in no small part caused by the weaknesses of existing risk management techniques revealed by the recent financial crisis and by the growing demand for consumer credit. Constant change affects several banking sections because it hinders the ability to investigate the data produced and stored in computers, which too often depends on manual techniques.
Among the many alternatives used around the world to balance this risk, the provision of guarantees in the formalisation of credit agreements stands out. In theory, collateral does not ensure the return of the credit, as it is not computed as payment of the obligation within the project. There is also the fact that it will only be effective if triggered, which involves the legal department of the banking institution. The truth is that collateral is a mitigating element of credit risk. Collaterals are divided into two types: the personal guarantee (sponsor) and the asset guarantee (fiduciary). Both aim to increase security in credit operations, offering the lender a payment alternative should the borrower be unable to meet its obligations on time. For the creditor, it generates liquidity security for the receiving operation. The measurement of credit recoverability is a system that evaluates the efficiency of the mechanism for recovering the capital invested against collateral.
In an attempt to identify the sufficiency of collateral in credit operations, this thesis presents an assessment of smart classifiers that use contextual information to assess whether collaterals allow the recovery of granted credit to be predicted in the decision-making process, before the credit transaction becomes insolvent. The results, when compared with other approaches in the literature, together with a comparative analysis of the most relevant artificial intelligence solutions, show that classifiers that use guarantees as a parameter to calculate risk contribute to advancing the state of the art, increasing confidence for financial institutions.
Credit Scoring Using Machine Learning
For financial institutions and the economy at large, the role of credit scoring in lending decisions cannot be overemphasised. An accurate and well-performing credit scorecard allows lenders to control their risk exposure through the selective allocation of credit based on the statistical analysis of historical customer data. This thesis identifies and investigates a number of specific challenges that occur during the development of credit scorecards. Four main contributions are made in this thesis. First, we examine the performance of a number of supervised classification techniques on a collection of imbalanced credit scoring datasets. Class imbalance occurs when there are significantly fewer examples in one or more classes in a dataset compared to the remaining classes. We demonstrate that oversampling the minority class leads to no overall improvement for the best performing classifiers. We find that, in contrast, adjusting the threshold on classifier output yields, in many cases, an improvement in classification performance. Our second contribution investigates a particularly severe form of class imbalance, which, in credit scoring, is referred to as the low-default portfolio problem. To address this issue, we compare the performance of a number of semi-supervised classification algorithms with that of logistic regression. Based on the detailed comparison of classifier performance, we conclude that both approaches merit consideration when dealing with low-default portfolios. Third, we quantify the differences in classifier performance arising from various implementations of a real-world behavioural scoring dataset. Due to commercial sensitivities surrounding the use of behavioural scoring data, very few empirical studies which directly address this topic are published.
This thesis describes the quantitative comparison of a range of dataset parameters impacting classification performance, including: (i) varying durations of historical customer behaviour for model training; (ii) different lengths of time from which a borrower's class label is defined; and (iii) using alternative approaches to define a customer's default status in behavioural scoring. Finally, this thesis demonstrates how artificial data may be used to overcome the difficulties associated with obtaining and using real-world data. The limitations of artificial data, in terms of its usefulness in evaluating classification performance, are also highlighted. In this work, we are interested in generating artificial data for credit scoring in the absence of any available real-world data.
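The threshold-adjustment finding above can be illustrated on synthetic imbalanced data. The 95/5 class split, the logistic regression model and the F1 criterion are assumptions for this sketch, not the thesis's experimental setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Imbalanced synthetic "credit" data: roughly 5% of cases are defaulters.
X, y = make_classification(n_samples=4000, n_features=10, weights=[0.95],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]  # estimated probability of default

# Compare the default 0.5 cut-off against a grid of alternative thresholds.
f1_default = f1_score(y_te, (scores >= 0.5).astype(int))
grid = list(np.linspace(0.05, 0.95, 19)) + [0.5]
f1_tuned = max(f1_score(y_te, (scores >= t).astype(int)) for t in grid)
print(f1_default, f1_tuned)  # the tuned threshold is never worse on this grid
```

In practice the threshold would be chosen on a separate validation set rather than on the data used to report performance; the point of the sketch is only that moving the cut-off away from 0.5 can recover minority-class performance without resampling.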
Prescription fraud detection via data mining: a methodology proposal
Ankara: Department of Industrial Engineering and the Institute of Engineering and Science, Bilkent University, 2009. Thesis (Master's), Bilkent University, 2009. Includes bibliographical references (leaves 61-69). Fraud is the illegitimate act of violating regulations in order to gain personal profit. These kinds of violations are seen in many important areas, including healthcare, computer networks, credit card transactions and communications. Every year, health care fraud causes considerable losses to social security agencies and insurance companies in many countries, including Turkey and the USA. This kind of crime is often seen as victimless by its committers; nonetheless, the fraudulent chain between pharmaceutical companies, health care providers, patients and pharmacies not only burdens the health care system financially but also greatly hinders it from providing legitimate patients with quality health care. One of the biggest issues related to health care fraud is prescription fraud. This thesis aims to identify a data mining methodology to detect fraudulent prescriptions in a large prescription database, a task traditionally conducted by human experts. For this purpose, we have developed a customized data-mining model for prescription fraud detection. We employ data mining methodologies to assign a risk score to prescriptions regarding prescribed medicament-diagnosis consistency, prescribed medicaments' consistency within a prescription, prescribed medicament-age and sex consistency, and diagnosis-cost consistency. Our proposed model has been tested on real-world data. The results obtained from our experimentations reveal that the proposed model works considerably well for the prescription fraud detection problem, with a 77.4% true positive rate. We conclude that incorporating such a system in social security agencies would radically decrease human-expert auditing costs and increase efficiency. Aral, Karca Duru. M.S.
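The four consistency dimensions above can be combined into a single prescription risk score. The weighting scheme below is an illustrative assumption for the sketch; the thesis's actual aggregation may differ:

```python
def prescription_risk(consistency):
    """Combine per-dimension consistency scores (1.0 = fully consistent,
    0.0 = never observed together) into a single risk score in [0, 1].
    The four dimensions follow the thesis; the weights are illustrative
    assumptions, not values from the thesis."""
    weights = {"drug_diagnosis": 0.4, "drug_drug": 0.2,
               "drug_age_sex": 0.2, "diagnosis_cost": 0.2}
    return sum(w * (1.0 - consistency[k]) for k, w in weights.items())

# A prescription whose drug rarely co-occurs with its diagnosis scores much
# higher than one that is consistent on every dimension.
suspicious = {"drug_diagnosis": 0.1, "drug_drug": 0.9,
              "drug_age_sex": 0.8, "diagnosis_cost": 0.7}
normal = {k: 0.95 for k in suspicious}
print(prescription_risk(suspicious))
print(prescription_risk(normal))
```

Flagging prescriptions whose risk score exceeds a tuned cut-off is then a ranking problem, which is where the thesis's 77.4% true positive rate is measured.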
Rethinking construction cost overruns: an artificial neural network approach to construction cost estimation
The main concern of a construction client is to procure a facility that is able to
meet its functional requirements, of the required quality, and delivered within an
acceptable budget and timeframe. The cost aspect of these key performance
indicators usually ranks highest. In spite of the importance of cost estimation, it is
undeniably neither simple nor straightforward because of the lack of information
in the early stages of the project. Construction projects therefore have routinely
overrun their estimates.
Cost overrun has been attributed to a number of sources including technical error
in design, managerial incompetence, risk and uncertainty, suspicions of foul play
and even corruption. Furthermore, even though it is accepted that factors such as
tendering method, location of project, procurement method or size of project
have an effect on likely final cost of a project, it is difficult to establish their
measured financial impact. Estimators thus have to rely largely on experience and
intuition when preparing initial estimates, often neglecting most of these factors
in the final cost build-up. The decision-to-build for most projects is therefore
largely based on unrealistic estimates that would inevitably be exceeded.
The main aim of this research is to re-examine the sources of cost overrun on
construction projects and to develop final cost estimation models that could help
in reaching more reliable final cost estimates at the tendering stage of the project.
The research identified two predominant schools of thought on the sources of
overruns – referred to here as the PsychoStrategists and Evolution Theorists.
Another finding was that there is no unanimity on the reference point from which
cost performance could be assessed, leading to a large disparity in the size of
overruns reported. Another misunderstanding relates to the term “cost overrun”
itself.
The experimental part of the research, conducted in collaboration with two
industry partners, used a combination of non-parametric bootstrapping and
ensemble modelling with artificial neural networks to develop final project cost
models based on about 1,600 water infrastructure projects. 92% of the validation
predictions were within ±10% of the actual final cost of the project. The models
will be particularly useful at the pre-contract stage as they will provide a
benchmark for evaluating submitted tenders and also allow the quick generation
of various alternative solutions for a construction project using what-if scenarios.
The original contribution of the study is a fresh conception of construction "cost overruns", now proposed to be more appropriately known as "cost growth", based on a synthesis of the two schools of thought into a conceptual model. The second contribution is the development of novel construction cost estimation models utilising artificial neural networks coupled with bootstrapping and ensemble modelling.
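The bootstrapping-plus-ensemble scheme described above can be sketched with small neural networks: each model is trained on a resample of the project data, and the spread of the ensemble's predictions indicates estimate uncertainty. The synthetic data, network size and ensemble size below are assumptions for the sketch, not the study's configuration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for project records: four features -> final cost.
X = rng.uniform(0.0, 1.0, size=(300, 4))
y = X @ np.array([2.0, 1.0, 0.5, 3.0]) + rng.normal(0.0, 0.05, 300)

def bootstrap_ensemble(X, y, n_models=10):
    """Fit one small neural network per bootstrap resample of the data."""
    models = []
    for seed in range(n_models):
        idx = rng.integers(0, len(X), len(X))   # resample with replacement
        m = MLPRegressor(hidden_layer_sizes=(16,), max_iter=500,
                         random_state=seed).fit(X[idx], y[idx])
        models.append(m)
    return models

models = bootstrap_ensemble(X, y)
preds = np.stack([m.predict(X[:5]) for m in models])  # shape (10, 5)
print(preds.mean(axis=0))  # ensemble cost estimate per project
print(preds.std(axis=0))   # spread across resamples ~ estimate uncertainty
```

Averaging over bootstrap resamples reduces the variance of a single network's estimate, which is what makes the combined model useful as a pre-contract benchmark for evaluating tenders.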