
    Clicking into Mortgage Arrears: A Study into Arrears Prediction with Clickstream Data

    This research project investigates the predictive capability of clickstream data when used for mortgage arrears prediction. With an ever-growing number of people switching to digital channels to handle their daily banking requirements, there is an ever-increasing wealth of online usage data, otherwise known as clickstream data. If leveraged correctly, clickstream data can be a powerful data source for organisations, as it provides detailed information about how their customers interact with their digital channels. Much of the current literature on clickstream data relates to organisations employing it within their customer relationship management mechanisms to build better relationships with their customers. There has been little investigation into the use of clickstream data in credit scoring or arrears prediction. Since the financial meltdown of 2008, financial institutions have been obliged to have mechanisms in place to deal with mortgage accounts which are in arrears or at risk of entering arrears. A potentially crucial step in this process is the ability of an institution to accurately predict which of its mortgage accounts may enter arrears. In addition to traditional demographic and transactional data, this research determines the impact clickstream data can have on an arrears prediction model. A range of binary classifiers was reviewed for this arrears prediction problem. Of these classifiers, ensemble models proved to be the highest performing, achieving reasonably high recall without the inclusion of clickstream data. Once clickstream data was added to the models, it led to marginal increases in accuracy, which was a positive result.
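
    A minimal sketch of the comparison the abstract describes: train an ensemble classifier on demographic and transactional features alone, then again with clickstream aggregates added, and compare recall on held-out accounts. The dataset, column names, and clickstream aggregates below are hypothetical placeholders, not taken from the study.

```python
# Compare recall of an ensemble classifier with and without clickstream features.
# All file and column names are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score

df = pd.read_csv("mortgage_accounts.csv")          # hypothetical labelled dataset
base_features = ["age", "income", "loan_to_value", "monthly_repayment"]
click_features = ["logins_per_month", "avg_session_secs", "arrears_page_views"]
target = "in_arrears"                               # 1 = account entered arrears

X_train, X_test, y_train, y_test = train_test_split(
    df, df[target], test_size=0.3, stratify=df[target], random_state=42
)

for name, cols in [("baseline", base_features),
                   ("baseline + clickstream", base_features + click_features)]:
    model = RandomForestClassifier(n_estimators=300, random_state=42)
    model.fit(X_train[cols], y_train)
    rec = recall_score(y_test, model.predict(X_test[cols]))
    print(f"{name}: recall = {rec:.3f}")
```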

    Managing Dynamic Enterprise and Urgent Workloads on Clouds Using Layered Queuing and Historical Performance Models

    The automatic allocation of enterprise workload to resources can be enhanced by the ability to make what-if response time predictions while different allocations are being considered. We experimentally investigate an historical and a layered queuing performance model and show how they can provide a good level of support for a dynamic-urgent cloud environment. Using these models, we define, implement and experimentally investigate the effectiveness of a prediction-based cloud workload and resource management algorithm. Based on these experimental analyses we: i) comparatively evaluate the layered queuing and historical techniques; ii) evaluate the effectiveness of the management algorithm in different operating scenarios; and iii) provide guidance on using prediction-based workload and resource management.
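
    A hedged sketch of the prediction-based management idea: for each candidate allocation, ask a performance model for a what-if response time and choose the smallest allocation whose prediction meets the workload's deadline. The historical model below is a simple mean of past observations and all numbers are invented; a layered queuing model could sit behind the same interface.

```python
# Choose the smallest allocation whose what-if response time prediction meets a deadline.
# The history of past observations is invented for illustration.
from statistics import mean

history = {  # hypothetical past response times (seconds) per (workload, servers)
    ("batch", 2): [41.0, 39.5, 42.3],
    ("batch", 4): [22.1, 20.8, 23.0],
    ("batch", 8): [12.4, 11.9, 13.1],
}

def predict_historical(workload, servers):
    """What-if response time prediction from historical observations."""
    return mean(history[(workload, servers)])

def choose_allocation(workload, deadline, options=(2, 4, 8)):
    """Return the smallest allocation predicted to meet the deadline, or None."""
    for servers in sorted(options):
        if predict_historical(workload, servers) <= deadline:
            return servers
    return None  # no allocation predicted to meet the deadline; escalate

print(choose_allocation("batch", deadline=25.0))    # -> 4
```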

    The consistency of empirical comparisons of regression and analogy-based software project cost prediction

    OBJECTIVE – To determine the consistency within and between results in empirical studies of software engineering cost estimation. We focus on regression and analogy techniques as these are commonly used. METHOD – We conducted an exhaustive search using predefined inclusion and exclusion criteria and identified 67 journal papers and 104 conference papers. From this sample we identified 11 journal papers and 9 conference papers that used both methods. RESULTS – Our analysis found that about 25% of studies were internally inconclusive. We also found that there is approximately equal evidence in favour of, and against, analogy-based methods. CONCLUSIONS – We confirm the lack of consistency in the findings and argue that this inconsistent pattern from 20 different studies comparing regression and analogy is somewhat disturbing. It suggests that we need to ask more detailed questions than just: “What is the best prediction system?”

    Reliability and validity in comparative studies of software prediction models

    Empirical studies on software prediction models do not converge with respect to the question "which prediction model is best?" The reason for this lack of convergence is poorly understood. In this simulation study, we have examined a frequently used research procedure comprising three main ingredients: a single data sample, an accuracy indicator, and cross-validation. Typically, these empirical studies compare a machine learning model with a regression model; in our study, we use simulation to compare a machine learning model with a regression model. The results suggest that it is the research procedure itself that is unreliable. This lack of reliability may strongly contribute to the lack of convergence. Our findings thus cast some doubt on the conclusions of any study of competing software prediction models that used this research procedure as a basis of model comparison. We therefore need to develop more reliable research procedures before we can have confidence in the conclusions of comparative studies of software prediction models.
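
    An illustrative sketch, with synthetic data rather than the study's simulation, of the research procedure described above: a single sample, cross-validation, and one accuracy indicator (here MMRE) used to compare a machine learning model with a regression model.

```python
# One sample, cross-validation, one accuracy indicator: the procedure under scrutiny.
# Data are synthetic; MMRE is used as the example accuracy indicator.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
size = rng.uniform(10, 500, 40)                     # project size (e.g. function points)
effort = 8 * size * rng.lognormal(0, 0.4, 40)       # effort with multiplicative noise
X = size.reshape(-1, 1)

def mmre(actual, predicted):
    """Mean magnitude of relative error."""
    return np.mean(np.abs(actual - predicted) / actual)

for name, model in [("regression", LinearRegression()),
                    ("machine learning (k-NN)", KNeighborsRegressor(n_neighbors=2))]:
    pred = cross_val_predict(model, X, effort, cv=5)
    print(f"{name}: MMRE = {mmre(effort, pred):.2f}")
```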

    Time Series Analysis of Pavement Roughness Condition Data for use in Asset Management

    Roughness is a direct measure of the unevenness of a longitudinal section of road pavement. Increased roughness corresponds to decreased ride comfort and increased road user costs. Roughness is relatively inexpensive to measure. Measuring roughness progression over time enables pavement deterioration, which is the result of a complex and chaotic system of environmental and road management influences, to be monitored. This in turn enables the long-term functional behaviour of a pavement network to be understood and managed. A range of approaches has been used to model roughness progression for assistance in pavement asset management. The type of modelling a road agency can undertake depends upon the frequency and extent of data collection, which are in turn constrained by the funding available. The aims of this study are to increase the understanding of unbound granular pavement performance by investigating roughness progression, and to model roughness progression to improve roughness prediction methods. The pavement management system in place within the project partner road agency, and the data available to this study, lend themselves to a methodology in which roughness progression is investigated using the financial maintenance and physical condition information available for each 1 km pavement segment in a 16,000 km road network.
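
    A minimal sketch, with invented numbers, of one way roughness progression for a single 1 km segment might be modelled: fit a linear trend to annual roughness observations (IRI) and project the value one year ahead. The models in the study would also need to account for maintenance events and environmental influences.

```python
# Fit a linear progression rate to annual roughness observations for one segment
# and project one year ahead. All values are invented for illustration.
import numpy as np

years = np.array([2015, 2016, 2017, 2018, 2019, 2020])
iri = np.array([2.1, 2.2, 2.3, 2.5, 2.6, 2.8])      # hypothetical IRI values (m/km)

slope, intercept = np.polyfit(years, iri, 1)         # linear progression rate
next_year = years[-1] + 1
print(f"Progression rate: {slope:.3f} IRI units/year")
print(f"Predicted IRI in {next_year}: {slope * next_year + intercept:.2f}")
```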

    Investigating effort prediction of web-based applications using CBR on the ISBSG dataset

    As web-based applications become more popular and more sophisticated, so too does the need for early, accurate estimates of the effort required to build such systems. Case-based reasoning (CBR) has been shown to be a reasonably effective estimation strategy, although it has not been widely explored in the context of web applications. This paper reports on a study carried out on a subset of the ISBSG dataset to examine the optimal number of analogies that should be used in making a prediction. The results show that it is not possible to select such a value with confidence and that, in common with findings in other domains, the effectiveness of CBR is hampered by other factors, including the characteristics of the underlying dataset (such as the spread of data and the presence of outliers) and the calculation employed to evaluate the distance function (in particular, the treatment of numeric and categorical data).
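
    A hedged sketch of the CBR approach described above: estimate effort for a new project as the mean of its k nearest analogies, using a distance that treats numeric features (normalised absolute difference) and categorical features (0/1 mismatch) differently. The projects and feature names below are invented, not drawn from the ISBSG dataset.

```python
# k-nearest-analogy effort estimation with a mixed numeric/categorical distance.
# The project base and feature ranges are invented for illustration.
import numpy as np

projects = [  # (size_fp, team_size, language, effort_hours)
    (120, 4, "java", 1800),
    (300, 8, "java", 5200),
    (90, 3, "php", 1100),
    (250, 6, "c#", 4100),
]

def distance(a, b, size_range=300.0, team_range=8.0):
    d_size = abs(a[0] - b[0]) / size_range          # numeric: normalised difference
    d_team = abs(a[1] - b[1]) / team_range
    d_lang = 0.0 if a[2] == b[2] else 1.0           # categorical: exact-match penalty
    return (d_size + d_team + d_lang) / 3

def estimate_effort(new_project, k=2):
    """Mean effort of the k most similar past projects."""
    ranked = sorted(projects, key=lambda p: distance(new_project, p))
    return np.mean([p[3] for p in ranked[:k]])

print(estimate_effort((200, 5, "java"), k=2))        # -> 3500.0
```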