
    Clicking into Mortgage Arrears: A Study into Arrears Prediction with Clickstream Data

    This research project investigates the predictive capability of clickstream data when used for mortgage arrears prediction. With an ever-growing number of people switching to digital channels to handle their daily banking requirements, there is an ever-increasing wealth of online usage data, otherwise known as clickstream data. If leveraged correctly, clickstream data can be a powerful data source for organisations, as it provides detailed information about how their customers interact with their digital channels. Much of the current literature on clickstream data relates to organisations employing it within their customer relationship management mechanisms to build better relationships with their customers. There has been little investigation into the use of clickstream data in credit scoring or arrears prediction. Since the financial meltdown of 2008, financial institutions have been obliged to have mechanisms in place to deal with mortgage accounts which are in arrears or at risk of entering arrears. A potentially crucial step in this process is the ability of an institution to accurately predict which of its mortgage accounts may enter arrears. In addition to traditional demographic and transactional data, this research determines the impact clickstream data can have on an arrears prediction model. A range of binary classifiers was reviewed for this arrears prediction problem. Of these classifiers, ensemble models proved to be the highest performing, achieving reasonably high recall without the inclusion of clickstream data. Once clickstream data was added to the models, it led to marginal increases in accuracy, which was a positive result.
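
    A minimal sketch of the comparison the abstract describes: train an ensemble classifier on demographic and transactional features alone, then again with clickstream aggregates added, and compare recall on held-out accounts. The dataset, column names, and clickstream aggregates below are hypothetical placeholders, not taken from the study.

```python
# Compare recall of an ensemble classifier with and without clickstream features.
# All file and column names are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score

df = pd.read_csv("mortgage_accounts.csv")          # hypothetical labelled dataset
base_features = ["age", "income", "loan_to_value", "monthly_repayment"]
click_features = ["logins_per_month", "avg_session_secs", "arrears_page_views"]
target = "in_arrears"                               # 1 = account entered arrears

X_train, X_test, y_train, y_test = train_test_split(
    df, df[target], test_size=0.3, stratify=df[target], random_state=42
)

for name, cols in [("baseline", base_features),
                   ("baseline + clickstream", base_features + click_features)]:
    model = RandomForestClassifier(n_estimators=300, random_state=42)
    model.fit(X_train[cols], y_train)
    rec = recall_score(y_test, model.predict(X_test[cols]))
    print(f"{name}: recall = {rec:.3f}")
```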

    Managing Dynamic Enterprise and Urgent Workloads on Clouds Using Layered Queuing and Historical Performance Models

    The automatic allocation of enterprise workload to resources can be enhanced by the ability to make what-if response time predictions while different allocations are being considered. We experimentally investigate an historical and a layered queuing performance model and show how they can provide a good level of support for a dynamic-urgent cloud environment. Using these models, we define, implement and experimentally investigate the effectiveness of a prediction-based cloud workload and resource management algorithm. Based on these experimental analyses we: i) comparatively evaluate the layered queuing and historical techniques; ii) evaluate the effectiveness of the management algorithm in different operating scenarios; and iii) provide guidance on using prediction-based workload and resource management.
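
    A hedged sketch of the prediction-based management idea: for each candidate allocation, ask a performance model for a what-if response time and choose the smallest allocation whose prediction meets the workload's deadline. The historical model below is a simple mean of past observations and all numbers are invented; a layered queuing model could sit behind the same interface.

```python
# Choose the smallest allocation whose what-if response time prediction meets a deadline.
# The history of past observations is invented for illustration.
from statistics import mean

history = {  # hypothetical past response times (seconds) per (workload, servers)
    ("batch", 2): [41.0, 39.5, 42.3],
    ("batch", 4): [22.1, 20.8, 23.0],
    ("batch", 8): [12.4, 11.9, 13.1],
}

def predict_historical(workload, servers):
    """What-if response time prediction from historical observations."""
    return mean(history[(workload, servers)])

def choose_allocation(workload, deadline, options=(2, 4, 8)):
    """Return the smallest allocation predicted to meet the deadline, or None."""
    for servers in sorted(options):
        if predict_historical(workload, servers) <= deadline:
            return servers
    return None  # no allocation predicted to meet the deadline; escalate

print(choose_allocation("batch", deadline=25.0))    # -> 4
```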

    The consistency of empirical comparisons of regression and analogy-based software project cost prediction

    OBJECTIVE – To determine the consistency within and between results in empirical studies of software engineering cost estimation. We focus on regression and analogy techniques as these are commonly used. METHOD – We conducted an exhaustive search using predefined inclusion and exclusion criteria and identified 67 journal papers and 104 conference papers. From this sample we identified 11 journal papers and 9 conference papers that used both methods. RESULTS – Our analysis found that about 25% of studies were internally inconclusive. We also found that there is approximately equal evidence in favour of, and against, analogy-based methods. CONCLUSIONS – We confirm the lack of consistency in the findings and argue that this inconsistent pattern from 20 different studies comparing regression and analogy is somewhat disturbing. It suggests that we need to ask more detailed questions than just: “What is the best prediction system?”

    Reliability and validity in comparative studies of software prediction models

    Empirical studies on software prediction models do not converge with respect to the question "which prediction model is best?" The reason for this lack of convergence is poorly understood. In this simulation study, we have examined a frequently used research procedure comprising three main ingredients: a single data sample, an accuracy indicator, and cross-validation. Typically, these empirical studies compare a machine learning model with a regression model; in our study, we use simulation to compare a machine learning model with a regression model. The results suggest that it is the research procedure itself that is unreliable. This lack of reliability may strongly contribute to the lack of convergence. Our findings thus cast some doubt on the conclusions of any study of competing software prediction models that used this research procedure as a basis of model comparison. We therefore need to develop more reliable research procedures before we can have confidence in the conclusions of comparative studies of software prediction models.
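
    An illustrative sketch, with synthetic data rather than the study's simulation, of the research procedure described above: a single sample, cross-validation, and one accuracy indicator (here MMRE) used to compare a machine learning model with a regression model.

```python
# One sample, cross-validation, one accuracy indicator: the procedure under scrutiny.
# Data are synthetic; MMRE is used as the example accuracy indicator.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
size = rng.uniform(10, 500, 40)                     # project size (e.g. function points)
effort = 8 * size * rng.lognormal(0, 0.4, 40)       # effort with multiplicative noise
X = size.reshape(-1, 1)

def mmre(actual, predicted):
    """Mean magnitude of relative error."""
    return np.mean(np.abs(actual - predicted) / actual)

for name, model in [("regression", LinearRegression()),
                    ("machine learning (k-NN)", KNeighborsRegressor(n_neighbors=2))]:
    pred = cross_val_predict(model, X, effort, cv=5)
    print(f"{name}: MMRE = {mmre(effort, pred):.2f}")
```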

    Time Series Analysis of Pavement Roughness Condition Data for use in Asset Management

    Roughness is a direct measure of the unevenness of a longitudinal section of road pavement. Increased roughness corresponds to decreased ride comfort and increased road user costs. Roughness is relatively inexpensive to measure. Measuring roughness progression over time enables pavement deterioration, which is the result of a complex and chaotic system of environmental and road management influences, to be monitored. This in turn enables the long-term functional behaviour of a pavement network to be understood and managed. A range of approaches has been used to model roughness progression for assistance in pavement asset management. The type of modelling a road agency can undertake depends upon the frequency and extent of data collection, which are in turn constrained by the funding available. The aims of this study are to increase the understanding of unbound granular pavement performance by investigating roughness progression, and to model roughness progression to improve roughness prediction methods. The pavement management system in place within the project partner road agency, and the data available to this study, lend themselves to a methodology in which roughness progression is investigated using the financial maintenance and physical condition information available for each 1 km pavement segment in a 16,000 km road network.
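
    A minimal sketch, with invented numbers, of one way roughness progression for a single 1 km segment might be modelled: fit a linear trend to annual roughness observations (IRI) and project the value one year ahead. The models in the study would also need to account for maintenance events and environmental influences.

```python
# Fit a linear progression rate to annual roughness observations for one segment
# and project one year ahead. All values are invented for illustration.
import numpy as np

years = np.array([2015, 2016, 2017, 2018, 2019, 2020])
iri = np.array([2.1, 2.2, 2.3, 2.5, 2.6, 2.8])      # hypothetical IRI values (m/km)

slope, intercept = np.polyfit(years, iri, 1)         # linear progression rate
next_year = years[-1] + 1
print(f"Progression rate: {slope:.3f} IRI units/year")
print(f"Predicted IRI in {next_year}: {slope * next_year + intercept:.2f}")
```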

    Investigating effort prediction of web-based applications using CBR on the ISBSG dataset

    As web-based applications become more popular and more sophisticated, so too does the need for early, accurate estimates of the effort required to build such systems. Case-based reasoning (CBR) has been shown to be a reasonably effective estimation strategy, although it has not been widely explored in the context of web applications. This paper reports on a study carried out on a subset of the ISBSG dataset to examine the optimal number of analogies that should be used in making a prediction. The results show that it is not possible to select such a value with confidence and that, in common with findings in other domains, the effectiveness of CBR is hampered by other factors, including the characteristics of the underlying dataset (such as the spread of data and the presence of outliers) and the calculation employed to evaluate the distance function (in particular, the treatment of numeric and categorical data).
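
    A hedged sketch of the CBR approach described above: estimate effort for a new project as the mean of its k nearest analogies, using a distance that treats numeric features (normalised absolute difference) and categorical features (0/1 mismatch) differently. The projects and feature names below are invented, not drawn from the ISBSG dataset.

```python
# k-nearest-analogy effort estimation with a mixed numeric/categorical distance.
# The project base and feature ranges are invented for illustration.
import numpy as np

projects = [  # (size_fp, team_size, language, effort_hours)
    (120, 4, "java", 1800),
    (300, 8, "java", 5200),
    (90, 3, "php", 1100),
    (250, 6, "c#", 4100),
]

def distance(a, b, size_range=300.0, team_range=8.0):
    d_size = abs(a[0] - b[0]) / size_range          # numeric: normalised difference
    d_team = abs(a[1] - b[1]) / team_range
    d_lang = 0.0 if a[2] == b[2] else 1.0           # categorical: exact-match penalty
    return (d_size + d_team + d_lang) / 3

def estimate_effort(new_project, k=2):
    """Mean effort of the k most similar past projects."""
    ranked = sorted(projects, key=lambda p: distance(new_project, p))
    return np.mean([p[3] for p in ranked[:k]])

print(estimate_effort((200, 5, "java"), k=2))        # -> 3500.0
```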