5,106 research outputs found
Comparing software prediction techniques using simulation
The need for accurate software prediction systems increases as software becomes much larger and more complex. We believe that the underlying characteristics of the data set (size, number of features, type of distribution, etc.) influence the choice of the prediction system to be used. For this reason, we would like to control the characteristics of such data sets in order to systematically explore the relationship between accuracy, choice of prediction system, and data set characteristics. It would also be useful to have a large validation data set. Our solution is to simulate data, allowing both control and the possibility of large (1000) validation cases. We compare four prediction techniques: regression, rule induction, nearest neighbor (a form of case-based reasoning), and neural nets. The results suggest that there are significant differences depending upon the characteristics of the data set. Consequently, researchers should consider prediction context when evaluating competing prediction systems. We observed that the more "messy" the data and the more complex the relationship with the dependent variable, the more variability in the results. In the more complex cases, we observed significantly different results depending upon the particular training set that has been sampled from the underlying data set. However, our most important result is that it is more fruitful to ask which is the best prediction system in a particular context rather than which is the "best" prediction system.
A Neural-CBR System for Real Property Valuation
In recent times, the application of artificial intelligence (AI) techniques to real property valuation has been on the
increase. Machine intelligence concepts leveraged by such expert systems include rule-based reasoning, case-based
reasoning and artificial neural networks. These approaches have proved reliable thus far and in certain cases have outperformed
statistical predictive models such as hedonic regression, logistic regression, and discriminant analysis. However,
individual artificial intelligence approaches have inherent limitations. These limitations hamper the quality of the
decision support they offer when used alone for real property valuation. In this paper, we present a Neural-CBR system
for real property valuation, which is based on a hybrid architecture that combines Artificial Neural Networks and
Case-Based Reasoning techniques. An evaluation of the system was conducted, and the experimental results revealed that the
system achieves a more satisfactory level of performance than individual Artificial Neural Network and
Case-Based Reasoning systems.
Constructed wetlands: Prediction of performance with case-based reasoning (part B)
The aim of this research was to assess the treatment efficiencies for gully pot liquor of experimental
vertical-flow constructed wetland filters containing Phragmites australis (Cav.) Trin. ex Steud. (common reed)
and filter media of different adsorption capacities. Six out of 12 filters received inflow water spiked with
metals. For 2 years, hydrated nickel and copper nitrate were added to sieved gully pot liquor to simulate
contaminated primary treated storm runoff. The findings were analyzed and discussed in a previous paper
(Part A). Case-based reasoning (CBR) methods were applied to predict the 5-day, 20°C N-allylthiourea biochemical
oxygen demand (BOD) and suspended solids (SS), and to demonstrate an alternative method of
analyzing water quality performance indicators. The CBR method was successful in predicting if outflow
concentrations were either above or below the thresholds set for water-quality variables. Relatively small
case bases of approximately 60 entries are sufficient to yield relatively high predictions of compliance of
at least 90% for BOD. Biochemical oxygen demand and SS are expensive to estimate, and can be cost-effectively
controlled by applying CBR with the input variables turbidity and conductivity.
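The compliance prediction described in this abstract can be illustrated with a minimal sketch. It assumes a nearest-neighbour form of CBR with majority voting, which the abstract does not confirm; the function name `predict_compliance`, the choice of `k`, and the case values are all hypothetical, and both inputs are assumed to be pre-scaled to the range 0 to 1 so that neither turbidity nor conductivity dominates the distance.

```python
from collections import Counter


def predict_compliance(case_base, query, k=3):
    """Majority vote among the k stored cases most similar to the query.

    case_base: list of ((turbidity, conductivity), compliant) pairs,
               with both inputs assumed to be scaled to [0, 1].
    query:     (turbidity, conductivity) for the sample to classify.
    """
    def distance(a, b):
        # Euclidean distance between two input vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    nearest = sorted(case_base, key=lambda case: distance(case[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical, illustrative cases only (not data from the study).
    case_base = [
        ((0.10, 0.20), True),
        ((0.15, 0.25), True),
        ((0.20, 0.10), True),
        ((0.80, 0.90), False),
        ((0.90, 0.85), False),
        ((0.70, 0.80), False),
    ]
    print(predict_compliance(case_base, (0.12, 0.20)))
    print(predict_compliance(case_base, (0.85, 0.90)))
```

Retrieval by similarity followed by reuse of the retrieved outcomes is only the simplest reading of CBR; a fuller system would also revise and retain cases.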
The consistency of empirical comparisons of regression and analogy-based software project cost prediction
OBJECTIVE – to determine the consistency within and between results in empirical studies of software engineering cost estimation. We focus on regression and analogy techniques as these are commonly used. METHOD – we conducted an exhaustive search using predefined inclusion and exclusion criteria and identified 67 journal papers and 104 conference papers. From this sample we identified 11 journal papers and 9 conference papers that used both methods. RESULTS – our analysis found that about 25% of studies were internally inconclusive. We also found that there is approximately equal evidence in favour of, and against, analogy-based methods. CONCLUSIONS – we confirm the lack of consistency in the findings and argue that this inconsistent pattern from 20 different studies comparing regression and analogy is somewhat disturbing. It suggests that we need to ask more detailed questions than just: “What is the best prediction system?”
The Effectiveness of Case-Based Reasoning: An Application in Sales Promotions
This paper deals with Case-based Reasoning (CBR) as a support technology for sales promotion (SP) decisions. CBR-systems try to mimic analogical reasoning, a form of human reasoning that is likely to occur in weakly-structured problem solving, such as the design of sales promotions. In an empirical study, we find evidence that use of the CBR-system improves the quality of SP-campaign proposals. In terms of the creativity of the proposals, decision-makers who think in a highly divergent way (i.e., who tend to generate many, and diverse, ideas in response to a problem) benefit most from prolonged system usage. Creativity, in turn, is positively related to the (practical) usability of a proposal. These results suggest that the CBR-system is most effective when it is used as an idea-generation tool that reinforces the strength of divergent (creative) thinkers. A convergent thinking style, in which case the CBR-system has a compensating role, even has a negative impact on CBR-system usage. Increasing the decision-maker's personal belief in the usefulness of the system, e.g., by training or education, may help to alleviate this reluctance to use the CBR-system. Keywords: marketing management support systems; sales promotions; case-based reasoning; weakly-structured decision making
Making inferences with small numbers of training sets
A potential methodological problem with empirical studies that assess project effort prediction systems is discussed. Frequently, a hold-out strategy is deployed so that the data set is split into a training and a validation set. Inferences are then made concerning the relative accuracy of the different prediction techniques under examination. This is typically done on very small numbers of sampled training sets. It is shown that such studies can lead to almost random results (particularly where relatively small effects are being studied). To illustrate this problem, two data sets are analysed using a configuration problem for case-based prediction and results generated from 100 training sets. This enables results to be produced with quantified confidence limits. From this it is concluded that in both cases using fewer than five training sets leads to untrustworthy results, and ideally more than 20 sets should be deployed. Unfortunately, this raises a question over a number of empirical validations of prediction techniques, and so it is suggested that further research is needed as a matter of urgency.
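The effect of the number of sampled training sets can be sketched as follows. This is a minimal illustration, not the procedure used in the paper: it assumes synthetic project data, a one-nearest-neighbour predictor as a stand-in for case-based prediction, mean absolute error as the accuracy measure, and a normal approximation for the confidence limits. All function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def make_projects(n, rng):
    """Synthetic (size, effort) pairs: effort grows with size plus noise."""
    projects = []
    for _ in range(n):
        size = rng.uniform(10, 500)
        effort = 3.0 * size + rng.gauss(0, 100)
        projects.append((size, effort))
    return projects


def holdout_error(projects, train_size, rng):
    """Mean absolute error of a 1-nearest-neighbour predictor on one random split."""
    shuffled = projects[:]
    rng.shuffle(shuffled)
    train, valid = shuffled[:train_size], shuffled[train_size:]
    errors = []
    for size, effort in valid:
        nearest = min(train, key=lambda case: abs(case[0] - size))
        errors.append(abs(nearest[1] - effort))
    return statistics.mean(errors)


def repeated_holdout(projects, n_splits, train_size, seed=0):
    """Repeat the hold-out procedure on n_splits randomly sampled training sets."""
    rng = random.Random(seed)
    return [holdout_error(projects, train_size, rng) for _ in range(n_splits)]


def mean_ci(values, z=1.96):
    """Mean and half-width of an approximate 95% interval (normal approximation)."""
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), half_width


if __name__ == "__main__":
    projects = make_projects(80, random.Random(42))
    for n_splits in (3, 5, 20, 100):
        mean, half = mean_ci(repeated_holdout(projects, n_splits, train_size=50))
        print(f"{n_splits:>3} training sets: MAE = {mean:.1f} +/- {half:.1f}")
```

Running the script with different seeds shows how much an estimate based on a handful of training sets can move around compared with one based on many.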
Search Heuristics, Case-Based Reasoning and Software Project Effort Prediction
This paper reports on the use of search techniques to help optimise a case-based reasoning (CBR) system for predicting software project effort. A major problem, common to ML techniques in general, has been dealing with large numbers of case features, some of which can hinder the prediction process. Unfortunately, searching for the optimal feature subset is a combinatorial problem and therefore NP-hard. This paper examines the use of random searching, hill climbing and forward sequential selection (FSS) to tackle this problem. Results from examining a set of real software project data show that even random searching was better than using all available features (average error 35.6% rather than 50.8%). Hill climbing and FSS both produced results substantially better than the random search (15.3% and 13.1% respectively), but FSS was more computationally efficient. Providing a description of the fitness landscape of a problem along with search results is a step towards the classification of search problems and their assignment to optimum search techniques. This paper attempts to describe the fitness landscape of this problem by combining the results from random searches and hill climbing, as well as using multi-dimensional scaling to aid visualisation. Amongst other findings, the visualisation results suggest that some form of heuristic-based initialisation might prove useful for this problem.
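Forward sequential selection can be sketched as a greedy loop that adds one feature at a time. The version below is an illustration under stated assumptions, not the paper's implementation: it uses a one-nearest-neighbour predictor with squared Euclidean distance, leave-one-out mean absolute error as the fitness measure (the paper reports percentage errors), and a stopping rule that halts when no remaining feature improves the error. The function names and the toy data are hypothetical.

```python
def loo_error(cases, targets, features):
    """Leave-one-out mean absolute error of 1-nearest-neighbour using `features`."""
    total = 0.0
    for i, case in enumerate(cases):
        best_dist, best_j = None, None
        for j, other in enumerate(cases):
            if i == j:
                continue
            dist = sum((case[f] - other[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_j = dist, j
        total += abs(targets[best_j] - targets[i])
    return total / len(cases)


def forward_sequential_selection(cases, targets):
    """Greedily add the feature that most reduces error; stop when none helps."""
    remaining = set(range(len(cases[0])))
    selected, best_error = [], float("inf")
    while remaining:
        error, feature = min(
            (loo_error(cases, targets, selected + [f]), f) for f in sorted(remaining)
        )
        if error >= best_error:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_error = error
    return selected, best_error


if __name__ == "__main__":
    # Toy data: feature 0 tracks the target, feature 1 is misleading.
    cases = [(0, 0), (1, 10), (2, 1), (3, 11), (4, 2), (5, 12)]
    targets = [0, 1, 2, 3, 4, 5]
    print(forward_sequential_selection(cases, targets))
```

With n features this evaluates on the order of n squared subsets in the worst case rather than all 2 to the power n, which is the kind of saving that makes a greedy search attractive here, at the cost of possibly missing the optimal subset.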
Science and Technology Cooperation in Cross-border Regions: A Proximity Approach with Evidence for Northern Europe
Given the sheer number of cross-border regions (CBRs) within the EU, their socio-economic importance has been recognized both by policy-makers and academics. Recently, the novel concept of cross-border regional innovation system has been introduced to guide the assessment of integration processes in CBRs. A central focus of this concept is analyzing the impact of varying types of proximity (cognitive, technological, etc.) on cross-border cooperation. Previous empirical applications of the concept have, however, relied on individual case studies and varying methodologies, thus complicating and constraining comparisons between different CBRs. Here a broader view is provided by comparing 28 Northern European CBRs. The empirical analysis utilizes economic, science and technology (S&T) statistics to construct proximity indicators and measures S&T integration in the context of cross-border cooperation. The findings from descriptive statistics and exploratory count data regressions show that technological and cognitive proximity measures are significantly related to S&T cooperation activities (cross-border co-publications and co-patents). Taken together, our empirical approach underlines the feasibility of utilizing the proximity approach for comparative analyses in CBR settings.