
    Perks of Being Lazy: Boosting retrieval performance

    Case-Based Reasoning (CBR) is a lazy learning method; as such, when a new query is made to a CBR system, the swiftness of its retrieval phase is critical to overall system performance. The ubiquity of data today is an opportunity for CBR systems, as it implies more cases to reason with. Nevertheless, this availability also challenges CBR retrieval, since distance calculations become computationally expensive. A good example of a domain where the case base grows substantially over time is patient health records, where a query is typically an incremental update to prior cases. To address the retrieval performance challenge in such domains, where cases are sequentially related, we introduce a novel method that significantly reduces the number of cases assessed in the search for exact nearest neighbors (NNs). In particular, when distance measures are metrics, they satisfy the triangle inequality, and our method leverages this property as a cutoff in NN search. Specifically, retrieval is conducted in a lazy manner where only the cases that are true NN candidates for a query are evaluated. We demonstrate how a considerable number of unnecessary distance calculations is avoided in synthetically built domains exhibiting different problem feature characteristics and different cluster diversity.

    This work has been partially funded by project Innobrain, COMRDI-151-0017 (RIS3CAT comunitats), and FEDER funds. Mehmet Oğuz Mülâyim is a PhD student in the doctoral program in Computer Science at the Universitat Autònoma de Barcelona. Peer reviewed.
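    The triangle-inequality cutoff the abstract describes can be sketched as follows. This is a generic pivot-based pruning illustration, not the authors' exact method: if a pivot's distances to the cases are precomputed, then for any case c the bound d(q, c) >= |d(q, p) - d(c, p)| lets us skip the full distance calculation whenever the bound already exceeds the best distance found so far. All function names here are hypothetical.

    ```python
    import math

    def euclidean(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def nn_with_triangle_cutoff(query, cases, pivot):
        """Exact nearest-neighbor search that skips cases whose
        triangle-inequality lower bound exceeds the best distance so far."""
        d_qp = euclidean(query, pivot)
        # In a real system these pivot distances would be cached at insert time.
        pivot_dists = [euclidean(c, pivot) for c in cases]
        best, best_d, skipped = None, float("inf"), 0
        for c, d_cp in zip(cases, pivot_dists):
            # Triangle inequality: d(q, c) >= |d(q, p) - d(c, p)|
            if abs(d_qp - d_cp) >= best_d:
                skipped += 1  # the true NN cannot be here; no full distance needed
                continue
            d = euclidean(query, c)
            if d < best_d:
                best, best_d = c, d
        return best, best_d, skipped
    ```

    With a query close to the pivot, all distant cases are pruned by the bound alone, which is the "lazy" saving the paper quantifies.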

    CBR Meets Big Data: A Case Study of Large-Scale Adaptation Rule Generation

    Adaptation knowledge generation is a difficult problem for CBR. In previous work we developed ensembles of adaptation for regression (EAR), a family of methods for generating and applying ensembles of adaptation rules for case-based regression. EAR has been shown to provide good performance, but at the cost of high computational complexity. When efficiency problems result from case base growth, a common CBR approach is to focus on case base maintenance, to compress the case base. This paper presents a case study of an alternative approach, harnessing big data methods, specifically MapReduce and locality sensitive hashing (LSH), to make the EAR approach feasible for large case bases without compression. Experimental results show that the new method, BEAR, substantially increases accuracy compared to a baseline big data k-NN method using LSH. BEAR's accuracy is comparable to that of traditional k-NN without using LSH, while its processing time remains reasonable for a case base of millions of cases. We suggest that increased use of big data methods in CBR has the potential for a departure from compression-based case-base maintenance methods, with their concomitant solution quality penalty, to enable the benefits of full case bases at much larger scales.
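    A minimal sketch of the kind of LSH-based k-NN the abstract uses as a baseline: random-hyperplane hashing places similar cases in the same bucket, so a query scans one bucket rather than the whole case base. This single-table, in-memory version is an assumption-laden illustration (class and parameter names are hypothetical), and the paper's MapReduce distribution across machines is omitted.

    ```python
    import random

    def hash_vector(v, hyperplanes):
        """One signature bit per hyperplane: which side of the plane v falls on."""
        return tuple(int(sum(h_i * v_i for h_i, v_i in zip(h, v)) >= 0)
                     for h in hyperplanes)

    class LSHIndex:
        """Random-hyperplane LSH: nearby vectors tend to share a signature,
        so k-NN search only scans one bucket instead of the full case base."""

        def __init__(self, dim, n_planes=8, seed=0):
            rng = random.Random(seed)
            self.hyperplanes = [[rng.gauss(0, 1) for _ in range(dim)]
                                for _ in range(n_planes)]
            self.buckets = {}

        def add(self, v):
            self.buckets.setdefault(hash_vector(v, self.hyperplanes), []).append(v)

        def query(self, q, k=1):
            # Only the colliding bucket is scanned; results are approximate.
            candidates = self.buckets.get(hash_vector(q, self.hyperplanes), [])
            dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
            return sorted(candidates, key=lambda c: dist(q, c))[:k]
    ```

    The accuracy gap the paper reports between LSH-based k-NN and exact k-NN comes from exactly this approximation: a true neighbor hashed into a different bucket is never considered.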