59,423 research outputs found
An investigation into establishing a generalised approach for defining similarity metrics between 3D shapes for case-based reasoning (CBR)
This thesis investigates the feasibility of establishing a generalised approach for defining similarity metrics between 3D shapes for the casting design problem in Case-Based Reasoning (CBR).
This research investigates a new approach for improving the quality of casting design advice obtained from a CBR system using casting design knowledge associated with past cases. The new approach enhances the similarity metrics used in previous research in this area in order to improve the advice given. The new similarity metrics proposed here are based on the decomposition of casting shape cases into a set of components. The research defines and uses the Component Type Similarity Metric (CTM) and the Maximum Common Subgraph (MCS) metric between graph representations of the case shapes. These metrics focus on the definition of partial similarity between components of the same type, taking into account the geometrical features and proportions of each shape component. The investigation also extends the scope of the research to 3D shapes by defining and evaluating a new metric for the overall similarity between 3D shapes. Finally, this research investigates a methodology for the integration of the CBR cycle and automation of the feature extraction from target and source case shapes.
The ShapeCBR system has been developed to demonstrate the feasibility of integrating the CBR approach for retrieving and reusing casting design advice. The ShapeCBR system automates the decomposition process, the classification process and the shape matching process and is used to evaluate the new similarity metrics proposed in this research and the extension of the approach to 3D shapes.
Evaluation of the new similarity metrics shows that they enhance the efficiency of the system and that the new approach provides useful casting design information for 3D casting shapes. Additionally, ShapeCBR shows that it is possible to automate the decomposition and classification of components that allow a case shape to be represented in graph form, and thus provides the basis for automating the overall CBR cycle.
The thesis concludes with new research questions that emerge from this research and an agenda for further work in the area.
Diabetes Diagnosis by Case-Based Reasoning and Fuzzy Logic
In the medical field, experts' knowledge is based on experience, theoretical knowledge and rules. Case-based reasoning is a problem-solving paradigm which is based on past experiences. For this purpose, a large number of decision support applications based on CBR have been developed. Case retrieval is often considered the most important step of case-based reasoning. In this article, we integrate fuzzy logic and data mining to improve the response time and the accuracy of the retrieval of similar cases. The proposed Fuzzy CBR is composed of two complementary parts: classification by a fuzzy decision tree, realized with Fispro, and case-based reasoning, realized with the JColibri platform. The use of fuzzy logic aims to reduce the complexity of calculating the degree of similarity between diabetic patients who require different monitoring plans. The results of the proposed approach are compared with earlier methods using accuracy as the metric. The experimental results indicate that the fuzzy decision tree is very effective in improving the accuracy of diabetes classification and hence in improving the retrieval step of CBR.
Study of similarity metrics for matching network-based personalised human activity recognition.
Personalised Human Activity Recognition (HAR) models trained using data from the target user (subject-dependent) have been shown to be superior to non-personalised models that are trained on data from a general population (subject-independent). However, from a practical perspective, collecting sufficient training data from end users to create subject-dependent models is not feasible. We have previously introduced an approach based on matching networks which has proved effective for training personalised HAR models while requiring very little data from the end user. Matching networks perform nearest-neighbour classification by reusing the class label of the most similar instances in a provided support set, which makes them very relevant to case-based reasoning. A key advantage of matching networks is that they use metric learning to produce feature embeddings or representations that maximise classification accuracy, given a chosen similarity metric. However, to the best of our knowledge, no study has examined the performance of different similarity metrics for matching networks. In this paper, we present a study of five different similarity metrics for personalised HAR: Euclidean, Manhattan, Dot Product, Cosine and Jaccard. Our evaluation shows that substantial differences in performance are achieved using different metrics, with Cosine and Jaccard producing the best performance.
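The five metrics named in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the distances are negated so that larger always means more similar, the Jaccard variant shown is the weighted (min/max) form for non-negative vectors, which is an assumption since the abstract does not say which generalisation was used, and `nearest_label` is a plain nearest-neighbour lookup over a support set rather than a trained matching network.

```python
import math

def euclidean_similarity(a, b):
    # Negated Euclidean distance, so that larger means more similar.
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_similarity(a, b):
    # Negated Manhattan (L1) distance.
    return -sum(abs(x - y) for x, y in zip(a, b))

def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product normalised by both vector lengths; 0.0 if either is zero.
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm if norm else 0.0

def jaccard_similarity(a, b):
    # Assumption: weighted Jaccard for non-negative vectors, sum(min) / sum(max).
    denominator = sum(max(x, y) for x, y in zip(a, b))
    return sum(min(x, y) for x, y in zip(a, b)) / denominator if denominator else 1.0

def nearest_label(query, support, similarity):
    # Reuse the label of the most similar (embedding, label) pair in the support set.
    return max(support, key=lambda item: similarity(query, item[0]))[1]

# Hypothetical two-dimensional embeddings for two activities.
support_set = [([1.0, 0.0], "walking"), ([0.0, 1.0], "running")]
for metric in (euclidean_similarity, manhattan_similarity, dot_product,
               cosine_similarity, jaccard_similarity):
    print(metric.__name__, nearest_label([0.9, 0.1], support_set, metric))
```

On this toy support set every metric agrees; the abstract's point is that on learned embeddings of real sensor data they do not.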
Investigating Text Message Classification Using Case-based Reasoning
Text classification is the categorization of text into a predefined set of categories. Text classification is becoming increasingly important given the large volume of text stored electronically, e.g. email, digital libraries and the World Wide Web (WWW). These documents represent a massive amount of information that can be accessed easily. Benefiting from this information requires organisation. One way of organising it automatically is to use text classification. A number of well-known machine learning techniques have been used in text classification, including Naïve Bayes, Support Vector Machines and Decision Trees, and, less commonly, k-Nearest Neighbour, Neural Networks and Genetic Algorithms. One aspect of text classification is general message classification, the ability to correctly classify text messages of different lengths. There are many applications that would benefit from this. Examples include personal email filtering (sorting email into business, personal and spam categories) and email routing, e.g. routing email for a helpdesk so that it reaches the correct person. This thesis presents an investigation of applying a Case-Based Reasoning (CBR) approach to general text message classification. Case-based reasoning was chosen as it was found to perform well for a particular type of message classification, spam filtering. CBR was found to have certain advantages over other machine learning techniques such as Naïve Bayes. It was able to handle the dynamic nature of spam better than other machine learning techniques, and it allowed the training data to be updated continuously and new training data to be available immediately. The objective of this research is to extend previous work conducted on spam filtering to general message classification, which includes classifying short and long text messages into multiple categories.
Short text message classification presents a particular challenge as the concept being learnt is weak. We investigated two types of similarity metrics used with CBR: feature-based and featureless similarity metrics. We then compared CBR using both feature-based and featureless similarity metrics with two well-known machine learning techniques, Naïve Bayes (NB) and Support Vector Machines (SVM). These two techniques serve as baseline classifiers as they currently seem to be the classifiers of choice in the text classification domain. The results of this research show that CBR using a featureless similarity metric achieves better performance than CBR using a feature-based similarity metric. The results also show that when using CBR with a feature-based similarity metric, the classification task required different feature types and different feature representations, depending on the domain. We also investigated whether a case-base editing technique developed for spam case-bases improves performance over unedited case-bases in different text domains. We found that the case-base editing technique used for spam filtering performs well for email-based case-bases but not for other text domains of either short or long text messages.
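To make the idea of a featureless similarity metric concrete, here is a minimal sketch that compares raw message text without extracting any features. It assumes a compression-based measure, the normalised compression distance (NCD), computed with `zlib`; the abstract does not state which featureless metric the thesis used, so this choice, the function names and the tiny case base are all illustrative assumptions.

```python
import zlib

def compressed_size(text: str) -> int:
    # Length in bytes of the zlib-compressed UTF-8 encoding of the text.
    return len(zlib.compress(text.encode("utf-8"), 9))

def ncd(x: str, y: str) -> float:
    # Normalised compression distance: small when x and y share content,
    # because the concatenation then compresses almost as well as one alone.
    cx, cy = compressed_size(x), compressed_size(y)
    return (compressed_size(x + y) - min(cx, cy)) / max(cx, cy)

def classify(message: str, case_base):
    # 1-nearest-neighbour retrieval: reuse the label of the closest stored case.
    return min(case_base, key=lambda case: ncd(message, case[0]))[1]

# Hypothetical case base of (message text, label) pairs.
case_base = [
    ("free prize click here to claim your free prize now", "spam"),
    ("meeting moved to thursday please update the project schedule", "business"),
]
print(classify("click here to claim your free prize", case_base))
```

Because nothing is precomputed from the cases, adding a new case is just appending a pair to `case_base`, which fits the abstract's remark that new training data can be made available immediately.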
Reasoning about Record Matching Rules
To accurately match records it is often necessary to utilize the semantics of the data. Functional dependencies (FDs) have proven useful in identifying tuples in a clean relation, based on the semantics of the data. For all the reasons that FDs and their inference are needed, it is also important to develop dependencies and their reasoning techniques for matching tuples from unreliable data sources. This paper investigates dependencies and their reasoning for record matching. (a) We introduce a class of matching dependencies (MDs) for specifying the semantics of data in unreliable relations, defined in terms of similarity metrics and a dynamic semantics. (b) We identify a special case of MDs, referred to as relative candidate keys (RCKs), to determine what attributes to compare and how to compare them when matching records across possibly different relations. (c) We propose a mechanism for inferring MDs, a departure from traditional implication analysis, such that when we cannot match records by comparing attributes that contain errors, we may still find matches by using other, more reliable attributes. (d) We provide an O(n^2) time algorithm for inferring MDs, and an effective algorithm for deducing a set of RCKs from MDs. (e) We experimentally verify that the algorithms help matching tools efficiently identify keys at compile time for matching, blocking or windowing, and that the techniques effectively improve both the quality and efficiency of various record matching methods.
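The role of relative candidate keys in point (b) can be illustrated with a small sketch. It treats each key as a list of (left attribute, right attribute, comparator) triples and declares two records a match when every comparison in at least one key succeeds. All names, attributes, thresholds and records below are hypothetical, the similarity operator is approximated with `difflib.SequenceMatcher` from the Python standard library, and the sketch does not implement the paper's MD inference algorithm or its dynamic semantics.

```python
from difflib import SequenceMatcher

def equal(a: str, b: str) -> bool:
    return a == b

def similar(threshold: float):
    # Build a comparator that accepts strings whose similarity ratio
    # (case-insensitive) reaches the given threshold.
    def compare(a: str, b: str) -> bool:
        return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold
    return compare

# Each key says which attributes to compare and how to compare them.
keys = [
    [("email", "email", equal)],
    [("last_name", "surname", equal), ("address", "post_addr", similar(0.7))],
]

def records_match(left: dict, right: dict, keys) -> bool:
    # A match needs every comparison of at least one key to succeed.
    return any(all(compare(left[a], right[b]) for a, b, compare in key)
               for key in keys)

card = {"last_name": "Smith", "address": "10 Oak Street",
        "email": "m.smith@example.com"}
billing = {"surname": "Smith", "post_addr": "10 Oak St",
           "email": "msmith@example.org"}
print(records_match(card, billing, keys))
```

Here the email values disagree, so the first key fails, but the second key still matches the records through the name and the approximately equal address. This mirrors the abstract's point that other, more reliable attributes can be used when some attributes contain errors.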
Combining case based reasoning with neural networks
This paper presents a neural network based technique for mapping problem situations to problem solutions for Case-Based Reasoning (CBR) applications. Both neural networks and CBR are instance-based learning techniques, although neural nets work with numerical data and CBR systems work with symbolic data. This paper discusses how the application scope of both paradigms could be enhanced by the use of hybrid concepts. To make the use of neural networks possible, the problem's situation and solution features are transformed into continuous features, using techniques similar to CBR's definition of similarity metrics. Radial Basis Function (RBF) neural nets are used to create a multivariable, continuous input-output mapping. As the mapping is continuous, this technique also provides generalisation between cases, replacing the domain specific solution adaptation techniques required by conventional CBR. This continuous representation also allows, as in fuzzy logic, an associated membership measure to be output with each symbolic feature, aiding the prioritisation of various possible solutions. A further advantage is that, as the RBF neurons are only active in a limited area of the input space, the solution can be accompanied by local estimates of accuracy, based on the sufficiency of the cases present in that area as well as the results measured during testing. We describe how the application of this technique could be of benefit to the real world problem of sales advisory systems, among others.
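The continuous mapping and the local estimate of case sufficiency described in this abstract can be sketched roughly as follows. This is a simplified stand-in, not the paper's network: it places one Gaussian unit on each stored case and returns the activation-weighted average of the case solutions (a normalised RBF with no training step), together with the total activation as a crude indicator of how well the case base covers the query. The function name, the `width` parameter and the example cases are assumptions for illustration.

```python
import math

def rbf_predict(query, cases, width=1.0):
    # cases: list of (feature_vector, numeric_solution) pairs.
    # One Gaussian unit is centred on each case; a unit is only noticeably
    # active for queries close to its centre.
    activations = [
        math.exp(-sum((q - c) ** 2 for q, c in zip(query, features))
                 / (2 * width ** 2))
        for features, _ in cases
    ]
    support = sum(activations)  # low support means few cases near the query
    if support == 0.0:
        return None, 0.0
    prediction = sum(a * solution
                     for a, (_, solution) in zip(activations, cases)) / support
    return prediction, support

# Hypothetical cases: one continuous situation feature, one continuous solution.
cases = [([0.0], 0.0), ([1.0], 10.0)]
print(rbf_predict([0.5], cases))   # between the two cases, well supported
print(rbf_predict([5.0], cases))   # far from every case, very low support
```

The prediction varies smoothly between cases, which is the generalisation the abstract refers to, while the second return value falls towards zero away from the stored cases and could be used to flag solutions that rest on too little evidence.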