6 research outputs found

Feature Selection and Generalisation for Retrieval of Textual Cases

Author: G. Sakkis
G. Salton
J. Jarmulak
M. Lenz
S. Das
T. Mitchell
Publication venue: Proceedings of the 7th European Conference on Case-Based Reasoning. Lecture Notes in Artificial Intelligence
Publication date: 01/01/2004

Textual CBR systems solve problems by reusing experiences that are in textual form. Knowledge-rich comparison of textual cases remains an important challenge for these systems. However, mapping text data into a structured case representation requires a significant knowledge engineering effort. In this paper we look at automated acquisition of the case indexing vocabulary as a two-step process involving feature selection followed by feature generalisation. Boosted decision stumps are employed as a means to select features that are predictive and relatively orthogonal. Association rule induction is employed to capture feature co-occurrence patterns. Generalised features are constructed by applying these rules. Essentially, rules preserve implicit semantic relationships between features, and applying them has the desired effect of bringing together cases that would otherwise have been overlooked during case retrieval. Experiments with four textual data sets show significant improvement in retrieval accuracy whenever generalised features are used. The results further suggest that boosted decision stumps with generalised features are a promising combination.
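
The generalisation step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mine_rules` and `generalise`, the one-to-one rule form, the thresholds, and the toy cases are all assumptions made for illustration, and the boosted-decision-stump selection step is not shown. The sketch mines rules of the form a -> b from term co-occurrence and then adds the implied term to a case, weighted by rule confidence.

```python
from itertools import permutations

def mine_rules(cases, min_support=2, min_confidence=0.6):
    """Mine one-to-one rules a -> b from cases given as sets of terms.

    Thresholds are hypothetical defaults chosen for the toy data below.
    Returns a dict mapping (a, b) to the rule confidence P(b | a).
    """
    vocab = set().union(*cases)
    count = {t: sum(t in c for c in cases) for t in vocab}
    rules = {}
    for a, b in permutations(sorted(vocab), 2):
        both = sum(a in c and b in c for c in cases)
        if both >= min_support and both / count[a] >= min_confidence:
            rules[(a, b)] = both / count[a]
    return rules

def generalise(case, rules):
    """Return term weights: 1.0 for observed terms, confidence for implied ones.

    Rules are applied in a single pass (no chaining of implied terms).
    """
    weights = {t: 1.0 for t in case}
    for (a, b), conf in rules.items():
        if a in case and b not in case:
            weights[b] = max(weights.get(b, 0.0), conf)
    return weights

# Toy cases (invented for illustration only).
cases = [
    {"printer", "toner", "smudge"},
    {"printer", "toner"},
    {"printer", "paper", "jam"},
    {"toner", "smudge"},
]
rules = mine_rules(cases)
generalised = generalise({"smudge"}, rules)
print(generalised)
```

In this toy data a query that mentions only "smudge" also receives the term "toner", so it can match cases that never use the word "smudge", which is the retrieval effect the abstract attributes to generalised features.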

Research at Sofia University

CiteSeerX

Crossref

Automatic case acquisition from texts for process-oriented case-based reasoning

Author: Le Ber Florence
Dufour-Lussier Valmi
Lieber Jean
Nauer Emmanuel
Publication venue: 'Elsevier BV'
Publication date: 20/12/2012

This paper introduces a method for the automatic acquisition of a rich case representation from free text for process-oriented case-based reasoning. Case engineering is among the most complicated and costly tasks in implementing a case-based reasoning system. This is especially so for process-oriented case-based reasoning, where more expressive case representations are generally used and, in our opinion, actually required for satisfactory case adaptation. In this context, the ability to acquire cases automatically from procedural texts is a major step forward for reasoning on processes. We therefore detail a methodology that makes case acquisition from processes described as free text possible, with special attention given to assembly instruction texts. This methodology extends the techniques we used to extract actions from cooking recipes. We argue that techniques taken from natural language processing are required for this task, and that they give satisfactory results. An evaluation based on our implemented prototype extracting workflows from recipe texts is provided. Comment: In press, publication planned for 201

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Textual case-based reasoning

Author: Ashley Kevin D.
Bruninghaus Stefanie
Weber Rosina O.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 15/08/2007

The Knowledge Engineering Review, 20(3): pp. 255-260. This commentary provides a definition of textual case-based reasoning (TCBR) and surveys research contributions according to four research questions. We also describe how TCBR can be distinguished from text mining and information retrieval. We conclude with potential directions for TCBR research.

Drexel Libraries E-Repository and Archives

Integrating selection-based aspect sentiment and preference knowledge for social recommender systems.

Author: Chen Yoke Yie
Lothian Robert
Wiratunga Nirmalie
Publication venue: 'Emerald'
Publication date: 13/12/2019

Purpose: Recommender system approaches such as collaborative and content-based filtering rely on user ratings and product descriptions to recommend products. More recently, recommender system research has focussed on exploiting knowledge from user-generated content such as product reviews to enhance recommendation performance. The purpose of this paper is to show that the performance of a recommender system can be enhanced by integrating explicit knowledge extracted from product reviews with implicit knowledge extracted from analysis of consumers' purchase behaviour. Design/methodology/approach: The authors introduce a sentiment and preference-guided strategy for product recommendation by integrating not only explicit, user-generated and sentiment-rich content but also implicit knowledge gleaned from users' product purchase preferences. Integration of both of these knowledge sources helps to model sentiment over a set of product aspects. The authors show how established dimensionality reduction and feature weighting approaches from text classification can be adopted to weight and select an optimal subset of aspects for recommendation tasks. The authors compare the proposed approach against several baseline methods as well as the state-of-the-art better method, which recommends products that are superior to a query product. Findings: Evaluation results from seven different product categories show that aspect weighting and selection significantly improve state-of-the-art recommendation approaches. Research limitations/implications: The proposed approach recommends products by analysing user sentiment on product aspects. Therefore, the proposed approach can be used to develop recommender systems that can explain to users why a product is recommended. This is achieved by presenting an analysis of sentiment distribution over individual aspects that describe a given product.
Originality/value: This paper describes a novel approach to integrate consumer purchase behaviour analysis and aspect-level sentiment analysis to enhance recommendation. In particular, the authors introduce the idea of aspect weighting and selection to help users identify better products. Furthermore, the authors demonstrate the practical benefits of this approach on a variety of product categories and compare the approach with the current state-of-the-art approaches.

Open Access Institutional Repository at Robert Gordon University

Nottingham Trent Institutional Repository (IRep)

Identifying facts for TCBR

Author: Proctor Jason M.
Waldstein Ilya
Weber Rosina O.
Publication date: 17/12/2007

Paper presented at The Sixth International Conference on Case-Based Reasoning, Chicago, IL. This paper explores a method to algorithmically distinguish case-specific facts from potentially reusable or adaptable elements of cases in a textual case-based reasoning (TCBR) system. In the legal domain, documents often contain case-specific facts mixed with case-neutral details of law, precedent, conclusions the attorneys reach by applying their interpretation of the law to the case facts, and other aspects of argumentation that attorneys could potentially apply to similar situations. The automated distinction of these two categories, namely facts and other elements, has the potential to improve the quality of automated textual case acquisition. The goal is ultimately to distinguish case problem from solution. To separate fact from other elements, we use an information gain (IG) algorithm to identify words that serve as efficient markers of one or the other. We demonstrate that this technique can successfully distinguish case-specific fact paragraphs from others, and propose future work to overcome some of the limitations of this pilot project.
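
As a rough illustration of how information gain can rank marker words, the sketch below computes the standard IG of a word's presence with respect to a binary fact/other label. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the whitespace tokenisation, and the four example paragraphs are all invented for illustration.

```python
import math

def entropy(pos, neg):
    """Binary entropy (in bits) of a group with pos and neg members."""
    total = pos + neg
    h = 0.0
    for n in (pos, neg):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h

def information_gain(paragraphs, labels, word):
    """IG of a word's presence with respect to binary labels (True = fact)."""
    with_w, without = [], []
    for text, lab in zip(paragraphs, labels):
        # Naive lowercase whitespace tokenisation (an assumption).
        (with_w if word in text.lower().split() else without).append(lab)

    def h(group):
        return entropy(sum(group), len(group) - sum(group)) if group else 0.0

    n = len(labels)
    return h(labels) - len(with_w) / n * h(with_w) - len(without) / n * h(without)

# Invented toy paragraphs: the first two are labelled as fact paragraphs.
paragraphs = [
    "the defendant signed the lease in march",
    "the defendant paid no rent",
    "the statute requires written notice",
    "the court applies the precedent",
]
labels = [True, True, False, False]

ig_defendant = information_gain(paragraphs, labels, "defendant")
ig_the = information_gain(paragraphs, labels, "the")
print(ig_defendant, ig_the)
```

Here "defendant" splits the toy paragraphs perfectly into fact and non-fact groups, so its gain equals the full label entropy of one bit, while "the" occurs everywhere and carries no information.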

Drexel Libraries E-Repository and Archives

Réutilisation d'entités nommées pour la réponse au courriel

Author: Danet Laurent
Publication venue: Bibliothèque de l'Université Laval
Publication date: 01/01/2006

An automatic e-mail response system is one way to ease the workload of business services such as customer service or investor relations, which receive a large volume of often repetitive e-mail requests. We adapt a case-based reasoning (CBR) approach to this problem. The approach reuses earlier messages to respond to new e-mails: an adequate answer is selected from the archived messages and then adapted to make it relevant to the context of the new request. The goal of this work is to define a procedure that helps the user of an e-mail response system reuse the named entities of earlier e-mails. Named entities depend strongly on the context of the antecedent message and therefore need adaptation before they can be reused. We divide the reuse process into two tasks: (a) identifying the modifiable portions of an antecedent message, and (b) selecting the portions to be adapted to build the answer to the request. Both tasks require knowledge, and our research question is whether adaptive approaches based on machine learning techniques can acquire the knowledge needed to reuse named entities effectively. The first task resembles information extraction, although we consider only named entities and their specializations. The second task corresponds to text categorization: the request is used to assign a class to the answer to be built, and the class indicates which entities must be adapted. We studied and compared different approaches for both tasks. For extraction, we tested manual and automatic, top-down and bottom-up approaches on an e-mail corpus. The manual approach gives excellent results, but performance degrades for the automatic approaches. For categorization, we evaluated different text and word representations, the use of word weights, and the impact of compression obtained through association rules. Association rules compress the data without degrading the results much. Overall, the results are generally satisfactory and indicate that our approach, composed of the two tasks described above, could be applied to the problem of automatic e-mail response.

CorpusUL