323 research outputs found

    Formalisation and evaluation of focus theories for requirements elicitation dialogues in natural language

    Requirements engineering is an important part of software engineering. It consists of defining the needs of users when building a new system. These needs may be functional, i.e., what services the system should be able to provide, as well as non-functional, i.e., under which constraints the system should operate. Errors in requirements may have disastrous effects on the rest of the software engineering process (Brooks 1995, p.199), since they would lead to the construction of a system of little interest to its users or would require expensive modifications to correct. Because requirements documents may be very large, errors are usually hard to detect manually. Computer support is therefore often beneficial for their analysis. This is made easier if requirements are expressed formally. However, this support must also be adapted to and be usable by the people who are expressing their requirements. These people are usually not computer specialists and are not accustomed to using formal languages. It is therefore necessary to help them express their requirements. Numerous approaches have been suggested as aids to the acquisition of requirements (Reubenstein 1990). Much less attention has been paid to the control of the dialogue taking place between the users and the system whilst using such frameworks (Bubenko et al. 1994). Frameworks for requirements acquisition are not normally accompanied by theories of the types of dialogue which they support. Our ability to develop sophisticated formal frameworks to analyse requirements makes this deficiency more acutely felt, since increases in formality are often accompanied by greater difficulty in understanding and using the frameworks (Robertson et al. 1989). Users write their requirements in more or less natural language. This is then translated into a formal language that can be interpreted by the elicitation module. This module works on the requirements and provides feedback.
The translation process is then applied to convert feedback into more or less natural language. Different systems put different emphasis on the parts of that general architecture. Some are very good at natural language interpretation while others put more emphasis on analysing the requirements and providing feedback. Natural language approaches to requirements elicitation put an emphasis on natural language interpretation (see section 1.2.1). In these approaches, users write their specification in a subset of natural language. The system then translates it into a formal notation. The main benefit provided by these approaches is the improvement in the ease of use of the system: natural language is the main means of communication for human beings and does not need to be learned. However, most of these approaches do not provide a dialogue well suited to the requirements elicitation process. Because they translate the natural language specification into a formal notation but do not provide guidance on how to write the specification in the first place, users are left in charge of writing correct requirements. If a mistake is made while writing the specification, it will simply be translated into the formal notation. In order to actively help users in the process of writing the requirements, the elicitation system must interact with them. The emphasis, here, is no longer on translating requirements, but on actively extracting them through a dialogue with users. This is useful, since the requirements elicitation process is complex, and offering guidance is a big help for users. Unfortunately, most of the approaches providing guidance expose their formal underlying frameworks directly to users (see section 1.2.2). In order to benefit from the guidance provided, users have to learn the idiosyncrasies of the system they use. The task of providing guidance is complicated by the fact that there are numerous ways of carrying out the requirements elicitation.
Very little research has been done on how best to organise the elicitation process to provide effective guidance. An arbitrary choice could be made, but forcing users to adopt a predefined method is usually not possible as it would make the elicitation process very difficult to follow and understand. The system must therefore be able to adapt itself to various elicitation methods. On the other hand, it is necessary for the system to make choices in order to provide active guidance. A "least-commitment" strategy, such as asking users at every choice point what to do next, is not a useful approach (Ferguson et al. 1996). One way of offering guidance without restricting users too much is by communicating with them in natural language, and by using natural language constraints to inform the choices made by the system to select a guidance strategy. These constraints ensure that the system adopts a strategy that will guide users in a natural and understandable manner, by taking into account the current state of the dialogue. In other words, the system takes into account the current state of the specification to help users complete it, but the current state of the dialogue is the principal factor constraining what will be spoken about next. Using such an approach reduces some of the problems discussed above. The specification does not need to be immediately correct as it will be checked and reworked by the system. The formal framework is hidden from users but is still there to ensure the correctness of the specifications. Guidance is continuously offered through dialogue, which is influenced by but does not directly follow the steps of construction of the specification. The natural language constraints we use in this thesis are theories of dialogue coherence, called "focus" theories. They define what can be spoken about next in a dialogue based on what has already been discussed and the subject under discussion.
The theories take into account what participants in a dialogue pay attention to and try to ensure that the rest of the dialogue is related to it. The system tries to help its users define what a research group WWW site should look like. The way the dialogue evolves from discussing the research group, to discussing the site and its associated home page, to discussing the set of publications can quite easily be followed. The use of pronouns helps in making the text feel natural. It would have been difficult to achieve the same result without using focus rules. Other techniques for organising dialogues, such as those based on the intentions underlying the dialogue (Cohen et al. 1990), would require the dialogue manager to know what the elicitation system is trying to achieve and what its plan is. For some elicitation systems, this knowledge may not be available. Similarly, techniques based on the content of the communications exchanged and how they relate, e.g., based on RST (Mann and Thompson 1987), usually require a lot of domain knowledge. They are therefore time-consuming to code. Focus theories require less information from the elicitation module while enabling the dialogue manager to structure the dialogue. However, in some cases, focus theories are not sufficient to organise a dialogue. We use a theory based on speech acts (see section 3.4.1) and some ideas from Grice's work on conversation (see section 5.2.1) to deal with these cases. More generally, although we tried to minimise the impact of other theories in order to study focus theories in detail, it would be interesting to know whether and how we can integrate them with the work presented in this thesis. In particular, the notion of dialog act and its application to dialog grammar could be of interest.
General frameworks developed to study various aspects of dialogue, including dialog acts and focus, have started to appear but work is still at an early stage (C-Star Consortium 1998; Allen and Core 1997). Organising a dialogue based on attention requires a lot of domain knowledge in order to know how things mentioned in the dialogue relate to each other. Therefore, the amount of knowledge engineering needed to build natural language applications is also an important issue. We have tried to limit the engineering difficulties by clearly separating the domain knowledge needed by our dialogue manager from its management capabilities, and by providing a way of re-using the existing domain knowledge as far as possible. This is done by using rules which enable us to re-use part of the domain knowledge already used by the elicitation module. The contribution of this thesis is therefore the formalisation and evaluation of focus theories for requirements elicitation dialogues in natural language. The main questions we deal with are the following: • Which focus theories should we use? • What are the relations between the constraints imposed by the focus theories and the constraints inherent to the requirements elicitation process? • Does this approach improve the perceived quality of the dialogue between the elicitation tool and its users? A prototype system has been developed. This system mainly operates in the WWW site design domain. It has also been applied in other domains as an initial demonstration of the range of problems that can be tackled by our approach.

    A constraint-based hypergraph partitioning approach to coreference resolution

    The objectives of this thesis are focused on research in machine learning for coreference resolution. Coreference resolution is a natural language processing task that consists of determining the expressions in a discourse that mention or refer to the same entity. The main contributions of this thesis are (i) a new approach to coreference resolution based on constraint satisfaction, using a hypergraph to represent the problem and solving it by relaxation labeling; and (ii) research towards improving coreference resolution performance using world knowledge extracted from Wikipedia. The developed approach is able to use an entity-mention classification model, which is more expressive than the pair-based ones, and to overcome weaknesses of previous state-of-the-art approaches such as linking contradictions, classifications without context and a lack of information when evaluating pairs. Furthermore, the approach allows the incorporation of new information by adding constraints, and research has been done on using world knowledge to improve performance. RelaxCor, the implementation of the approach, achieved state-of-the-art results and participated in international competitions: SemEval-2010 and CoNLL-2011. RelaxCor achieved second position in CoNLL-2011. Coreference resolution is a natural language processing task that consists of determining the expressions in a discourse that refer to the same real-world entity. The task has a direct effect on text mining as well as on many natural language tasks that require discourse interpretation, such as summarisation, question answering or machine translation. Resolving coreferences is essential in order to "understand" a text or a discourse. The objectives of this thesis focus on research in coreference resolution with machine learning.
Specifically, the research objectives focus on the following areas: + Classification models: The most common classification models in the state of the art are based on the independent classification of pairs of mentions. More recently, models that classify groups of mentions have appeared. One of the objectives of the thesis is to incorporate the entity-mention model into the developed approach. + Problem representation: There is not yet a definitive representation of the problem. This thesis presents a hypergraph representation. + Resolution algorithms: Depending on the problem representation and the classification model, resolution algorithms can be very diverse. One of the objectives of this thesis is to find a resolution algorithm capable of using the classification models in the hypergraph representation. + Knowledge representation: In order to manage knowledge from several sources, a symbolic and expressive representation of that knowledge is needed. This thesis proposes the use of constraints. + Incorporation of world knowledge: Some coreferences cannot be resolved with linguistic information alone. Common sense and world knowledge are often needed to resolve coreferences. This thesis proposes a method to extract world knowledge from Wikipedia and incorporate it into the resolution system. The main contributions of this thesis are (i) a new approach to the coreference resolution problem based on constraint satisfaction, using a hypergraph to represent the problem and solving it with the relaxation labeling algorithm; and (ii) research to improve results by adding world information extracted from Wikipedia.
The presented approach can use the mention-pair and entity-mention models in combination, thereby avoiding problems found in many other state-of-the-art approaches, such as contradictions between independent classifications, lack of context and lack of information. Moreover, the presented approach allows information to be incorporated by adding constraints, and research has been carried out to add world information that improves results. RelaxCor, the system implemented during the thesis to experiment with the proposed approach, has achieved results comparable to the best in the state of the art. It took part in the international competitions SemEval-2010 and CoNLL-2011. RelaxCor obtained second position in CoNLL-2011.
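
    The relaxation labeling step mentioned in this abstract can be illustrated with a minimal sketch. It is not RelaxCor itself: it assumes simple pairwise weights between mentions (the `WEIGHTS` values are invented for illustration) rather than the hypergraph and entity-mention constraints the thesis describes, and all function names are hypothetical. Each mention holds a probability distribution over entity labels, and the standard relaxation labeling update raises the probability of labels that receive positive support from the other mentions.

```python
# Hypothetical pairwise compatibility weights between mentions (assumed values):
# positive = evidence that two mentions corefer, negative = evidence against.
WEIGHTS = {(0, 1): 0.8, (0, 2): -0.4, (1, 2): -0.5}


def weight(i, j):
    """Symmetric lookup of the weight between mentions i and j."""
    return WEIGHTS.get((min(i, j), max(i, j)), 0.0)


def initial_distributions(n):
    # Mention i may join any entity opened by mentions 0..i; start uniform.
    return [{label: 1.0 / (i + 1) for label in range(i + 1)} for i in range(n)]


def relax(n, iterations=50):
    """Run synchronous relaxation labeling updates over n mentions."""
    h = initial_distributions(n)
    for _ in range(iterations):
        updated = []
        for i in range(n):
            scores = {}
            for label, prob in h[i].items():
                # Support: weighted agreement of the other mentions on this label.
                support = sum(weight(i, j) * h[j].get(label, 0.0)
                              for j in range(n) if j != i)
                support = max(-1.0, min(1.0, support))  # keep 1 + support >= 0
                scores[label] = prob * (1.0 + support)
            total = sum(scores.values())
            # Renormalise; keep the old distribution if everything was zeroed.
            updated.append({l: s / total for l, s in scores.items()}
                           if total > 0 else dict(h[i]))
        h = updated
    return h


def entities(h):
    """Pick the most probable entity label for each mention."""
    return [max(dist, key=dist.get) for dist in h]
```

    With the assumed weights, mentions 0 and 1 end up sharing a label while mention 2 keeps its own, which is the behaviour the positive and negative weights are meant to encode.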

    Adaptable formalism for the computational analysis of English noun phrase reference

    On the Combination of Game-Theoretic Learning and Multi Model Adaptive Filters

    This paper casts coordination of a team of robots within the framework of game-theoretic learning algorithms. In particular, a novel variant of fictitious play is proposed, which uses multi-model adaptive filters to estimate other players’ strategies. The proposed algorithm can be used as a coordination mechanism between players when they must take decisions under uncertainty. Each player chooses an action after taking into account the actions of the other players and also the uncertainty. Uncertainty can arise either from noisy observations or from the existence of various types of other players. In addition, in contrast to other game-theoretic and heuristic algorithms for distributed optimisation, it is not necessary to find the optimal parameters a priori. Various parameter values can be used initially as inputs to different models. Therefore, the resulting decisions will be aggregate results of all the parameter values. Simulations are used to test the performance of the proposed methodology against other game-theoretic learning algorithms.
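
    For orientation, classic fictitious play, the baseline this paper builds on, can be sketched as follows. This is not the proposed variant: beliefs here are plain empirical counts of the other player's past actions, not multi-model adaptive filters, and the two-robot payoff table and function names are assumptions made for illustration.

```python
from collections import Counter

ACTIONS = (0, 1)


def payoff(mine, other):
    # Hypothetical coordination game (assumed values): matching on action 1
    # pays 2, matching on action 0 pays 1, and a mismatch pays 0.
    if mine != other:
        return 0.0
    return 2.0 if mine == 1 else 1.0


def best_response(opponent_counts):
    """Best action against the empirical distribution of the opponent's play."""
    total = sum(opponent_counts.values())

    def expected(action):
        return sum(payoff(action, b) * opponent_counts[b] / total
                   for b in ACTIONS)

    return max(ACTIONS, key=expected)


def fictitious_play(rounds=10):
    # beliefs[i] counts what player i has seen the OTHER player do;
    # one pseudo-count per action serves as a uniform prior.
    beliefs = [Counter({a: 1 for a in ACTIONS}) for _ in range(2)]
    history = []
    for _ in range(rounds):
        a0 = best_response(beliefs[0])
        a1 = best_response(beliefs[1])
        beliefs[0][a1] += 1  # player 0 observes player 1's action
        beliefs[1][a0] += 1  # player 1 observes player 0's action
        history.append((a0, a1))
    return history
```

    In this toy game both players settle on the higher-paying joint action from the first round; replacing the count-based beliefs with a different estimator is where a variant such as the one described above would differ.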

    Cognition-based approaches for high-precision text mining

    This research improves the precision of information extraction from free-form text via the use of cognitive-based approaches to natural language processing (NLP). Cognitive-based approaches are an important, and relatively new, area of research in NLP and search, as well as linguistics. Cognitive approaches enable significant improvements in both the breadth and depth of knowledge extracted from text. This research has made contributions in the areas of a cognitive approach to automated concept recognition in. Cognitive approaches to search, also called concept-based search, have been shown to improve search precision. Given the tremendous amount of electronic text generated in our digital and connected world, cognitive approaches enable substantial opportunities in knowledge discovery. The generation and storage of electronic text is ubiquitous, hence opportunities for improved knowledge discovery span virtually all knowledge domains. While cognition-based search offers superior approaches, challenges exist due to the need to mimic, even in the most rudimentary way, the extraordinary powers of human cognition. This research addresses these challenges in the key area of a cognition-based approach to automated concept recognition. In addition, it resulted in a semantic processing system framework for use in applications in any knowledge domain. Confabulation theory was applied to the problem of automated concept recognition. This is a relatively new theory of cognition using a non-Bayesian measure, called cogency, for predicting the results of human cognition. An innovative distance measure, derived from cogent confabulation and called inverse cogency, was used to rank-order candidate concepts during the recognition process. When used with a multilayer perceptron, it improved the precision of concept recognition by 5% over published benchmarks. Additional precision improvements are anticipated.
These research steps build a foundation for cognition-based, high-precision text mining. In the long term, it is anticipated that this foundation will enable a cognitive-based approach to automated ontology learning. Such automated ontology learning will mimic human language cognition, and will, in turn, enable the practical use of cognitive-based approaches in virtually any knowledge domain. --Abstract, page iii

    A Computational Model of Syntactic Processing: Ambiguity Resolution from Interpretation

    Syntactic ambiguity abounds in natural language, yet humans have no difficulty coping with it. In fact, the process of ambiguity resolution is almost always unconscious. It is not infallible, however, as example 1 demonstrates. 1. The horse raced past the barn fell. This sentence is perfectly grammatical, as is evident when it appears in the following context: 2. Two horses were being shown off to a prospective buyer. One was raced past a meadow, and the other was raced past a barn. ... Grammatical yet unprocessable sentences such as 1 are called 'garden-path sentences.' Their existence provides an opportunity to investigate the human sentence processing mechanism by studying how and when it fails. The aim of this thesis is to construct a computational model of language understanding which can predict processing difficulty. The data to be modeled are known examples of garden-path and non-garden-path sentences, and other results from psycholinguistics. It is widely believed that there are two distinct loci of computation in sentence processing: syntactic parsing and semantic interpretation. One longstanding controversy is which of these two modules bears responsibility for the immediate resolution of ambiguity. My claim is that it is the latter, and that the syntactic processing module is a very simple device which blindly and faithfully constructs all possible analyses for the sentence up to the current point of processing. The interpretive module serves as a filter, occasionally discarding certain of these analyses which it deems less appropriate for the ongoing discourse than their competitors. This document is divided into three parts. The first is introductory, and reviews a selection of proposals from the sentence processing literature. The second part explores a body of data which has been adduced in support of a theory of structural preferences, one that is inconsistent with the present claim.
I show how the current proposal can be specified to account for the available data, and moreover to predict where structural preference theories will go wrong. The third part is a theoretical investigation of how well the proposed architecture can be realized using current conceptions of linguistic competence. In it, I present a parsing algorithm and a meaning-based ambiguity resolution method. Comment: 128 pages, LaTeX source compressed and uuencoded, figures separate; macros: rotate.sty, lingmacros.sty, psfig.tex. Dissertation, Computer and Information Science Dept., October 199