17 research outputs found

    Using Synchronic and Diachronic Relations for Summarizing Multiple Documents Describing Evolving Events

    Full text link
    In this paper we present a fresh look at the problem of summarizing evolving events from multiple sources. After a discussion concerning the nature of evolving events we introduce a distinction between linearly and non-linearly evolving events. We present then a general methodology for the automatic creation of summaries from evolving events. At its heart lie the notions of Synchronic and Diachronic cross-document Relations (SDRs), whose aim is the identification of similarities and differences between sources, from a synchronical and diachronical perspective. SDRs do not connect documents or textual elements found therein, but structures one might call messages. Applying this methodology will yield a set of messages and relations, SDRs, connecting them, that is a graph which we call grid. We will show how such a grid can be considered as the starting point of a Natural Language Generation System. The methodology is evaluated in two case-studies, one for linearly evolving events (descriptions of football matches) and another one for non-linearly evolving events (terrorist incidents involving hostages). In both cases we evaluate the results produced by our computational systems.Comment: 45 pages, 6 figures. To appear in the Journal of Intelligent Information System

    Developing a corpus of strategic conversation in The Settlers of Catan

    Get PDF
    International audienceWe describe a dialogue model and an implemented annotation scheme for a pilot corpus of annotated online chats concerning bargaining negotiations in the game The Settlers of Catan. We will use this model and data to analyze how conversations proceed in the absence of strong forms of cooperativity, where agents have diverging motives. Here we concentrate on the description of our annotation scheme for negotiation dialogues, illustrated with our pilot data, and some perspectives for future research on the issue

    Splitting Arabic Texts into Elementary Discourse Units

    Get PDF
    International audienceIn this article, we propose the first work that investigates the feasibility of Arabic discourse segmentation into elementary discourse units within the segmented discourse representation theory framework. We first describe our annotation scheme that defines a set of principles to guide the segmentation process. Two corpora have been annotated according to this scheme: elementary school textbooks and newspaper documents extracted from the syntactically annotated Arabic Treebank. Then, we propose a multiclass supervised learning approach that predicts nested units. Our approach uses a combination of punctuation, morphological, lexical, and shallow syntactic features. We investigate how each feature contributes to the learning process. We show that an extensive morphological analysis is crucial to achieve good results in both corpora. In addition, we show that adding chunks does not boost the performance of our system

    Summarization from medical documents: A survey

    No full text
    Objective: The aim of this paper is to survey the recent work in medical documents summarization. Background: During the last decade, documents summarization got increasing attention by the AI research community. More recently it also attracted the interest of the medical research community as well, due to the enormous growth of information that is available to the physicians and researchers in medicine, through the large and growing number of published journals, conference proceedings, medical sites and portals on the World Wide Web, electronic medical records, etc. Methodology: This survey gives first a general background on documents summarization, presenting the factors that summarization depends upon, discussing evaluation issues and describing briefly the various types of summarization techniques. It then examines the characteristics of the medical domain through the different types of medical documents. Finally, it presents and discusses the summarization techniques used so far in the medical domain, referring to the corresponding systems and their characteristics. Discussion and conclusions: The paper discusses thoroughly the promising paths for future research in medical documents summarization. It mainly focuses on the issue of scaling to large collections of documents in various languages and from different media, on personalization issues, on portability to new sub-domains, and on the integration of summarization technology in practical applications. © 2004 Elsevier B.V. All rights reserved

    A Prototype Crowdsourcing Approach for Document Summarization Service

    No full text
    Part IV: ICT and Emerging TechnologiesInternational audienceThis paper proposes a crowdsourcing approach for informative document summarization service. It first captures the task of summarizing a lengthy document as a bi-objective combinatorial optimization problem. One objective function to be minimized is the time to comprehend the summary, and the other one to be maximized is the amount of information content remaining in it. The solution space of the problem is composed of various combinations of candidate condensed elements covering the whole document as a set. Since it is not easy for a computer algorithm to create condensed elements of different lengths which are natural and easy for a human to comprehend, as well as to evaluate the two objective functions for any possible summary, these sub-tasks are crowdsourced to human contributors. The rest of the approach is handled by a computer algorithm. How the approach functions is tested by a laboratory experiment using a pilot system implemented as a web application

    Evaluating Web search result summaries

    No full text
    Abstract. The aim of our research is to produce and assess short summaries to aid users ’ relevance judgements, for example for a search engine result page. In this paper we present our new metric for measuring summary quality based on representativeness and judgeability, and compare the summary quality of our system to that of Google. We discuss the basis for constructing our evaluation methodology in contrast to previous relevant open evaluations, arguing that the elements which make up an evaluation methodology: the tasks, data and metrics, are interdependent and the way in which they are combined is critical to the effectiveness of the methodology. The paper discusses the relationship between these three factors as implemented in our own work, as well as in SUMMAC/MUC/DUC.

    Deep Reinforcement Learning in Strategic Board Game Environments

    No full text
    International audienceIn this paper we propose a novel Deep Reinforcement Learning (DRL) algorithm that uses the concept of “action-dependent state features”, and exploits it to approximate the Q-values locally, employing a deep neural network with parallel Long Short Term Memory (LSTM) components, each one responsible for computing an action-related Q-value. As such, all computations occur simultaneously, and there is no need to employ “target” networks and experience replay, which are techniques regularly used in the DRL literature. Moreover, our algorithm does not require previous training experiences, but trains itself online during game play. We tested our approach in the Settlers Of Catan multi-player strategic board game. Our results confirm the effectiveness of our approach, since it outperforms several competitors, including the state-of-the-art jSettler heuristic algorithm devised for this particular domain
    corecore