11 research outputs found
Automatic summarising: factors and directions
This position paper suggests that progress with automatic summarising demands
a better research methodology and a carefully focussed research strategy. In
order to develop effective procedures it is necessary to identify and respond
to the context factors, i.e. input, purpose, and output factors, that bear on
summarising and its evaluation. The paper analyses and illustrates these
factors and their implications for evaluation. It then argues that this
analysis, together with the state of the art and the intrinsic difficulty of
summarising, imply a nearer-term strategy concentrating on shallow, but not
surface, text analysis and on indicative summarising. This is illustrated with
current work, from which a potentially productive research programme can be
developed
Towards Succinct and Relevant Image Descriptions
What does it mean to produce a good description of an image? Is a description good because it correctly identifies all of the objects in the image, because it describes the interesting attributes of the objects, or because it is short, yet informative? Griceâs Cooperative Principle, stated as âMake your contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged â (Grice, 1975), alongside other ideas of pragmatics in communication, have proven useful in thinking about language generation (Hovy, 1987; McKeown et al., 1995). The Cooperative Principle provides one possible framework for thinking about the generation and evaluation of image descriptions.1 The immediate question is whether automatic image description is within the scope of the Cooperative Principle. Consider the task of searching for images using natural language, where the purpose of the exchange is for the user to quickly and accurately find images that match their information needs. In this scenario, the user formulates a complete sentence query to express their needs, e.g. A sheepdog chasing sheep in a field, and initiates an exchange with the system in the form of a sequence of one-shot con-versations. In this exchange, both participants can describe images in natural language, and a successful outcome relies on each participant succinctly and correctly expressing their beliefs about the images. I
Generating multimedia briefings: coordinating language and illustration
AbstractCommunication can be more effective when several media (such as text, speech, or graphics) are integrated and coordinated to present information. This changes the nature of media-specific generation (e.g., language or graphics generation), which must take into account the multimedia context in which it occurs. This paper presents work on coordinating and integrating speech, text, static and animated three-dimensional graphics, and stored images, as part of several systems we have developed at Columbia University. A particular focus of our work has been on the generation of presentations that brief a user on information of interes
Automatic abstracting: a review and an empirical evaluation
The abstract is a fundamental tool in information retrieval. As condensed representations,
they facilitate conservation of the increasingly precious search time and space of scholars, allowing them to manage more effectively an ever-growing deluge of documentation.
Traditionally the product of human intellectual effort, attempts to automate the abstracting
process began in 1958. Two identifiable automatic abstracting techniques emerged which
reflect differing levels of ambition regarding simulation of the human abstracting process,
namely sentence extraction and text summarisation. This research paradigm has recently
diversified further, with a cross-fertilisation of methods. Commercial systems are beginning
to appear, but automatic abstracting is still mainly confined to an experimental arena.
The purpose of this study is firstly to chart the historical development and current state of
both manual and automatic abstracting; and secondly, to devise and implement an empirical
user-based evaluation to assess the adequacy of automatic abstracts derived from sentence
extraction techniques according to a set of utility criteria. [Continues.
Recommended from our members
Floating constraints in lexical choice
Lexical choice is a computationally complex task, requiring a generation system to consider a potentially large number of mappings between concepts and words. Constraints that aid in determining which word is best come from a wide variety of sources, including syntax, semantics, pragmatics, the lexicon, and the underlying domain. Furthermore, in some situations, different constraints come into play early on, while in others, they apply much later. This makes it difficult to determine a systematic ordering in which to apply constraints. In this paper, we present a general approach to lexical choice that can handle multiple, interacting constraints. We focus on the problem of floating constraints, semantic or pragmatic constraints that float, appearing at a variety of different syntactic ranks, often merged with other semantic constraints. This means that multiple content units can be realized by a single surface element, and conversely, that a single content unit can be realized by a variety of surface elements. Our approach uses the Functional Unification Formalism (FUF) to represent a generation lexicon, allowing for declarative and compositional representation of individual constraints
Generating automated meeting summaries
The thesis at hand introduces a novel approach for the generation of abstractive summaries of meetings. While the automatic generation of document summaries has been studied for some decades now, the novelty of this thesis is mainly the application to the meeting domain (instead of text documents) as well as the use of a lexicalized representation formalism on the basis of Frame Semantics. This allows us to generate summaries abstractively (instead of extractively).Die vorliegende Arbeit stellt einen neuartigen Ansatz zur Generierung abstraktiver Zusammenfassungen von Gruppenbesprechungen vor. WĂ€hrend automatische Textzusammenfassungen bereits seit einigen Jahrzehnten erforscht werden, liegt die Neuheit dieser Arbeit vor allem in der AnwendungsdomĂ€ne (Gruppenbesprechungen statt Textdokumenten), sowie der Verwendung eines lexikalisierten ReprĂ€sentationsformulism auf der Basis von Frame-Semantiken, der es erlaubt, Zusammenfassungen abstraktiv (statt extraktiv) zu generieren. Wir argumentieren, dass abstraktive AnsĂ€tze fĂŒr die Zusammenfassung spontansprachlicher Interaktionen besser geeignet sind als extraktive
Feasibility of using citations as document summaries
The purpose of this research is to establish whether it is feasible to use citations as document summaries. People are good at creating and selecting summaries and are generally the standard for evaluating computer generated summaries. Citations can be characterized as concept symbols or short summaries of the document they are citing. Similarity metrics have been used in retrieval and text summarization to determine how alike two documents are. Similarity metrics have never been compared to what human subjects think are similar between two documents. If similarity metrics reflect human judgment, then we can mechanize the selection of citations that act as short summaries of the document they are citing. The research approach was to gather rater data comparing document abstracts to citations about the same document and then to statistically compare those results to several document metrics; frequency count, similarity metric, citation location and type of citation. There were two groups of raters, subject experts and non-experts. Both groups of raters were asked to evaluate seven parameters between abstract and citations: purpose, subject matter, methods, conclusions, findings, implications, readability, andunderstandability. The rater was to identify how strongly the citation represented the content of the abstract, on a five point likert scale. Document metrics were collected for frequency count, cosine, and similarity metric between abstracts and associated citations. In addition, data was collected on the location of the citations and the type of citation. Location was identified and dummy coded for introduction, method, discussion, review of the literature and conclusion. Citations were categorized and dummy coded for whether they refuted, noted, supported, reviewed, or applied information about the cited document. The results show there is a relationship between some similarity metrics and human judgment of similarity.Ph.D., Information Studies -- Drexel University, 200