49 research outputs found

    Query-based extracting: how to support the answer?

    Human-made query-based summaries commonly contain information not explicitly asked for. They answer the user query, but also provide supporting information. To find this supporting information in the source text, a graph is used to model the strength and type of relations between sentences of the query and the document cluster, based on various features. The resulting extracts rank second in overall readability in the DUC 2006 evaluation. Better question answering methods are the key to also improving the content-based evaluation results.

    Query-Based Summarization using Rhetorical Structure Theory

    Research on Question Answering is focused mainly on classifying the question type and finding the answer. Presenting the answer in a way that suits the user’s needs has received little attention. This paper shows how existing question answering systems, which aim at finding precise answers to questions, can be improved by exploiting summarization techniques to extract more than just the answer from the document in which the answer resides. This is done with a graph search algorithm that looks for relevant sentences in the discourse structure, which is represented as a graph. Rhetorical Structure Theory (RST) is used to create the graph representation of a text document. The output is an extensive answer, which not only answers the question, but also gives the user an opportunity to assess the accuracy of the answer (is this what I am looking for?) and to find additional information that is related to the question and may satisfy an information need. This has been implemented in a working multimodal question answering system, where it operates with two independently developed question answering modules.

    Discourse oriented summarization

    The meaning of text appears to be tightly related to intentions and circumstances. Context sensitivity of meaning is addressed by theories of discourse structure. Few attempts have been made to exploit text organization in summarization. This thesis is an exploration of what knowledge of discourse structure can do for content selection as a subtask of automatic summarization, and query-based summarization in particular. Query-based summarization is the task of answering an arbitrary user query or question by using content from potentially relevant sources. This thesis presents a general framework for discourse oriented summarization, relying on graphs to represent semantic relations in discourse, and redundancy as a special type of semantic relation. Semantic relations occur on several levels of text analysis (query-relevance, coherence, layout, etc.), and a broad range of textual features may be required to detect them. The graph-based framework facilitates combining multiple features into an integrated semantic model of the documents to summarize. Recognizing redundancy and entailment relations between text passages is particularly important when a summary is generated of multiple documents, e.g. to avoid including redundant content in a summary. For this reason, I pay particular attention to recognizing textual entailment. Within this framework, a three-fold evaluation is performed to evaluate different aspects of discourse oriented summarization. The first is a user study, measuring the effect on user appreciation of using a particular type of knowledge for query-based summarization. In this study, three presentation strategies are compared: summarization using the rhetorical structure of the source, a baseline summarization method which uses the layout of the source, and a baseline presentation method which uses no summarization but just a concise answer to the query. 
Results show that knowledge of the rhetorical structure not only helps to provide the necessary context for the user to verify that the summary addresses the query adequately, but also to increase the amount of relevant content. The second evaluation is a comparison of implementations of the graph-based framework which are capable of fully automatic summarization. The two variables in the experiment are the set of textual features used to model the source and the algorithm used to search a graph for relevant content. The features are based on cosine similarity, and are realized as graph representations of the source. The graph search algorithms are inspired by existing algorithms in summarization. The quality of summaries is measured using the Rouge evaluation toolkit. The best performer would have ranked first (Rouge-2) or second (Rouge-SU4) if it had participated in the DUC 2005 query-based summarization challenge. The third study is an evaluation in the context of the DUC 2006 summarization challenge, which includes readability measurements as well as various content-based evaluation metrics. The evaluated automatic discourse oriented summarization system is similar to the one described above, but uses additional features, i.e. layout and textual entailment. The system performed well on readability at the cost of content-based scores which were well below the scores of the highest ranking DUC 2006 participant. This indicates a trade-off between readable, coherent content and useful content, an issue yet to be explored. Previous research implies that theories of text organization generalize well to multimedia. This suggests that the discourse oriented summarization framework applies to summarizing multimedia as well, provided sufficient knowledge of the organization of the (multimedia) source documents is available. 
The last study in this thesis investigates the applicability of structural relations in multimedia for generating picture-illustrated summaries, by relating summary content to picture-associated text (i.e. captions or surrounding paragraphs). Results suggest that captions are the more suitable annotation for selecting appropriate pictures. Compared to manual illustration, automatically selected pictures are rated similarly when the manually selected picture is mainly decorative.
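The graph-based idea in this abstract (cosine-similarity features realized as a graph over sentences, then searched for query-relevant content) can be illustrated with a minimal sketch. This is not the thesis's system: the bag-of-words vectors, the `threshold` and `damping` parameters, the one-step neighbour scoring, and all function names are assumptions chosen for illustration only.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    # Assumption: lowercase word counts stand in for the thesis's richer features.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(sentences, threshold=0.1):
    # Undirected graph: an edge links two sentences whose similarity
    # reaches the (hypothetical) threshold; the weight is the similarity.
    vectors = [bag_of_words(s) for s in sentences]
    graph = {i: {} for i in range(len(sentences))}
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            weight = cosine(vectors[i], vectors[j])
            if weight >= threshold:
                graph[i][j] = weight
                graph[j][i] = weight
    return graph


def rank_sentences(query, sentences, damping=0.5):
    # Score = direct query similarity + damped support from graph neighbours,
    # so sentences related to query-relevant sentences are promoted too.
    query_vec = bag_of_words(query)
    direct = [cosine(query_vec, bag_of_words(s)) for s in sentences]
    graph = build_graph(sentences)
    scores = [
        direct[i] + damping * sum(w * direct[j] for j, w in graph[i].items())
        for i in range(len(sentences))
    ]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    docs = [
        "Aspirin reduces fever and pain.",
        "Fever is a rise in body temperature.",
        "The stock market fell sharply today.",
    ]
    for index in rank_sentences("What reduces fever?", docs):
        print(docs[index])
```

The neighbour term is what lets a sentence that only supports the answer (here, the definition of fever) outrank unrelated content, which mirrors the motivation of including supporting information alongside the direct answer.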

    Image Retrieval Supports Multimedia Authoring

    Normalized Alignment of Dependency Trees for Detecting Textual Entailment

    In this paper, we investigate the usefulness of normalized alignment of dependency trees for entailment prediction. Overall, our approach yields an accuracy of 60% on the RTE2 test set, which is a significant improvement over the baseline. Results vary substantially across the different subsets, with peak performance on the summarization data. We conclude that normalized alignment is useful for detecting textual entailments, but a robust approach will probably need to include additional sources of information.

    Illustrating answers: an evaluation of automatically retrieved illustrations of answers to medical questions

    In this paper we discuss and evaluate a method for automatic text illustration, applied to answers to medical questions. Our method for selecting illustrations is based on the idea that similarities between the answers and picture-related text (the picture’s caption or the section/paragraph that includes the picture) can be used as evidence that the picture would be appropriate to illustrate the answer. In a user study, participants rated answer presentations consisting of a textual component and a picture. The textual component was a manually written reference answer; the picture was automatically retrieved by measuring the similarity between the text and either the picture’s caption or its section. The caption-based selection method resulted in more attractive presentations than the section-based method; the caption-based method was also more consistent in selecting informative pictures and showed a greater correlation between user-rated informativeness and the confidence of relevance of the system. When compared to manually selected pictures, we found that automatically selected pictures were rated similarly to decorative pictures, but worse than informative pictures.
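The selection idea described in this abstract (pick the picture whose caption or section text is most similar to the answer, and treat the similarity as a confidence of relevance) might look roughly like the sketch below. The abstract does not specify the similarity measure, so bag-of-words cosine similarity is an assumption here, as are the `select_picture` function, the dictionary keys, and the sample data.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    # Assumption: plain lowercase word counts; the paper's features may differ.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_picture(answer, pictures, field="caption"):
    # Compare the answer with each picture's associated text ("caption" or
    # "section") and return (picture id, similarity used as confidence).
    answer_vec = bag_of_words(answer)
    best_id, best_score = None, 0.0
    for picture in pictures:
        score = cosine(answer_vec, bag_of_words(picture[field]))
        if score > best_score:
            best_id, best_score = picture["id"], score
    return best_id, best_score


if __name__ == "__main__":
    # Hypothetical picture collection, for illustration only.
    pictures = [
        {"id": "heart.png",
         "caption": "Diagram of the human heart and its valves",
         "section": "Anatomy of the circulatory system"},
        {"id": "lung.png",
         "caption": "X-ray of the lungs",
         "section": "Imaging of the respiratory system"},
    ]
    answer = "The heart has four valves that control blood flow."
    print(select_picture(answer, pictures, field="caption"))
```

Switching `field` between `"caption"` and `"section"` corresponds to the two selection methods compared in the user study.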

    Production and evaluation of (multimodal) answers to medical questions

    This paper describes two experiments carried out to investigate the production and evaluation of multimodal answer presentations in the context of a medical question answering system. In a production experiment, participants had to produce answers to different types of questions. The results show that about one in four of the produced answers used multiple media. In an evaluation experiment, users had to evaluate different types of multimodal answer presentations. Answers with an informative visual were evaluated as more informative and more attractive than answers with a merely illustrative visual.

    Kyoto: An Integrated System for Specific Domain WSD

    This document describes the preliminary release of the integrated Kyoto system for specific domain WSD. The system uses concept miners (Tybots) to extract domain-related terms and produces a domain-related thesaurus, followed by knowledge-based WSD based on WordNet graphs (UKB). The resulting system can be applied to any language with a lexical knowledge base, and is based on publicly available software and resources. Our participation in SemEval task #17 focused on producing running systems for all languages in the task, and we attained good results in all except Chinese. Due to the time constraints of the competition, the system is still under development, and we expect results to improve in the near future.

    Proposed business model for a breakfast food truck at a private university in Chiclayo, 2016

    The aim of this work is to establish a business model for a breakfast food truck at a private university in Chiclayo. The research methodology is qualitative and exploratory, grounded in an inductive process (explore, describe, and then generate theoretical perspectives); that is, it moves from the particular to the general. This methodology makes it possible to gather information from interviews with the university community. The research seeks to gauge acceptance of the breakfast food truck model and is based on the Lean Canvas model, developed in the book Running Lean by Ash Maurya, which offers an approach of nine (9) dimensions to take into account in order to achieve a successful business model. The resulting value proposition consists of selling healthy products that help promote the quality and well-being of our customers' health. Breakfasts will therefore be made from fruit, Andean cereals, and freshly prepared sandwiches, served in biodegradable containers and meeting health standards. There will also be a variety of products on offer so that customers can choose, and fast, personalized service will be provided in order to meet one of the aspects that customers value.