75 research outputs found

    Multi-document Summarization System Using Rhetorical Information

    Over the past 20 years, research in automated text summarization has grown significantly in the field of natural language processing. The massive availability of scientific and technical information on the Internet, including journals, conferences, and news articles, has attracted the interest of various groups of researchers working in text summarization. These researchers include linguists, biologists, database researchers, and information retrieval experts. However, because the information available on the web is ever expanding, reading the sheer volume of information is a significant challenge. To deal with this volume of information, users need appropriate summaries to help them manage their information needs more efficiently. Although many automated text summarization systems have been proposed in the past twenty years, none of these systems has incorporated the use of rhetoric. To date, most automated text summarization systems have relied only on statistical approaches. These approaches do not take into account other features of language such as antimetabole and epanalepsis. Our hypothesis is that rhetoric can provide this type of additional information. This thesis addresses these issues by investigating the role of rhetorical figuration in detecting the salient information in texts. We show that automated multi-document summarization can be improved using metrics based on rhetorical figuration. A corpus of speeches by different U.S. presidents has been created to test our proposed multi-document summarization system. It includes campaign, State of the Union, and inaugural speeches. Various evaluation metrics have been used to test and compare the summaries produced by our proposed system and by other systems.
Our proposed multi-document summarization system using rhetorical figures improves the produced summaries and outperforms the MEAD system in most cases, especially for antimetabole, polyptoton, and isocolon. Overall, the results of our system are promising and lead to future progress on this research

    Collocation in Rhetorical Figures: A Case Study in Parison, Epanaphora and Homoioptoton

    This paper is a pilot study on the collocation of rhetorical figures, that is, the occurrence of more than one figure in a single instance. It examines examples from rhetorical figure handbooks, which look at figures only individually. Detection algorithms for parison, epanaphora, and homoioptoton are developed and run against the handbook examples to check for collocation. The findings suggest that figures of parallelism are more cognitively salient than lexical repetition, but that marked figures like antimetabole, polysyndeton, and asyndeton are more salient than figures of parallelism
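As a rough illustration of what a detector for one of these figures might look like, the sketch below flags epanaphora, the repetition of a word at the start of successive clauses. It is not the paper's algorithm: the function names (`split_clauses`, `first_word`, `find_epanaphora`), the punctuation-based clause splitting, and the single-word comparison are all simplifying assumptions made here for illustration.

```python
import re


def split_clauses(text):
    """Split text into rough clauses on punctuation (an assumed heuristic)."""
    parts = re.split(r"[.;:!?,]+", text)
    return [p.strip() for p in parts if p.strip()]


def first_word(clause):
    """Return the lowercased first word of a clause, or "" if none is found."""
    match = re.match(r"[A-Za-z']+", clause)
    return match.group(0).lower() if match else ""


def find_epanaphora(text, min_run=2):
    """Return (word, run_length) for each run of consecutive clauses
    that open with the same word."""
    clauses = split_clauses(text)
    runs = []
    i = 0
    while i < len(clauses):
        word = first_word(clauses[i])
        j = i + 1
        # Extend the run while the following clauses start with the same word.
        while j < len(clauses) and word and first_word(clauses[j]) == word:
            j += 1
        if j - i >= min_run:
            runs.append((word, j - i))
        i = j
    return runs


if __name__ == "__main__":
    sample = ("We shall fight on the beaches, we shall fight on the landing "
              "grounds, we shall fight in the fields.")
    print(find_epanaphora(sample))
```

A real detector would likely need proper clause segmentation and multi-word matching; checking for collocation would then amount to running several such detectors over the same passage and comparing where their matches overlap.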

    Argumentative zoning information extraction from scientific text

    Let me tell you, writing a thesis is not always a barrel of laughs, and strange things can happen, too. For example, at the height of my thesis paranoia, I had a recurrent dream in which my cat Amy gave me detailed advice on how to restructure the thesis chapters, which was awfully nice of her. But I also had a lot of human help throughout this time, whether things were going fine or berserk. Most of all, I want to thank Marc Moens: I could not have had a better or more knowledgeable supervisor. He always took time for me, however busy he might have been, reading chapters thoroughly in two days. He had both the calmness of mind to give me lots of freedom in research, and the right judgement to guide me away, tactfully but determinedly, from the occasional catastrophe or other waiting along the way. He was great fun to work with and also became a good friend. My work has profited from the interdisciplinary, interactive and enlightened atmosphere at the Human Communication Centre and the Centre for Cognitive Science (which is now called something else). The Language Technology Group was a great place to work in, as my research was grounded in practical applications develope

    A Performance Guide to Kurt Erickson’s Song Cycle Here, Bullet

    The purpose of this document is to supply a comprehensive performer’s guide to American composer Kurt Erickson’s Here, Bullet, a song cycle consisting of four songs for baritone and piano set to the poetry of American poet Brian Turner. Additionally, an overview of its unique consortium-based commissioning process is included to examine the entrepreneurial nature of its creation. Here, Bullet focuses on the soldier’s interaction with the bullet, suicide, foreign lands, and deployment in Iraq. The text comes from a book of poetry, from which the song cycle takes its name, written during Turner’s yearlong deployment to Iraq with the United States Army. The setting of the four pieces engages the singer and the listener in a visceral soundscape that draws out the obvious and underlying conflict in the poetry and presents a stark look at the cold realities of war. The baritone voice is used in all its facets and colors, and the piano acts as support, atmosphere, and an emotional character throughout the cycle. Unique to the Here, Bullet song cycle is its commissioning consortium created by Erickson. Instead of having a single commissioner, Erickson disseminated music and compositional insight to different baritones over social media in exchange for promised performances within a year of its creation. With singers given agency over performance and advertising, there are dozens of worldwide performances of the cycle throughout the season after its completion in August 2019

    A New Model of Interpreting Modified Strophic Design: Brahms’s Late Viennese Solo Lieder

    The study of the interaction between musical and text-based elements in songs has received a great amount of attention in recent years. Although numerous researchers have contributed to the study of Brahms’s Lieder, more work remains to explore Brahms’s various compositional techniques of modification and to reveal the music-text relationship and performance implications in his strophic songs. Since the majority of Brahms’s modified strophic songs were composed during his later Viennese period (1875–97), this dissertation offers a thorough analysis of Brahms’s twenty-eight late strophic solo Lieder and develops a new formal model to categorize them as one of four types of modified strophic form (hereafter MSF): Type-1 MSF: Slight Modifications; Type-2 MSF: Changes to Phrase Rhythm; Type-3 MSF: Changes of Key; and Type-4 MSF: Significant Modifications. This dissertation employs varied analytical methods and approaches (e.g., hypermetrical reconstructions and voice-leading reductions) to establish four primary strategies by which to interpret Brahms’s modified strophic design. To integrate the meaning of the poem more closely with Brahms’s musical setting, I suggest that performers should display different emphases and interpretations to reflect Brahms’s changes to melody, accompaniment, phrase rhythm, and harmonization. Thus, my analysis attempts to motivate researchers to rethink and connect Brahms’s solo Lieder with four areas of music-theoretical attention: analysis and performance, rhythm and meter, large-scale organization, and music-text relations. This in-depth exploration into Brahms’s compositional tendencies during his later Viennese period will facilitate future research into additional solo modified strophic Lieder of both Brahms and his contemporaries

    Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation

    Peer reviewe

    On the Principles of Evaluation for Natural Language Generation

    Natural language processing is concerned with the ability of computers to understand natural language texts, which is, arguably, one of the major bottlenecks in the course of chasing the holy grail of general Artificial Intelligence. Given the unprecedented success of deep learning technology, the natural language processing community has been almost entirely in favor of practical applications, with state-of-the-art systems emerging and competing for human-parity performance at an ever-increasing pace. For that reason, fair and adequate evaluation and comparison, responsible for ensuring trustworthy, reproducible and unbiased results, have long fascinated the scientific community, not only in natural language but also in other fields. A popular example is the ISO-9126 evaluation standard for software products, which outlines a wide range of evaluation concerns, such as cost, reliability, scalability, security, and so forth. The European project EAGLES-1996, being the acclaimed extension to ISO-9126, depicted the fundamental principles specifically for evaluating natural language technologies, which underpin succeeding methodologies in the evaluation of natural language. Natural language processing encompasses an enormous range of applications, each with its own evaluation concerns, criteria and measures. This thesis cannot hope to be comprehensive but particularly addresses evaluation in natural language generation (NLG), which is, arguably, one of the most human-like natural language applications. In this context, research on quantifying day-to-day progress with evaluation metrics lays the foundation of the fast-growing NLG community. However, previous works have failed to provide high-quality metrics in multiple scenarios, such as evaluating long texts and evaluating when human references are not available, and, more prominently, these studies are limited in scope, given the lack of a holistic view of principled NLG evaluation.
In this thesis, we aim for a holistic view of NLG evaluation from three complementary perspectives, driven by the evaluation principles in EAGLES-1996: (i) high-quality evaluation metrics, (ii) rigorous comparison of NLG systems for properly tracking the progress, and (iii) understanding evaluation metrics. To this end, we identify the current state of challenges derived from the inherent characteristics of these perspectives, and then present novel metrics, rigorous comparison approaches, and explainability techniques for metrics to address the identified issues. We hope that our work on evaluation metrics, system comparison and explainability for metrics inspires more research towards principled NLG evaluation, and contributes to the fair and adequate evaluation and comparison in natural language processing

    Automatic Question Generation to Support Reading Comprehension of Learners - Content Selection, Neural Question Generation, and Educational Evaluation

    Simply reading texts passively without actively engaging with their content is suboptimal for text comprehension since learners may miss crucial concepts or misunderstand essential ideas. In contrast, engaging learners actively by asking questions fosters text comprehension. However, educational resources frequently lack questions. Textbooks often contain only a few at the end of a chapter, and informal learning resources such as Wikipedia lack them entirely. Thus, in this thesis, we study to what extent questions about educational science texts can be automatically generated, tackling two research questions. The first question concerns selecting learning-relevant passages to guide the generation process. The second question investigates the generated questions' potential effects and applicability in reading comprehension scenarios. Our first contribution improves the understanding of neural question generation's quality in education. We find that the generators' high linguistic quality transfers to educational texts but that they require guidance by educational content selection. Consequently, we study multiple educational context and answer selection mechanisms. In our second contribution, we propose novel context selection approaches which target question-worthy sentences in texts. In contrast to previous works, our context selectors are guided by educational theory. The proposed methods perform competitively with related work while operating with educationally motivated decision criteria that are easier for educational experts to understand. The third contribution addresses answer selection methods to guide neural question generation with expected answers. Our experiments highlight the need for educational corpora for the task. Models trained on noneducational corpora do not transfer well to the educational domain. Given this discrepancy, we propose a novel corpus construction approach.
It automatically derives educational answer selection corpora from textbooks. We verify the approach's usefulness by showing that neural models trained on the constructed corpora learn to detect learning-relevant concepts. In our last contribution, we use the insights from the previous experiments to design, implement, and evaluate an automatic question generator for educational use. We evaluate the proposed generator intrinsically with an expert annotation study and extrinsically with an empirical reading comprehension study. The two evaluation scenarios provide a nuanced view of the generated questions' strengths and weaknesses. Expert annotations attribute an educational value to roughly 60 % of the questions but also reveal various ways in which the questions still fall short of the quality experts desire. Furthermore, the reader-based evaluation indicates that the proposed educational question generator increases learning outcomes compared to a no-question control group. In summary, the results of the thesis improve the understanding of the content selection tasks in educational question generation and provide evidence that it can improve reading comprehension. As such, the proposed approaches are promising tools for authors and learners to promote active reading and thus foster text comprehension