976 research outputs found

    Generic Text Summarization for Turkish

    Full text link

    A comparison of feature and semantic-based summarization algorithms for Turkish

    Get PDF
    Akyokuş, Selim (Dogus Author) -- Conference full title: International Symposium on Innovations in Intelligent Systems and Applicaitons, 21-24June 2010, Kayseri & Cappadocia,TURKEY.In this paper we analyze the performances of a feature-based and two semantic-based text summarization algorithms on a new Turkish corpus. The feature-based algorithm uses the statistical analysis of paragraphs, sentences, words and formal clues found in documents, whereas the two semanticbased algorithms employ Latent Semantic Analysis (LSA) approach which enables the selection of the most important sentences in a semantic way. Performance evaluation is conducted by comparing automatically generated summaries with manual summaries generated by a human summarizer. This is the first study that applies LSA based algorithms to Turkish text summarization and its results are promising

    Text Summarization Techniques: A Brief Survey

    Get PDF
    In recent years, there has been a explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods.Comment: Some of references format have update

    Natural language processing

    Get PDF
    Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of www and digital libraries ; and (iv) evaluation of NLP systems

    A Novel ILP Framework for Summarizing Content with High Lexical Variety

    Full text link
    Summarizing content contributed by individuals can be challenging, because people make different lexical choices even when describing the same events. However, there remains a significant need to summarize such content. Examples include the student responses to post-class reflective questions, product reviews, and news articles published by different news agencies related to the same events. High lexical diversity of these documents hinders the system's ability to effectively identify salient content and reduce summary redundancy. In this paper, we overcome this issue by introducing an integer linear programming-based summarization framework. It incorporates a low-rank approximation to the sentence-word co-occurrence matrix to intrinsically group semantically-similar lexical items. We conduct extensive experiments on datasets of student responses, product reviews, and news documents. Our approach compares favorably to a number of extractive baselines as well as a neural abstractive summarization system. The paper finally sheds light on when and why the proposed framework is effective at summarizing content with high lexical variety.Comment: Accepted for publication in the journal of Natural Language Engineering, 201
    corecore