5 research outputs found

    An Effective Sentence Ordering Approach For Multi-Document Summarization Using Text Entailment

    With the rapid development of modern technology, the amount of electronically available textual information has grown considerably. Manually summarizing textual information from unstructured sources burdens the user, so a systematic approach is required. Summarization provides the user with a condensed version of the original text, but real-world applications require extended summarization across multiple documents. The main focus of multi-document summarization is sentence ordering and ranking, which arranges the sentences collected from multiple documents to generate a well-organized summary; an improper order of extracted sentences significantly degrades the readability and understandability of the summary. The existing system performs multi-document summarization by combining several preference measures (chronology, probabilistic, precedence, succession, and topical-closeness experts) to calculate preference values between sentences. These approaches to sentence ordering and ranking do not address context-based similarity between sentences, which is essential for effective summarization. The proposed system addresses this issue through a textual entailment expert. This approach builds an entailment model that captures cause and effect between sentences in the documents using a symmetric measure (cosine similarity) and non-symmetric measures (unigram match, bigram match, longest common subsequence, skip-gram match, and stemming). The proposed system efficiently provides the user with a contextual summary that significantly improves the readability and understandability of the final coherent summary.
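The similarity measures this abstract lists can be sketched with plain string operations. The snippet below is a minimal illustration (not the authors' implementation), assuming simple lowercase whitespace tokenization and omitting stemming and skip-grams for brevity:

```python
from collections import Counter
import math

def tokens(s):
    return s.lower().split()

def cosine_sim(a, b):
    """Symmetric measure: cosine similarity over term-frequency vectors."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def ngram_match(a, b, n=1):
    """Non-symmetric measure: fraction of n-grams of `a` also found in `b`."""
    grams = lambda toks: [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    ga, gb = grams(tokens(a)), set(grams(tokens(b)))
    return sum(1 for g in ga if g in gb) / len(ga) if ga else 0.0

def lcs_len(a, b):
    """Length of the longest common token subsequence (dynamic programming)."""
    ta, tb = tokens(a), tokens(b)
    dp = [[0] * (len(tb) + 1) for _ in range(len(ta) + 1)]
    for i, x in enumerate(ta):
        for j, y in enumerate(tb):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

s1 = "the storm damaged the coastal town"
s2 = "the coastal town was damaged by the storm"
print(cosine_sim(s1, s2), ngram_match(s1, s2, 1), lcs_len(s1, s2))
```

Note the asymmetry: `ngram_match(a, b)` asks how much of `a` is covered by `b`, which is the directional flavor an entailment-style measure needs, while cosine similarity scores both sentences identically in either order.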

    Performance of Opinion Summarization towards Extractive Summarization

    Opinion summarization summarizes the opinions in texts, while extractive summarization summarizes texts without considering the opinions they contain. Can opinion summarization be used to produce a better extractive summary? This paper proposes to determine the effectiveness of opinion summarization against extractive text summarization. Sentiment, including emotion that indicates whether a sentence is positive, negative, or neutral, is considered. Sentences with strong sentiment, either positive or negative, are deemed important in text summarization for capturing the sentiments in a story text. Thus, a comparative study is conducted on two types of summarization: opinion summarization using the proposed method, which uses two different sentiment lexicons (VADER and SentiWordNet), against extractive summarization using established methods (Luhn, Latent Semantic Analysis (LSA), and LexRank). An experiment was performed on 20 news stories, comparing summaries generated by the proposed opinion summarization method against summaries generated by the established extractive summarization methods. In the experiment, the VADER sentiment analyzer produced the best score of 0.51 when evaluated against the LSA method using the ROUGE-1 metric. This implies that opinion summarization converges with extractive summarization.
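The core idea, selecting sentences whose sentiment is strongest in either polarity, can be sketched in a few lines. This is not the paper's method: the tiny lexicon below is a hypothetical stand-in for VADER or SentiWordNet, and the scoring is a deliberately simple average over word polarities:

```python
# Illustrative mini-lexicon standing in for VADER/SentiWordNet (hypothetical values).
TINY_LEXICON = {"tragic": -0.8, "devastating": -0.9, "celebrated": 0.7,
                "joyful": 0.8, "reported": 0.0, "said": 0.0}

def sentiment(sentence):
    """Average word polarity; words absent from the lexicon count as neutral."""
    scores = [TINY_LEXICON.get(w.strip(".,").lower(), 0.0) for w in sentence.split()]
    return sum(scores) / len(scores) if scores else 0.0

def opinion_summary(sentences, k=2):
    """Keep the k sentences with the strongest sentiment (positive OR negative),
    then restore the original story order for readability."""
    ranked = sorted(sentences, key=lambda s: abs(sentiment(s)), reverse=True)[:k]
    return [s for s in sentences if s in ranked]

story = [
    "The mayor said the budget was approved.",
    "Residents celebrated the joyful reopening of the park.",
    "A devastating flood was reported downstream.",
    "Officials said repairs will begin next week.",
]
print(opinion_summary(story, k=2))
```

Ranking by absolute polarity is what makes this an opinion-driven selection: neutral reporting sentences score near zero and drop out, regardless of their position or length.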

    Bertsobot: behaviors for human-robot communication and interaction

    216 p. Bertsobot: Robot Behaviors in Human-Robot Communication and Interaction. The primary goal of our research is to develop an autonomous robot that demonstrates the ability to improvise bertsos (Basque improvised verses). Its task is to receive spoken instructions for composing a verse, process them, compose and sing the most suitable verse it can, and show with its body the on-stage expressiveness of bertsolaris. The robot bertsolari aims to open a path toward advancing human-robot interaction and communication, using natural language in two-way robot-human communication.

    Leveraging Semantic Annotations for Event-focused Search & Summarization

    Today, in the Big Data era, overwhelming amounts of textual information spread across different sources, with a high degree of redundancy, have made it hard for a consumer to retrospect on past events. A plausible solution is to link semantically similar information contained across the different sources to enforce a structure, thereby providing multiple access paths to relevant information. Keeping this larger goal in view, this work uses Wikipedia and online news articles as two prominent yet disparate information sources to address the following three problems:
    • We address a linking problem to connect Wikipedia excerpts to news articles by casting it as an IR task. Our novel approach integrates time, geolocations, and entities with text to identify relevant documents that can be linked to a given excerpt.
    • We address an unsupervised extractive multi-document summarization task to generate a fixed-length event digest that facilitates efficient consumption of the information contained within a large set of documents. Our novel approach proposes an ILP for global inference across the text, time, geolocations, and entities associated with the event.
    • To estimate the temporal focus of short event descriptions, we present a semi-supervised approach that leverages redundancy within a longitudinal news collection to estimate accurate probabilistic time models.
    Extensive experimental evaluations demonstrate the effectiveness and viability of our proposed approaches towards achieving the larger goal.
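The fixed-length event digest described above is a global selection problem: choose sentences that together cover as many distinct event aspects (entities, dates, locations) as possible under a length budget. The sketch below is a hypothetical illustration of that objective only; it uses brute-force search over subsets, whereas the dissertation formulates and solves an actual ILP:

```python
# Toy illustration of the event-digest objective: maximize coverage of distinct
# event aspects subject to a length budget. Brute force over subsets stands in
# for the ILP solver; data and aspect sets below are invented for the example.
from itertools import combinations

def best_digest(sentences, budget):
    """sentences: list of (text, aspects, length) where aspects is a set."""
    best, best_cov = [], -1
    for r in range(len(sentences) + 1):
        for combo in combinations(sentences, r):
            if sum(s[2] for s in combo) > budget:
                continue  # violates the length constraint
            cov = len(set().union(*(s[1] for s in combo)))
            if cov > best_cov:
                best, best_cov = list(combo), cov
    return [s[0] for s in best]

docs = [
    ("Quake hits Valdivia on May 22.", {"Valdivia", "May 22", "quake"}, 6),
    ("Tsunami follows the quake.", {"tsunami", "quake"}, 4),
    ("Valdivia quake strikes.", {"Valdivia", "quake"}, 3),
]
print(best_digest(docs, budget=10))
```

The global view is the point: the shortest sentence is individually attractive but adds no aspect the first sentence lacks, so a jointly optimized digest prefers the longer, complementary pair.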