12 research outputs found

    Twitter Event Summarization Using Phrase Reinforcement Algorithm and NLP Features

    Abstract: Nowadays, social networking sites deliver news to users faster than newspapers and television. There are many social networking sites, and one of them is Twitter. Twitter allows a large number of users to share and post their views and ideas on any particular event. According to a recent survey, 340 million tweets are sent on Twitter daily on different topics, and only 4% of posts on Twitter contain relevant news data. It is not possible for any human to read all the posts to get meaningful information related to a specific event. One solution to this problem is to apply a summarization technique. In this paper we use an algorithm based on a frequency count technique, along with some NLP features, to summarize an event specified by the user. This automatic summarization algorithm handles the numerous, short, dissimilar, and noisy nature of tweets. We believe our novel approach helps users as well as researchers. DOI: 10.17762/ijritcc2321-8169.15020
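
    The frequency count idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's Phrase Reinforcement algorithm: it only scores each tweet by the average corpus frequency of its words and returns the highest-scoring one. The stopword list and the names `tokenize` and `pick_summary_tweet` are hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "is", "in", "on", "of", "to", "and", "for", "at", "rt"}


def tokenize(tweet: str) -> list[str]:
    """Lowercase the tweet, drop URLs and @mentions, and keep word-like tokens."""
    cleaned = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return [tok for tok in re.findall(r"[a-z0-9']+", cleaned) if tok not in STOPWORDS]


def pick_summary_tweet(tweets: list[str]) -> str:
    """Return the tweet whose words are, on average, most frequent in the collection."""
    counts = Counter(tok for tweet in tweets for tok in tokenize(tweet))

    def score(tweet: str) -> float:
        tokens = tokenize(tweet)
        if not tokens:
            return 0.0
        # Averaging keeps long tweets from winning on length alone.
        return sum(counts[tok] for tok in tokens) / len(tokens)

    return max(tweets, key=score)


tweets = [
    "Earthquake hits city center",
    "Big earthquake hits the city tonight",
    "I love pizza",
]
summary = pick_summary_tweet(tweets)
```

    Averaging the word counts is one possible normalization; a real system would likely add the NLP features the abstract refers to, which are not specified here.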

    Complexity algorithm analysis for edit distance

    Natural Language Processing (NLP) is a method for processing any language. Some of its algorithms are based on edit distance analysis, a process in which statistical calculations between two words or sentences are analyzed. Some of the edit distances used for NLP are Levenshtein, Jaro-Winkler, Soundex, N-grams, and Mahalanobis. The evaluation aims to analyze the processing time of each edit distance when it is calculated for two different words or sentences. The objective of this paper is to evaluate the complexity of each distance based on its processing time.
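
    As a concrete instance of one of the distances listed above, here is a minimal sketch of the standard dynamic-programming Levenshtein distance. It is not taken from the paper; the function name `levenshtein` is an illustrative choice. The two nested loops make the running time proportional to the product of the two input lengths, which is the kind of cost such a complexity evaluation would measure.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions
    needed to turn `a` into `b`. Time O(len(a) * len(b)), memory O(min(len(a), len(b)))."""
    if len(a) < len(b):
        a, b = b, a  # keep the stored row as short as possible

    # previous[j] holds the distance between the processed prefix of `a` and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


distance = levenshtein("kitten", "sitting")
```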

    A Statistical Approach with Syntactic and Semantic Features for Chinese Textual Entailment

    Abstract: Recognizing Textual Entailment (RTE) is a PASCAL/TAC task in which a system processes two text fragments to determine whether the meaning of the hypothesis is entailed by the other text. In this paper, we propose a textual entailment system that uses a statistical approach integrating syntactic and semantic techniques for Recognizing Inference in Text (RITE), using the NTCIR-9 RITE task, and we compare semantic and syntactic features based on their differences. We thoroughly evaluate our approach using subtasks of the NTCIR-9 RITE. Our system achieved 73.28% accuracy on the Chinese Binary-Class (BC) subtask of NTCIR-9 RITE. Thorough experiments with the text fragments provided by the NTCIR-9 RITE task show that the proposed approach can significantly improve system accuracy.
    Sponsorship: IEEE. Citation index: EI. Cooperation type: overseas. Conference type: international. Conference dates: 2012-08-08 to 2012-08-10. Book type: electronic edition. Call for papers: Y. Conference location: Vegas, Nevada, US

    Chinese Textual Entailment with Wordnet Semantic and Dependency Syntactic Analysis

    Chun Tu and Min-Yuh Day (2013), "Chinese Textual Entailment with Wordnet Semantic and Dependency Syntactic Analysis", 2013 IEEE International Workshop on Empirical Methods for Recognizing Inference in Text (IEEE EM-RITE 2013), August 14, 2013, in Proceedings of the IEEE International Conference on Information Reuse and Integration (IEEE IRI 2013), San Francisco, California, USA, August 14-16, 2013, pp. 69-74.
    Abstract: Recognizing Inference in TExt (RITE) is a task for automatically detecting entailment, paraphrase, and contradiction in texts, which addresses major text understanding needs in information access research areas. In this paper, we propose a Chinese textual entailment system that uses Wordnet semantic and dependency syntactic approaches for RITE, using the NTCIR-10 RITE-2 subtask datasets. Wordnet is used to recognize entailment at the lexical level. The dependency syntactic approach is a tree edit distance algorithm applied to the dependency trees of both the text and the hypothesis. We thoroughly evaluate our approach using the NTCIR-10 RITE-2 subtask datasets. Our system achieved 73.28% on the Traditional Chinese Binary-Class (BC) subtask and 74.57% on the Simplified Chinese Binary-Class subtask with the NTCIR-10 RITE-2 development datasets. Thorough experiments with the text fragments provided by the NTCIR-10 RITE-2 subtask showed that the proposed approach can improve the system's overall accuracy.
    Sponsorship: IEEE. Citation index: EI. Conference type: international. Conference dates: 2013-08-14 to 2013-08-16. Book type: electronic edition. Call for papers: Y. Conference location: San Francisco, US

    DCC&U: An Extended Digital Curation Lifecycle Model

    The proliferation of Web, database and social networking technologies has enabled us to produce, publish and exchange digital assets at an enormous rate. This vast amount of information that is either digitized or born-digital needs to be collected, organized and preserved in a way that ensures that our digital assets and the information they carry remain available for future use. Digital curation has emerged as a new inter-disciplinary practice that seeks to set guidelines for disciplined management of information. In this paper we review two recent models for digital curation introduced by the Digital Curation Centre (DCC) and the Digital Curation Unit (DCU) of the Athena Research Centre. We then propose a fusion of the two models that highlights the need to extend the digital curation lifecycle by adding (a) provisions for the registration of usage experience, (b) a stage for knowledge enhancement and (c) controlled vocabularies used by convention to denote concepts, properties and relations. The objective of the proposed extensions is twofold: (i) to provide a more complete lifecycle model for the digital curation domain; and (ii) to provide a stimulus for a broader discussion on the research agenda

    Textual Entailment Using Lexical And Syntactic Similarity


    A Survey of Paraphrasing and Textual Entailment Methods

    Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources.
    Comment: Technical Report, Natural Language Processing Group, Department of Informatics, Athens University of Economics and Business, Greece, 201