956 research outputs found

    APRIORI ALGORITHM APPROACH FOR AUTOMATIC TEXT PROCESSING AND GENERIC-BASED SUMMARIZATION SYSTEM

    Text processing has always existed in various forms. It makes voluminous text easily digestible, offers a brief and quick overview of the subject matter, and may provide critical context analysis to the reader. With the growth of digital articles in the form of news, blogs, wikis, etc., there is a serious need for a text processor that can adequately summarize an article or document for the reader. Such a processor takes away the effort needed to read, assimilate, and create summaries manually. This research paper proposes a system that provides a unique opportunity for developing a core set text summarization system using Apriori algorithm techniques to perform binary association rule mining. The system also provides a means of storing the automatic generic-based summaries for future reference and requirements.
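
The abstract does not say how Apriori is applied to the text, so the following is only a minimal sketch of the general Apriori frequent-itemset step, under the assumption that each sentence is treated as a transaction whose items are its words. The function name `apriori`, the `min_support` parameter, and the sample sentences are illustrative and are not taken from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count support of the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical input: each sentence reduced to a set of content words.
sentences = [
    {"text", "summary", "system"},
    {"text", "summary"},
    {"text", "system"},
    {"summary", "apriori"},
]
frequent = apriori(sentences, min_support=2)
```

In a summarizer built along these lines, word sets that co-occur in many sentences could be used to score sentences for inclusion in the summary; that scoring step is not specified in the abstract and is left out here.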

    Symbolic and Visual Retrieval of Mathematical Notation using Formula Graph Symbol Pair Matching and Structural Alignment

    Large data collections containing millions of math formulae in different formats are available online. Retrieving math expressions from these collections is challenging. We propose a framework for retrieval of mathematical notation using symbol pairs extracted from visual and semantic representations of mathematical expressions on the symbolic domain for retrieval of text documents. We further adapt our model for retrieval of mathematical notation in images and lecture videos. Graph-based representations are used in each modality to describe math formulas. For symbolic formula retrieval, where the structure is known, we use symbol layout trees and operator trees. For image-based formula retrieval, since the structure is unknown, we use a more general Line of Sight graph representation. Paths in these graphs define symbol pair tuples that are used as the entries of our inverted index of mathematical notation. Our retrieval framework uses a three-stage approach: a fast selection of candidates as the first layer; a more detailed matching algorithm with similarity metric computation in the second stage; and finally, when relevance assessments are available, an optional third layer that uses linear regression over multiple similarity scores to estimate relevance for final re-ranking. Our model has been evaluated using large collections of documents, and preliminary results are presented for videos and cross-modal search. The proposed framework can be adapted for other domains, such as chemistry or technical diagrams, where two visually similar elements from a collection are usually related to each other.
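
The symbol-pair indexing and first-stage candidate selection described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the tree encoding `(symbol, [(edge_label, subtree), ...])`, the edge labels `"n"` (next on the baseline) and `"a"` (above, as in a superscript), and the function names are hypothetical and are not the authors' implementation. The second and third stages (detailed matching and regression-based re-ranking) are not covered.

```python
from collections import Counter, defaultdict


def symbol_pairs(tree):
    """Return (ancestor, descendant, path) tuples for a toy layout tree."""
    pairs = []

    def walk(node, ancestors):
        symbol, children = node
        # Pair the current symbol with every ancestor on the path to it.
        for anc_symbol, path in ancestors:
            pairs.append((anc_symbol, symbol, path))
        for label, child in children:
            # Extend each ancestor's path and add the current symbol itself.
            extended = [(a, p + label) for a, p in ancestors]
            extended.append((symbol, label))
            walk(child, extended)

    walk(tree, [])
    return pairs


def build_index(formulas):
    """Inverted index: symbol-pair tuple -> set of formula ids."""
    index = defaultdict(set)
    for fid, tree in formulas.items():
        for pair in symbol_pairs(tree):
            index[pair].add(fid)
    return index


def candidates(index, query_tree, top_k=10):
    """First-stage selection: rank formulas by number of shared pairs."""
    hits = Counter()
    for pair in set(symbol_pairs(query_tree)):
        for fid in index.get(pair, ()):
            hits[fid] += 1
    return hits.most_common(top_k)


# Hypothetical collection of three formulas.
formulas = {
    "f1": ("x", [("a", ("2", [])),
                 ("n", ("+", [("n", ("y", []))]))]),   # x^2 + y
    "f2": ("x", [("n", ("+", [("n", ("y", []))]))]),   # x + y
    "f3": ("z", [("a", ("2", []))]),                   # z^2
}
index = build_index(formulas)
ranked = candidates(index, formulas["f1"], top_k=2)
```

Querying with `x^2 + y` ranks `f1` first because it shares all four pairs with the query, followed by `f2`, which shares the three baseline pairs; `f3` shares none and is never retrieved.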