

APRIORI ALGORITHM APPROACH FOR AUTOMATIC TEXT PROCESSING AND GENERIC-BASED SUMMARIZATION SYSTEM

Authors: Imeh U., Osang F. B.
Publication venue: Faculty of Engineering and Technology, Ladoke Akintola University of Technology, Ogbomoso, Nigeria
Publication date: 19/03/2019

Text processing has always existed in various forms. It makes voluminous text easily digestible, offers a brief and quick overview of the subject matter, and may provide critical context analysis to the reader. With the growth of digital articles in the form of news, blogs, wikis, etc., there is a serious need for a text processor that can adequately summarize an article or document for the reader. Such a tool removes the effort needed to read, assimilate, and create summaries manually. This research paper proposes a system that provides a unique opportunity for developing a core set text summarization system using Apriori algorithm techniques to perform binary association rule mining. The system also provides a means of storing the automatically generated generic-based summaries for future reference and requirements.

LAUTECH Journal of Engineering and Technology (LAUJET)
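
The abstract above names the Apriori algorithm but does not describe how it is applied to sentences. The following is a minimal sketch under stated assumptions, not the authors' system: each sentence is treated as a transaction of lowercase whitespace-separated words, frequent word sets are mined with a plain Apriori join-and-prune loop, and sentences are ranked by how many frequent word sets they contain. The function names `apriori` and `summarize`, the `min_support` threshold, and the scoring rule are all hypothetical choices made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def summarize(sentences, min_support=2, max_sentences=2):
    """Keep the sentences that contain the most frequent word sets.

    Assumption for illustration only: a sentence's score is the number of
    frequent itemsets that are subsets of its word set.
    """
    tokenized = [set(s.lower().split()) for s in sentences]
    frequent = apriori(tokenized, min_support)
    scores = [sum(1 for itemset in frequent if itemset <= words)
              for words in tokenized]
    # Highest score first; ties keep the earlier sentence.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return [sentences[i] for i in chosen]
```

A real summarizer would likely add stop-word removal and punctuation handling before mining; they are left out here to keep the Apriori step visible.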

Symbolic and Visual Retrieval of Mathematical Notation using Formula Graph Symbol Pair Matching and Structural Alignment

Author: Davila Castellanos Kenny
Publication venue: RIT Scholar Works
Publication date: 01/07/2017

Large data collections containing millions of math formulae in different formats are available online. Retrieving math expressions from these collections is challenging. We propose a framework for retrieval of mathematical notation using symbol pairs extracted from visual and semantic representations of mathematical expressions in the symbolic domain for retrieval of text documents. We further adapt our model for retrieval of mathematical notation in images and lecture videos. Graph-based representations are used in each modality to describe math formulas. For symbolic formula retrieval, where the structure is known, we use symbol layout trees and operator trees. For image-based formula retrieval, since the structure is unknown, we use a more general Line of Sight graph representation. Paths in these graphs define symbol pair tuples that are used as the entries for our inverted index of mathematical notation. Our retrieval framework uses a three-stage approach: a fast selection of candidates in the first stage; a more detailed matching algorithm with similarity metric computation in the second stage; and, when relevance assessments are available, an optional third stage that uses linear regression to estimate relevance from multiple similarity scores for final re-ranking. Our model has been evaluated using large collections of documents, and preliminary results are presented for videos and cross-modal search. The proposed framework can be adapted for other domains, such as chemistry or technical diagrams, where two visually similar elements from a collection are usually related to each other.

RIT Scholar Works
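
The symbol pair index and first-stage candidate selection described in the abstract above can be illustrated with a small sketch. It is not the thesis implementation. It assumes a formula is given as a toy symbol layout tree of the form `(symbol, [(relation, child), ...])`, with `"n"` meaning "next on the baseline" and `"a"` meaning "above, as a superscript". It also assumes candidates are ranked with a Dice-style overlap of pair sets, which is chosen here only for illustration. The names `symbol_pairs`, `build_index`, and `search` are hypothetical.

```python
from collections import defaultdict


def symbol_pairs(tree):
    """Return (ancestor, descendant, relation_path) tuples for a tree."""
    pairs = []

    def walk(node):
        symbol, children = node
        below = []  # (descendant_symbol, path from this node) entries
        for relation, child in children:
            below.append((child[0], relation))
            for desc_symbol, path in walk(child):
                below.append((desc_symbol, relation + path))
        for desc_symbol, path in below:
            pairs.append((symbol, desc_symbol, path))
        return below

    walk(tree)
    return pairs


def build_index(formulas):
    """Map each symbol pair tuple to the ids of formulas that contain it."""
    index = defaultdict(set)
    pair_sets = {}
    for fid, tree in formulas.items():
        pair_sets[fid] = set(symbol_pairs(tree))
        for pair in pair_sets[fid]:
            index[pair].add(fid)
    return index, pair_sets


def search(query_tree, index, pair_sets):
    """Select formulas sharing at least one pair and rank them by overlap."""
    query_pairs = set(symbol_pairs(query_tree))
    matched = defaultdict(int)
    for pair in query_pairs:
        for fid in index.get(pair, ()):
            matched[fid] += 1
    # Dice-style score: 2 * shared pairs / (query pairs + candidate pairs).
    scored = [(2 * m / (len(query_pairs) + len(pair_sets[fid])), fid)
              for fid, m in matched.items()]
    return sorted(scored, key=lambda t: (-t[0], t[1]))


# Toy collection: x^2 + y, x + y, a - b.
formulas = {
    "f1": ("x", [("a", ("2", [])), ("n", ("+", [("n", ("y", []))]))]),
    "f2": ("x", [("n", ("+", [("n", ("y", []))]))]),
    "f3": ("a", [("n", ("-", [("n", ("b", []))]))]),
}
index, pair_sets = build_index(formulas)
results = search(formulas["f2"], index, pair_sets)
```

In this sketch only formulas that share at least one pair with the query are ever scored, which is the role the abstract assigns to the fast first stage. The detailed matching and regression-based re-ranking stages are not modeled.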