Rank, select and access in grammar-compressed strings

Authors: Belazzougui Djamal; Puglisi Simon J.; Tabei Yasuo
Publication date: 14/08/2014

Given a string $S$ of length $N$ on a fixed alphabet of $\sigma$ symbols, a grammar compressor produces a context-free grammar $G$ of size $n$ that generates $S$ and only $S$. In this paper we describe data structures to support the following operations on a grammar-compressed string: $\mbox{rank}_c(S,i)$ (return the number of occurrences of symbol $c$ before position $i$ in $S$); $\mbox{select}_c(S,i)$ (return the position of the $i$th occurrence of $c$ in $S$); and $\mbox{access}(S,i,j)$ (return substring $S[i,j]$). For rank and select we describe data structures of size $O(n\sigma\log N)$ bits that support the two operations in $O(\log N)$ time. We propose another structure that uses $O(n\sigma\log (N/n)(\log N)^{1+\epsilon})$ bits and that supports the two queries in $O(\log N/\log\log N)$ time, where $\epsilon>0$ is an arbitrary constant. To our knowledge, we are the first to study the asymptotic complexity of rank and select in the grammar-compressed setting, and we provide a hardness result showing that significantly improving the bounds we achieve would imply a major breakthrough on a hard graph-theoretical problem. Our main result for access is a method that requires $O(n\log N)$ bits of space and $O(\log N+m/\log_\sigma N)$ time to extract $m=j-i+1$ consecutive symbols from $S$. Alternatively, we can achieve $O(\log N/\log\log N+m/\log_\sigma N)$ query time using $O(n\log (N/n)(\log N)^{1+\epsilon})$ bits of space. This matches a lower bound stated by Verbin and Yu for strings where $N$ is polynomially related to $n$.

Comment: 16 pages
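To make the three operations concrete, here is a minimal Python sketch of rank, select and access on a toy grammar. It is not the data structure from the paper: the `SLP` class, its rule encoding, and the 0-based indexing are assumptions made for illustration, and each query simply walks down the parse tree, so its cost depends on the grammar's height rather than on the bounds quoted in the abstract.

```python
from collections import Counter


class SLP:
    """Toy straight-line program (hypothetical helper, not from the paper).

    Each rule is either a single terminal character or a pair of rule ids.
    """

    def __init__(self, rules, start):
        self.rules = rules      # id -> str (terminal) or (left_id, right_id)
        self.start = start
        self.length = {}        # expansion length of each rule
        self.counts = {}        # per-symbol occurrence counts of each rule
        self._measure(start)

    def _measure(self, x):
        if x in self.length:
            return
        rhs = self.rules[x]
        if isinstance(rhs, str):
            self.length[x] = 1
            self.counts[x] = Counter(rhs)
        else:
            left, right = rhs
            self._measure(left)
            self._measure(right)
            self.length[x] = self.length[left] + self.length[right]
            self.counts[x] = self.counts[left] + self.counts[right]

    def rank(self, c, i):
        """Number of occurrences of c in S[0:i] (positions before i, 0-based)."""
        x, total = self.start, 0
        while not isinstance(self.rules[x], str):
            left, right = self.rules[x]
            if i <= self.length[left]:
                x = left
            else:
                total += self.counts[left][c]
                i -= self.length[left]
                x = right
        return total + (1 if i > 0 and self.rules[x] == c else 0)

    def select(self, c, k):
        """0-based position of the k-th occurrence of c (k counted from 1)."""
        if not 1 <= k <= self.counts[self.start][c]:
            raise ValueError("fewer than k occurrences of c")
        x, offset = self.start, 0
        while not isinstance(self.rules[x], str):
            left, right = self.rules[x]
            if k <= self.counts[left][c]:
                x = left
            else:
                k -= self.counts[left][c]
                offset += self.length[left]
                x = right
        return offset

    def access(self, i, j):
        """Substring S[i..j], inclusive and 0-based; assumes 0 <= i <= j < N."""
        out = []
        self._extract(self.start, i, j + 1, out)
        return "".join(out)

    def _extract(self, x, lo, hi, out):
        # Append expansion(x)[lo:hi] to out, skipping subtrees outside the range.
        if lo >= hi:
            return
        rhs = self.rules[x]
        if isinstance(rhs, str):
            out.append(rhs)
            return
        left, right = rhs
        n_left = self.length[left]
        self._extract(left, lo, min(hi, n_left), out)
        self._extract(right, max(lo - n_left, 0), hi - n_left, out)


# Grammar generating S = "abababab" with five rules.
rules = {"A": "a", "B": "b", "C": ("A", "B"), "D": ("C", "C"), "E": ("D", "D")}
slp = SLP(rules, "E")
print(slp.rank("a", 3), slp.select("b", 2), slp.access(2, 5))
```

The sketch stores one count per rule and symbol, which is the simplest way to answer rank and select without decompressing the string; a practical structure would additionally need to bound the descent depth.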

arXiv.org e-Print Archive

CiteSeerX

Implementation of an efficient Fuzzy Logic based Information Retrieval System

Authors: Agarwal Shubham; Dhawan Sumit; Singh Prabhjot; Thakur Narina
Publication date: 13/03/2015

This paper describes the implementation of an efficient Information Retrieval (IR) system that uses Fuzzy Logic to compute the similarity between a dataset and a query. The TREC dataset is used for this purpose. The dataset is parsed to generate a keyword index, which is used for the similarity comparison with the user query. Each query is assigned a score value based on its fuzzy similarity with the index keywords. The relevant documents are retrieved based on the score value. The performance and accuracy of the proposed fuzzy similarity model are compared with those of the Cosine similarity model using Precision-Recall curves. The results prove the dominance of the Fuzzy Similarity based IR system.

Comment: arXiv admin note: substantial text overlap with http://ntz-develop.blogspot.in/ , http://www.micsymposium.org/mics2012/submissions/mics2012_submission_8.pdf , http://www.slideshare.net/JeffreyStricklandPhD/predictive-modeling-and-analytics-selectchapters-41304405 by other authors
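The comparison described above can be illustrated with a small Python sketch that scores documents against a query under two measures. The abstract does not say which fuzzy similarity the authors use, so the fuzzy Jaccard measure below (sum of minima over sum of maxima of term weights) is an assumption chosen for illustration, as are the function names and the toy term weights.

```python
import math


def cosine_similarity(q, d):
    """Cosine of the angle between two sparse term-weight vectors (dicts)."""
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    return dot / (norm_q * norm_d) if norm_q and norm_d else 0.0


def fuzzy_jaccard(q, d):
    """Assumed fuzzy measure: weights in [0, 1] are treated as memberships."""
    terms = set(q) | set(d)
    intersection = sum(min(q.get(t, 0.0), d.get(t, 0.0)) for t in terms)
    union = sum(max(q.get(t, 0.0), d.get(t, 0.0)) for t in terms)
    return intersection / union if union else 0.0


def rank_documents(query, docs, score):
    """Return document names ordered by decreasing similarity to the query."""
    return sorted(docs, key=lambda name: score(query, docs[name]), reverse=True)


# Hypothetical keyword weights, not taken from TREC.
query = {"fuzzy": 1.0, "retrieval": 0.8}
docs = {
    "d1": {"fuzzy": 0.9, "retrieval": 0.7},
    "d2": {"cosine": 1.0, "retrieval": 0.2},
}
print(rank_documents(query, docs, fuzzy_jaccard))
print(rank_documents(query, docs, cosine_similarity))
```

Passing the scoring function as a parameter keeps the ranking step identical for both models, which is what a Precision-Recall comparison between them would require.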

arXiv.org e-Print Archive

Directory of Open Access Journals