1,440 research outputs found

Efficient and Effective Query Auto-Completion

Author: Fano R. M.
Krishnan U.
Martinez-Prieto M. A.
Pibiri G. E.
Plaisance J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/06/2020

Query Auto-Completion (QAC) is a ubiquitous feature of modern textual search systems, suggesting possible ways of completing the query being typed by the user. Efficiency is crucial for the system to respond in real time when operating in a million-scale search space. Prior work has extensively advocated the use of a trie data structure for fast prefix-search operations in compact space. However, searching by prefix has little discovery power, in that only completions prefixed by the query are returned. This may negatively impact the effectiveness of the QAC system, with a consequent monetary loss for real applications like Web search engines and eCommerce. In this work we describe the implementation that powers a new QAC system at eBay, and discuss its efficiency and effectiveness in relation to other state-of-the-art approaches. The solution is based on the combination of an inverted index with succinct data structures, a much less explored direction in the literature. This system is replacing the previous implementation based on Apache SOLR, which was not always able to meet the required service-level agreement.

Comment: Published in SIGIR 202
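The contrast the abstract draws between prefix search and an inverted-index approach can be illustrated with a minimal sketch. This is not the eBay implementation: the toy completion list, the function names `prefix_search` and `conjunctive_search`, and the use of plain Python sets in place of succinct data structures are all assumptions made for illustration. The conjunctive variant here treats the last query token as a partial term and every earlier token as a complete term.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical toy set of completions (not data from the paper), kept sorted.
COMPLETIONS = sorted([
    "iphone case",
    "iphone charger",
    "leather iphone case",
    "samsung charger",
])

def prefix_search(query):
    """Return completions that start with `query`, as a trie lookup would."""
    start = bisect_left(COMPLETIONS, query)
    results = []
    for completion in COMPLETIONS[start:]:
        if not completion.startswith(query):
            break  # strings sharing a prefix are contiguous in sorted order
        results.append(completion)
    return results

# Inverted index: term -> ids of the completions that contain the term.
INDEX = defaultdict(set)
for doc_id, completion in enumerate(COMPLETIONS):
    for term in completion.split():
        INDEX[term].add(doc_id)
VOCABULARY = sorted(INDEX)

def conjunctive_search(query):
    """Return completions containing every complete query term and at
    least one term that starts with the last (partial) query token."""
    tokens = query.split()
    if not tokens:
        return []
    *complete, partial = tokens
    # Union of the posting sets of all terms prefixed by the partial token.
    candidates = set()
    start = bisect_left(VOCABULARY, partial)
    for term in VOCABULARY[start:]:
        if not term.startswith(partial):
            break
        candidates |= INDEX[term]
    # Intersect with the posting set of each complete term.
    for term in complete:
        candidates &= INDEX.get(term, set())
    return [COMPLETIONS[i] for i in sorted(candidates)]
```

For the query `iphone c`, `prefix_search` finds only the two completions that begin with that string, while `conjunctive_search` also surfaces `leather iphone case`, which is the kind of extra discovery power the abstract refers to.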

arXiv.org e-Print Archive

Crossref

Incremental construction of minimal acyclic finite-state automata

Author: Daciuk Jan
Mihov Stoyan
Watson Bruce
Watson Richard
Publication date: 01/01/2000

In this paper, we describe a new method for constructing minimal, deterministic, acyclic finite-state automata from a set of strings. Traditional methods consist of two phases: the first constructs a trie, and the second minimizes it. Our approach is to construct a minimal automaton in a single phase by adding new strings one by one and minimizing the resulting automaton on the fly. We present a general algorithm as well as a specialization that relies upon the lexicographical ordering of the input strings.

Comment: 14 pages, 7 figures
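The sorted-input specialization mentioned in the abstract can be sketched as follows. This is a minimal illustration of the commonly described idea (keep the path of the last inserted word unminimized, and merge equivalent suffix states through a register as soon as they can no longer change), not the authors' own code. The class names `Node` and `Dafsa` and all method names are assumptions.

```python
class Node:
    def __init__(self):
        self.final = False
        self.edges = {}  # character -> Node

    def signature(self):
        # Two states are equivalent if they agree on finality and have the
        # same labelled edges to the same (already canonical) children.
        return (self.final,
                tuple(sorted((ch, id(n)) for ch, n in self.edges.items())))


class Dafsa:
    def __init__(self):
        self.root = Node()
        self.register = {}    # signature -> canonical state
        self.unchecked = []   # (parent, char, child) along the last word
        self.previous = None

    def insert(self, word):
        if self.previous is not None and word <= self.previous:
            raise ValueError("words must arrive in strictly increasing order")
        prev = self.previous or ""
        common = 0
        while (common < min(len(word), len(prev))
               and word[common] == prev[common]):
            common += 1
        # States below the common prefix can no longer change: minimize them.
        self._minimize(common)
        node = self.unchecked[-1][2] if self.unchecked else self.root
        for ch in word[common:]:
            child = Node()
            node.edges[ch] = child
            self.unchecked.append((node, ch, child))
            node = child
        node.final = True
        self.previous = word

    def finish(self):
        self._minimize(0)

    def _minimize(self, down_to):
        # Work bottom-up so children are canonical before their parent.
        while len(self.unchecked) > down_to:
            parent, ch, child = self.unchecked.pop()
            canonical = self.register.setdefault(child.signature(), child)
            parent.edges[ch] = canonical

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.edges.get(ch)
            if node is None:
                return False
        return node.final

    def node_count(self):
        seen, stack = set(), [self.root]
        while stack:
            node = stack.pop()
            if id(node) not in seen:
                seen.add(id(node))
                stack.extend(node.edges.values())
        return len(seen)


def build(words):
    dafsa = Dafsa()
    for word in sorted(set(words)):
        dafsa.insert(word)
    dafsa.finish()
    return dafsa
```

Building from `tap`, `taps`, `top`, `tops` yields an automaton with 5 states, whereas a trie over the same words needs 8 nodes, because the states reached after `ta` and `to` accept the same suffixes and are merged.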

arXiv.org e-Print Archive

CiteSeerX

Proceedings - University of Groningen

Crossref

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

The Complexity of Order Type Isomorphism

Author: Aloupis Greg
Iacono John
Langerman Stefan
Wuhrer Stefanie S.
Özkan Özgür
Publication date: 01/01/2013

The order type of a point set in $R^d$ maps each $(d{+}1)$-tuple of points to its orientation (e.g., clockwise or counterclockwise in $R^2$). Two point sets $X$ and $Y$ have the same order type if there exists a mapping $f$ from $X$ to $Y$ for which every $(d{+}1)$-tuple $(a_1,a_2,\ldots,a_{d+1})$ of $X$ and the corresponding tuple $(f(a_1),f(a_2),\ldots,f(a_{d+1}))$ in $Y$ have the same orientation. In this paper we investigate the complexity of determining whether two point sets have the same order type. We provide an $O(n^d)$ algorithm for this task, thereby improving upon the $O(n^{\lfloor{3d/2}\rfloor})$ algorithm of Goodman and Pollack (1983). The algorithm uses only order type queries and also works for abstract order types (or acyclic oriented matroids). Our algorithm is optimal, both in the abstract setting and for realizable point sets if the algorithm only uses order type queries.

Comment: Preliminary version of paper to appear at ACM-SIAM Symposium on Discrete Algorithms (SODA14)

arXiv.org e-Print Archive

Crossref

DI-fusion

Improved Approximate String Matching and Regular Expression Matching on Ziv-Lempel Compressed Texts

Author: A. Amir
E.W. Myers
G. Navarro
G.M. Landau
J. Kärkkäinen
J. Ziv
K. Thompson
M. Dietzfelbinger
M. Farach
P. Sellers
R. Cole
T.A. Welch
V. Mäkinen
Publication date: 01/01/2007

We study the approximate string matching and regular expression matching problems for the case when the text to be searched is compressed with the Ziv-Lempel adaptive dictionary compression schemes. We present a time-space trade-off that leads to algorithms improving the previously known complexities for both problems. In particular, we significantly improve the space bounds, which in practical applications are likely to be a bottleneck.
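To make the underlying problem concrete, here is a minimal sketch of approximate string matching on plain, uncompressed text using the classic column-by-column dynamic program usually attributed to Sellers. It is a baseline for the problem statement only: it does not operate on Ziv-Lempel compressed text and does not reflect the time-space trade-off of the paper. The function name and the choice of 1-based end positions are assumptions.

```python
def approximate_matches(pattern, text, k):
    """Return the 1-based end positions j such that some substring of
    `text` ending at j is within edit distance k of `pattern`."""
    m = len(pattern)
    # column[i] = fewest edits turning pattern[:i] into some suffix of the
    # text read so far; column[0] stays 0 because a match may start anywhere.
    column = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        diagonal = column[0]
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            best = min(diagonal + cost,    # match or substitution
                       column[i] + 1,      # extra text character
                       column[i - 1] + 1)  # missing pattern character
            diagonal = column[i]
            column[i] = best
        if column[m] <= k:
            ends.append(j)
    return ends
```

The sketch uses O(m) working space and O(mn) time for a pattern of length m and a text of length n, but it needs the whole text in decompressed form, which is the cost that algorithms working directly on compressed text aim to avoid.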

arXiv.org e-Print Archive

CiteSeerX

Crossref

University of Southern Denmark Research Output

Online Research Database In Technology