Proximity Full-Text Search with a Response Time Guarantee by Means of Additional Indexes

Author: AB Veretennikov
G Zipf
HE Williams
Justin Zobel
Matthew Chang
S Gugnani
Sergey Brin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Full-text search engines are important tools for information retrieval. Term proximity is an important factor in relevance score measurement. In a proximity full-text search, we assume that a relevant document contains query terms near each other, especially if the query terms are frequently occurring words. A methodology for high-performance full-text query execution is discussed. We build additional indexes to achieve better efficiency. For a word that occurs in the text, we include in the indexes some information about nearby words. What types of additional indexes do we use? How do we use them? These questions are discussed in this work. We present the results of experiments showing that the average time of search query execution is 44-45 times less than that required when using ordinary inverted indexes. This is a pre-print of a contribution "Veretennikov A.B. Proximity Full-Text Search with a Response Time Guarantee by Means of Additional Indexes" published in "Arai K., Kapoor S., Bhatia R. (eds) Intelligent Systems and Applications. IntelliSys 2018. Advances in Intelligent Systems and Computing, vol 868" published by Springer, Cham. The final authenticated version is available online at: https://doi.org/10.1007/978-3-030-01054-6_66. The work was supported by Act 211 Government of the Russian Federation, contract no. 02.A03.21.0006. Comment: Alexander B. Veretennikov. Chair of Calculation Mathematics and Computer Science, INSM. Ural Federal Universit
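
The idea of storing information about nearby words in an additional index can be illustrated with a minimal sketch. This is not the paper's implementation: the pair-keyed index layout, the `MAX_DISTANCE` window, and all function names below are assumptions chosen for illustration. The sketch contrasts a baseline that compares positions from two ordinary posting lists with a lookup in an index keyed by (word, nearby word).

```python
from collections import defaultdict

MAX_DISTANCE = 3  # hypothetical window size; not taken from the paper


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Ordinary positional index: word -> list of (doc_id, position)."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for pos, word in enumerate(tokenize(text)):
            index[word].append((doc_id, pos))
    return index


def build_pair_index(docs, max_distance=MAX_DISTANCE):
    """Additional index (assumed layout): (word, nearby_word) -> postings.

    Each posting is (doc_id, position, nearby_position), recorded only
    when the two words occur within max_distance of each other.
    """
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        words = tokenize(text)
        for i, word in enumerate(words):
            lo = max(0, i - max_distance)
            hi = min(len(words), i + max_distance + 1)
            for j in range(lo, hi):
                if j != i:
                    index[(word, words[j])].append((doc_id, i, j))
    return index


def proximity_search_inverted(index, a, b, max_distance=MAX_DISTANCE):
    """Baseline: compare every position of a with every position of b."""
    hits = set()
    for doc_a, pos_a in index.get(a, []):
        for doc_b, pos_b in index.get(b, []):
            if doc_a == doc_b and pos_a != pos_b and abs(pos_a - pos_b) <= max_distance:
                hits.add(doc_a)
    return sorted(hits)


def proximity_search_pairs(pair_index, a, b):
    """With the additional index, the answer is a single posting-list lookup."""
    return sorted({doc_id for doc_id, _, _ in pair_index.get((a, b), [])})


docs = [
    "to be or not to be",
    "be quiet and listen to me",
    "who are you to judge",
]
inverted = build_inverted_index(docs)
pairs = build_pair_index(docs)
print(proximity_search_inverted(inverted, "to", "be"))  # baseline scan
print(proximity_search_pairs(pairs, "to", "be"))        # pair-index lookup
```

The trade-off in this sketch is index size against query time: the pair index stores many more postings, but a query on two frequent words no longer has to scan their long individual posting lists.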

arXiv.org e-Print Archive

Crossref

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Algorithms and Data Structures for In-Memory Text Search Engines

Author: Transier Frederik
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2010

KITopen

A user evaluation of hierarchical phrase browsing

Author: Edgar Katrina D.
Nichols David M.
Paynter Gordon W.
Thomson Kirsten
Witten Ian H.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2003

Phrase browsing interfaces based on hierarchies of phrases extracted automatically from document collections offer a useful compromise between automatic full-text searching and manually-created subject indexes. The literature contains descriptions of such systems that many find compelling and persuasive. However, evaluation studies have either been anecdotal, or focused on objective measures of the quality of automatically-extracted index terms, or restricted to questions of computational efficiency and feasibility. This paper reports on an empirical, controlled user study that compares hierarchical phrase browsing with full-text searching over a range of information seeking tasks. Users found the results located via phrase browsing to be relevant and useful but preferred keyword searching for certain types of queries. Users' experiences were marred by interface details, including inconsistencies between the phrase browser and the surrounding digital library interface

CiteSeerX

Crossref

Research Commons@Waikato

Path Queries on Compressed XML

Author: Abiteboul
Ailamaki
Batory
Bryant
Buneman
Burch
Chan
Deutsch
Fernandez
Florescu
Frick
Goldman
Gottlob
Liefke
McMillan
Milo
Neumüller
Shanmugasundaram
Tolani
Ziv
Publication date: 01/01/2003

Central to any XML query language is a path language such as XPath which operates on the tree structure of the XML document. We demonstrate in this paper that the tree structure can be effectively compressed and manipulated using techniques derived from symbolic model checking. Specifically, we show first that succinct representations of document tree structures based on sharing subtrees are highly effective. Second, we show that compressed structures can be queried directly and efficiently through a process of manipulating selections of nodes and partial decompression
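
Subtree sharing, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: it uses plain hash-consing of `(tag, children)` tuples rather than the model-checking-derived techniques the paper refers to, and the names `compress`, `count_nodes`, and `count_tag` are hypothetical. The last function shows one simple query answered on the shared structure without expanding it back into a tree.

```python
def compress(tree, table=None):
    """Map a (tag, children) tree to a DAG in which equal subtrees share one id.

    Returns (root_id, table), where table maps (tag, child_ids) -> node id.
    """
    if table is None:
        table = {}
    tag, children = tree
    child_ids = tuple(compress(child, table)[0] for child in children)
    key = (tag, child_ids)
    if key not in table:
        table[key] = len(table)  # first time this subtree shape is seen
    return table[key], table


def count_nodes(tree):
    """Number of nodes in the uncompressed tree."""
    _tag, children = tree
    return 1 + sum(count_nodes(child) for child in children)


def count_tag(node_id, nodes, tag, memo=None):
    """Count tree nodes labelled `tag` below node_id, visiting each shared node once."""
    if memo is None:
        memo = {}
    if node_id not in memo:
        node_tag, child_ids = nodes[node_id]
        memo[node_id] = int(node_tag == tag) + sum(
            count_tag(child, nodes, tag, memo) for child in child_ids
        )
    return memo[node_id]


# Three structurally identical <book> elements under one <library> root.
book = ("book", [("title", []), ("author", [])])
doc = ("library", [book, book, book])

root_id, table = compress(doc)
nodes = {node_id: key for key, node_id in table.items()}  # id -> (tag, child_ids)

print(count_nodes(doc))                     # nodes in the original tree
print(len(table))                           # distinct nodes after sharing
print(count_tag(root_id, nodes, "title"))   # query evaluated on the DAG
```

In this example the repeated `book` subtree is stored once, so the query touches each distinct node a single time even though it reports counts for the full tree.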

CiteSeerX

Crossref

Edinburgh Research Explorer

Towards a query language for annotation graphs

Author: Bird Steven
Buneman Peter
Tan Wang-Chiew
Publication date: 01/01/2000

The multidimensional, heterogeneous, and temporal nature of speech databases raises interesting challenges for representation and query. Recently, annotation graphs have been proposed as a general-purpose representational framework for speech databases. Typical queries on annotation graphs require path expressions similar to those used in semistructured query languages. However, the underlying model is rather different from the customary graph models for semistructured data: the graph is acyclic and unrooted, and both temporal and inclusion relationships are important. We develop a query language and describe optimization techniques for an underlying relational representation. Comment: 8 pages, 10 figure

arXiv.org e-Print Archive

CiteSeerX

Edinburgh Research Explorer

ScholarlyCommons@Penn

MT techniques in a retrieval system of semantically enriched patents

Author: Enache Ramona
España Bonet Cristina
González Bermúdez Meritxell
Mateva Maria
Màrquez Villodre Lluís
Popov Borislav
Ranta Aarne
Publication date: 01/01/2013

This paper focuses on how automatic translation techniques integrated in a patent retrieval system increase its capabilities and make possible extended features and functionalities. We describe 1) a novel methodology for natural language to SPARQL translation based on a grammar–ontology interoperability automation and a query grammar for the patents domain; 2) a devised strategy for statistical-based translation of patents that allows semantic annotations to be transferred to the target language; 3) a built-in knowledge representation infrastructure that uses multilingual semantic annotations; and 4) an online application that offers a multilingual search interface over structural knowledge databases (domain ontologies) and multilingual documents (biomedical patents) that have been automatically translated. Peer Reviewed. Postprint (published version

UPCommons. Portal del coneixement obert de la UPC