Search CORE

1,708 research outputs found

Which one is better: presentation-based or content-based math search?

Authors: A.S. Youssef, B.R. Miller, J. Mišutka, M. Adeel, M. Kohlhase, M.E. Altamimi, M.Q. Nghiem, R. Miner, R. Zanibbi, S. Kamali
Publication date: 01/01/2014

Mathematical content is a valuable information source, and retrieving it has become an important problem. This paper compares two strategies for searching math expressions: presentation-based and content-based. Presentation-based search uses a state-of-the-art math search system, while content-based search uses semantic enrichment to convert math expressions into their content forms and then searches over these content-based expressions. By considering the meaning of math expressions, the content-based approach improves search quality over presentation-based systems.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Efficient Document Re-Ranking for Transformers by Precomputing Term Representations

Authors: Frieder Ophir, Goharian Nazli, MacAvaney Sean, Nardini Franco Maria, Perego Raffaele, Tonellotto Nicola
Publication venue: Association for Computing Machinery (ACM)
Publication date: 26/05/2020

Deep pretrained transformer networks are effective at various ranking tasks, such as question answering and ad-hoc document ranking. However, their computational expense makes them cost-prohibitive in practice. Our proposed approach, called PreTTR (Precomputing Transformer Term Representations), considerably reduces the query-time latency of deep transformer networks (up to a 42x speedup on web document ranking), making these networks more practical to use in a real-time ranking scenario. Specifically, we precompute part of the document term representations at indexing time (without a query), and merge them with the query representation at query time to compute the final ranking score. Due to the large size of the token representations, we also propose an effective approach to reduce the storage requirement by training a compression layer to match attention scores. Our compression technique reduces the storage required by up to 95%, and it can be applied without a substantial degradation in ranking performance. Comment: Accepted at SIGIR 2020 (long
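
The precompute-then-merge idea in this abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the weight-free `attention_layer`, the names `encode_alone`, `joint_score`, and `rank`, the split point `SPLIT_LAYER`, and the tiny two-dimensional vectors are hypothetical and are not taken from the paper or its code. The sketch only shows the control flow: document representations are computed once without a query, stored, and combined with the query representation at query time.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_layer(vectors):
    """Toy self-attention layer with a residual connection (no learned weights)."""
    dim = len(vectors[0])
    out = []
    for q in vectors:
        weights = softmax([sum(a * b for a, b in zip(q, k)) for k in vectors])
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        out.append([x + m for x, m in zip(q, mixed)])
    return out

def encode_alone(vectors, n_layers):
    """Run the first n_layers on a single sequence (query or document) in isolation."""
    for _ in range(n_layers):
        vectors = attention_layer(vectors)
    return vectors

def joint_score(query_vecs, doc_vecs, n_layers):
    """Merge both sequences, run the remaining layers, read a toy score from position 0."""
    merged = query_vecs + doc_vecs
    for _ in range(n_layers):
        merged = attention_layer(merged)
    return sum(merged[0])

# Hypothetical configuration: 4 layers in total, the first 2 are query-independent.
SPLIT_LAYER, TOTAL_LAYERS = 2, 4

# Hypothetical token embeddings for two documents and one query.
docs = {"d1": [[0.2, 0.1], [0.4, 0.3]], "d2": [[0.9, 0.8], [0.1, 0.7]]}
query = [[0.5, 0.5], [0.3, 0.6]]

# Indexing time: no query is needed to build these representations.
doc_index = {doc_id: encode_alone(vecs, SPLIT_LAYER) for doc_id, vecs in docs.items()}

def rank(query_vecs):
    """Query time: encode the query, then run only the remaining joint layers per document."""
    q = encode_alone(query_vecs, SPLIT_LAYER)
    remaining = TOTAL_LAYERS - SPLIT_LAYER
    scores = {d: joint_score(q, reps, remaining) for d, reps in doc_index.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(rank(query))
```

In this sketch the query-time cost per document covers only the layers after `SPLIT_LAYER`; the compression step mentioned in the abstract is not modelled.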

arXiv.org e-Print Archive

Crossref

Content And Multimedia Database Management Systems

Author: Vries Arjen Paul de
Publication venue: University of Twente, Centre for Telematics and Information Technology (CTIT)
Publication date: 01/01/1999

A database management system is a general-purpose software system that facilitates the processes of defining, constructing, and manipulating databases for various applications. The main characteristic of the ‘database approach’ is that it increases the value of data by its emphasis on data independence. DBMSs, and in particular those based on the relational data model, have been very successful at the management of administrative data in the business domain. This thesis has investigated data management in multimedia digital libraries, and its implications for the design of database management systems. The main problem of multimedia data management is providing access to the stored objects. The content structure of administrative data is easily represented in alphanumeric values. Thus, database technology has primarily focused on handling the objects’ logical structure. In the case of multimedia data, however, representation of content is far from trivial and is not supported by current database management systems.

CiteSeerX

VU Research Portal

CWI's Institutional Repository

University of Twente Research Information

Search Agent Model: a Conceptual Framework for Search by Algorithms and Agent Systems

Authors: Dalton Jeff, Foley John
Publication date: 19/08/2018

No abstract available

Enlighten

Structured Knowledge Representation for Image Retrieval

Authors: Di Sciascio E., Donini F. M., Mongiello M.
Publication venue: AI Access Foundation
Publication date: 30/06/2011

We propose a structured approach to the problem of retrieval of images by content and present a description logic that has been devised for the semantic indexing and retrieval of images containing complex objects. As other approaches do, we start from low-level features extracted with image analysis to detect and characterize regions in an image. However, in contrast with feature-based approaches, we provide a syntax to describe segmented regions as basic objects and complex objects as compositions of basic ones. Then we introduce a companion extensional semantics for defining reasoning services, such as retrieval, classification, and subsumption. These services can be used for both exact and approximate matching, using similarity measures. Using our logical approach as a formal specification, we implemented a complete client-server image retrieval system, which allows a user to pose both queries by sketch and queries by example. A set of experiments has been carried out on a testbed of images to assess the retrieval capabilities of the system in comparison with expert users' rankings. Results are presented adopting a well-established measure of quality borrowed from textual information retrieval.

arXiv.org e-Print Archive

Crossref

Chinese-English Cross-Lingual Information Retrieval in Biomedicine Using Ontology-Based Query Expansion

Author: Wang Xinkai
Publication date: 01/08/2012

The University of Manchester - Institutional Repository

Using Search Term Positions for Determining Document Relevance

Author: Galeas Patricio
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2010

The technological advancements in computer networks and the substantial reduction of their production costs have caused a massive explosion of digitally stored information. In particular, textual information is becoming increasingly available in electronic form. Finding text documents dealing with a certain topic is not a simple task. Users need tools to sift through non-relevant information and retrieve only pieces of information relevant to their needs. The traditional methods of information retrieval (IR) based on search term frequency have reached their limits, and novel ranking methods based on hyperlink information are not applicable to unlinked documents. The retrieval of documents based on the positions of search terms in a document has the potential of yielding improvements, because other terms in the environment where a search term appears (i.e. the neighborhood) are considered. That is to say, the grammatical type, position and frequency of other words help to clarify and specify the meaning of a given search term. However, the required additional analysis makes position-based methods slower than methods based on term frequency, and they require more storage to save the positions of terms. These drawbacks directly affect the performance of the most user-critical phase of the retrieval process, namely query evaluation time, which explains the scarce use of positional information in contemporary retrieval systems. This thesis explores the possibility of extending traditional information retrieval systems with positional information in an efficient manner that permits us to optimize the retrieval performance by handling term positions at query evaluation time. To achieve this task, several abstract representations of term positions to efficiently store and operate on term positional data are investigated.
In the Gauss model, descriptive statistics methods are used to estimate term positional information, because they minimize outliers and irregularities in the data. The Fourier model is based on Fourier series to represent positional information. In the Hilbert model, functional analysis methods are used to provide reliable term position estimations and simple mathematical operators to handle positional data. The proposed models are experimentally evaluated using standard resources of the IR research community (Text Retrieval Conference). All experiments demonstrate that the use of positional information can enhance the quality of search results. The suggested models outperform state-of-the-art retrieval utilities. The term position models open new possibilities to analyze and handle textual data. For instance, document clustering and compression of positional data based on these models could be interesting topics to be considered in future research.
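
As a rough illustration of what a compact representation of term positions might look like, the sketch below summarises a term's positions in two ways: by descriptive statistics (mean and spread) and by a few Fourier coefficients. This is an assumption-laden sketch, not the thesis's formulation: the function names `gauss_summary` and `fourier_coefficients`, the normalisation by document length, and the choice of population standard deviation are all hypothetical.

```python
import math

def gauss_summary(positions):
    """Summarise term positions by their mean and population standard deviation."""
    n = len(positions)
    mean = sum(positions) / n
    variance = sum((p - mean) ** 2 for p in positions) / n
    return mean, math.sqrt(variance)

def fourier_coefficients(positions, doc_length, n_terms):
    """Truncated Fourier coefficients (a_k, b_k) of the normalised position distribution."""
    n = len(positions)
    coeffs = []
    for k in range(1, n_terms + 1):
        # Map each position to an angle on the unit circle for harmonic k.
        angles = [2 * math.pi * k * p / doc_length for p in positions]
        a_k = sum(math.cos(t) for t in angles) / n
        b_k = sum(math.sin(t) for t in angles) / n
        coeffs.append((a_k, b_k))
    return coeffs

# Hypothetical example: a term occurring at word offsets 2, 4 and 6 in a 10-word document.
positions = [2, 4, 6]
mean, spread = gauss_summary(positions)
coeffs = fourier_coefficients(positions, doc_length=10, n_terms=3)
print(mean, spread, coeffs)
```

Both summaries have a fixed size regardless of how often the term occurs, which is one way to keep positional data cheap to store and compare at query evaluation time.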

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg