2,112 research outputs found
Math Search for the Masses: Multimodal Search Interfaces and Appearance-Based Retrieval
We summarize math search engines and search interfaces produced by the
Document and Pattern Recognition Lab in recent years, and in particular the min
math search interface and the Tangent search engine. Source code for both
systems is publicly available. "The Masses" refers to our emphasis on creating
systems for mathematical non-experts, who may be looking to define unfamiliar
notation, or browse documents based on the visual appearance of formulae rather
than their mathematical semantics.
Comment: Paper for Invited Talk at 2015 Conference on Intelligent Computer
Mathematics (July, Washington DC)
FAQchat as an Information Retrieval system
A chatbot is a conversational agent that interacts with users in natural language. In this paper, we describe a new way to access information using a chatbot. The FAQ in the School of Computing at the University of Leeds has been used to retrain the ALICE chatbot system, producing FAQchat. The results returned from FAQchat are similar to those generated by search engines such as Google. For evaluation, a comparison was made between FAQchat and Google. The main objective is to demonstrate that FAQchat is a viable alternative to Google and that it can be used as a tool to access FAQ databases.
A class of structured P2P systems supporting browsing
Browsing is a way of finding documents in a large amount of data which is
complementary to querying and which is particularly suitable for multimedia
documents. Locating particular documents in a very large collection of
multimedia documents such as the ones available in peer to peer networks is a
difficult task. However, current peer to peer systems do not let users do this
by browsing. In this report, we show how one can build a peer to peer system
supporting a kind of browsing. In our proposal, one must extend an existing
distributed hash table system with a few features: handling partial hash-keys
and providing appropriate routing mechanisms for these hash-keys. We give such
an algorithm for the particular case of the Tapestry distributed hash table.
This is a work in progress as no proper validation has been done yet.
Comment: 14 pages
The NASA Astrophysics Data System: The Search Engine and its User Interface
The ADS Abstract and Article Services provide access to the astronomical
literature through the World Wide Web (WWW). The forms based user interface
provides access to sophisticated searching capabilities that allow our users to
find references in the fields of Astronomy, Physics/Geophysics, and
astronomical Instrumentation and Engineering. The returned information includes
links to other on-line information sources, creating an extensive astronomical
digital library. Other interfaces to the ADS databases provide direct access to
the ADS data to allow developers of other data systems to integrate our data
into their system.
The search engine is a custom-built software system that is specifically
tailored to search astronomical references. It includes an extensive synonym
list that contains discipline specific knowledge about search term
equivalences.
Search request logs show the usage pattern of the various search system
capabilities. Access logs show the world-wide distribution of ADS users.
The ADS can be accessed at http://adswww.harvard.edu
Comment: 23 pages, 18 figures, 11 tables
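The synonym list mentioned in the ADS abstract above can be illustrated with a small query-expansion sketch. This is not the ADS implementation: the synonym groups, the function names `build_lookup` and `expand_query`, and the sample terms are all hypothetical, chosen only to show how a search term might be widened to its listed equivalents before matching.

```python
from typing import Dict, List, Set

# Hypothetical synonym groups; the real ADS synonym list is not reproduced here.
SYNONYM_GROUPS: List[Set[str]] = [
    {"galaxy", "galaxies", "extragalactic nebula"},
    {"spectrum", "spectra", "spectroscopy"},
]


def build_lookup(groups: List[Set[str]]) -> Dict[str, Set[str]]:
    """Map every term to the full group of terms treated as equivalent."""
    lookup: Dict[str, Set[str]] = {}
    for group in groups:
        for term in group:
            lookup[term] = group
    return lookup


def expand_query(terms: List[str], lookup: Dict[str, Set[str]]) -> List[List[str]]:
    """Replace each query term with the sorted list of its equivalents.

    Terms without an entry in the lookup are kept unchanged (lowercased).
    """
    return [sorted(lookup.get(t.lower(), {t.lower()})) for t in terms]


LOOKUP = build_lookup(SYNONYM_GROUPS)
expanded = expand_query(["Galaxy", "redshift"], LOOKUP)
print(expanded)
```

Each inner list would then be treated as an OR-group when matching references, so a search for "Galaxy" also retrieves records indexed under the other terms in its group.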
Upper and lower bounds for dynamic data structures on strings
We consider a range of simply stated dynamic data structure problems on
strings. An update changes one symbol in the input and a query asks us to
compute some function of the pattern of length and a substring of a longer
text. We give both conditional and unconditional lower bounds for variants of
exact matching with wildcards, inner product, and Hamming distance computation
via a sequence of reductions. As an example, we show that there does not exist
an time algorithm for a large range of these problems
unless the online Boolean matrix-vector multiplication conjecture is false. We
also provide nearly matching upper bounds for most of the problems we consider.
Comment: Accepted at STACS'1
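To make the problem setting in the abstract above concrete, here is a naive baseline for one of the named variants, dynamic Hamming distance. It is not an algorithm from the paper: it only shows the update/query interface, with constant-time single-symbol updates and a query that scans the whole pattern. The class name `DynamicHamming`, its method names, and the variable `m` for the pattern length are assumptions made for this sketch.

```python
class DynamicHamming:
    """Naive baseline: O(1) symbol updates, O(m) queries for pattern length m."""

    def __init__(self, pattern: str, text: str) -> None:
        self.pattern = list(pattern)
        self.text = list(text)

    def update_pattern(self, i: int, symbol: str) -> None:
        """Change one symbol of the pattern."""
        self.pattern[i] = symbol

    def update_text(self, i: int, symbol: str) -> None:
        """Change one symbol of the text."""
        self.text[i] = symbol

    def query(self, offset: int) -> int:
        """Hamming distance between the pattern and text[offset:offset + m]."""
        m = len(self.pattern)
        if offset < 0 or offset + m > len(self.text):
            raise IndexError("substring does not fit inside the text")
        window = self.text[offset:offset + m]
        return sum(p != t for p, t in zip(self.pattern, window))


ds = DynamicHamming(pattern="abc", text="xabcabd")
before = ds.query(4)    # compares "abc" with "abd"
ds.update_text(6, "c")  # text becomes "xabcabc"
after = ds.query(4)     # compares "abc" with "abc"
print(before, after)
```

The lower and upper bounds described in the abstract concern how far the trade-off between update and query time can be pushed beyond simple baselines of this kind.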
Symbolic and Visual Retrieval of Mathematical Notation using Formula Graph Symbol Pair Matching and Structural Alignment
Large data collections containing millions of math formulae in different formats are available on-line. Retrieving math expressions from these collections is challenging. We propose a framework for retrieval of mathematical notation using symbol pairs extracted from visual and semantic representations of mathematical expressions on the symbolic domain for retrieval of text documents. We further adapt our model for retrieval of mathematical notation on images and lecture videos. Graph-based representations are used on each modality to describe math formulae. For symbolic formula retrieval, where the structure is known, we use symbol layout trees and operator trees. For image-based formula retrieval, since the structure is unknown, we use a more general Line of Sight graph representation. Paths of these graphs define symbol pair tuples that are used as the entries for our inverted index of mathematical notation. Our retrieval framework uses a three-stage approach with a fast selection of candidates as the first layer, a more detailed matching algorithm with similarity metric computation in the second stage, and finally, when relevance assessments are available, an optional third layer with linear regression for estimation of relevance using multiple similarity scores for final re-ranking. Our model has been evaluated using large collections of documents, and preliminary results are presented for videos and cross-modal search. The proposed framework can be adapted for other domains like chemistry or technical diagrams where two visually similar elements from a collection are usually related to each other.
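The idea of indexing formulae by symbol pair tuples taken from graph paths, followed by a fast candidate-selection stage, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' system: the `Node` type, the single-letter edge labels ("n" for next, "a" for above), the tuple layout `(ancestor, descendant, path)`, and ranking by the count of shared tuples are all choices made for this example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Set, Tuple


@dataclass
class Node:
    """A symbol in a layout tree; children carry a spatial edge label."""
    symbol: str
    children: List[Tuple[str, "Node"]] = field(default_factory=list)


Pair = Tuple[str, str, str]  # (ancestor symbol, descendant symbol, edge-label path)


def all_nodes(node: Node) -> Iterator[Node]:
    yield node
    for _, child in node.children:
        yield from all_nodes(child)


def descendants(node: Node, path: str = "") -> Iterator[Tuple[Node, str]]:
    """Yield every descendant together with the label path leading to it."""
    for label, child in node.children:
        new_path = path + label
        yield child, new_path
        yield from descendants(child, new_path)


def symbol_pairs(root: Node) -> Set[Pair]:
    """Collect one tuple per ancestor/descendant pair in the tree."""
    return {
        (anc.symbol, desc.symbol, path)
        for anc in all_nodes(root)
        for desc, path in descendants(anc)
    }


def build_index(formulas: Dict[str, Node]) -> Dict[Pair, Set[str]]:
    """Inverted index: symbol pair tuple -> ids of formulae containing it."""
    index: Dict[Pair, Set[str]] = defaultdict(set)
    for fid, root in formulas.items():
        for pair in symbol_pairs(root):
            index[pair].add(fid)
    return index


def candidates(query: Node, index: Dict[Pair, Set[str]]) -> List[Tuple[str, int]]:
    """First-stage selection: rank formulae by number of shared tuples."""
    hits: Counter = Counter()
    for pair in symbol_pairs(query):
        for fid in index.get(pair, ()):
            hits[fid] += 1
    return hits.most_common()


def plus(rhs: str) -> Tuple[str, Node]:
    return ("n", Node("+", [("n", Node(rhs))]))


formulas = {
    "f1": Node("x", [("a", Node("2")), plus("y")]),  # x^2 + y
    "f2": Node("x", [plus("y")]),                    # x + y
    "f3": Node("a", [("a", Node("2"))]),             # a^2
}
index = build_index(formulas)
query = Node("x", [("a", Node("2")), plus("z")])     # x^2 + z
ranked = candidates(query, index)
print(ranked)
```

In the three-stage approach described above, a list like `ranked` would only be the input to the more detailed matching and optional re-ranking stages, which this sketch does not attempt.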