Deriving query suggestions for site search
Modern search engines have been moving away from simplistic interfaces that aimed at satisfying a user's need with a single-shot query. Interactive features are now integral parts of web search engines. However, generating good query modification suggestions remains a challenging issue. Query log analysis is one of the major strands of work in this direction. Although much research has been performed on query logs collected on the web as a whole, query log analysis to enhance search on smaller and more focused collections has attracted less attention, despite its increasing practical importance. In this article, we report on a systematic study of different query modification methods applied to a substantial query log collected on a local website that already uses an interactive search engine. We conducted experiments in which we asked users to assess the relevance of potential query modification suggestions that have been constructed using a range of log analysis methods and different baseline approaches. The experimental results demonstrate the usefulness of log analysis to extract query modification suggestions. Furthermore, our experiments demonstrate that a more fine-grained approach than grouping search requests into sessions allows for extraction of better refinement terms from query log files. © 2013 ASIS&T
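As a rough illustration of the kind of log analysis the abstract refers to, the sketch below counts which query directly follows another within a session and proposes the most frequent follow-ups as suggestions. It is not the authors' method: the session format, the function names `build_suggestions` and `suggest`, and the toy log are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_suggestions(sessions):
    """Count how often one query directly follows another within a session."""
    followups = defaultdict(Counter)
    for queries in sessions:
        # Pair each query with the one issued immediately after it.
        for prev, nxt in zip(queries, queries[1:]):
            if prev != nxt:  # ignore verbatim resubmissions
                followups[prev][nxt] += 1
    return followups


def suggest(followups, query, k=3):
    """Return up to k of the most frequent follow-up queries for `query`."""
    return [q for q, _ in followups.get(query, Counter()).most_common(k)]


# Hypothetical toy log: each inner list is one user's queries, in order.
sessions = [
    ["library", "library opening hours"],
    ["library", "library opening hours", "library catalogue"],
    ["library", "library catalogue"],
    ["timetable", "exam timetable"],
]
followups = build_suggestions(sessions)
print(suggest(followups, "library"))
```

Pairing only adjacent queries, rather than all queries in a session, is one simple way to work at a finer granularity than whole sessions.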
A Local Algorithm for Constructing Spanners in Minor-Free Graphs
Constructing a spanning tree of a graph is one of the most basic tasks in
graph theory. We consider this problem in the setting of local algorithms: one
wants to quickly determine whether a given edge $e$ is in a specific spanning
tree, without computing the whole spanning tree, but rather by inspecting the
local neighborhood of $e$. The challenge is to maintain consistency. That is,
to answer queries about different edges according to the same spanning tree.
Since it is known that this problem cannot be solved without essentially
viewing all the graph, we consider the relaxed version of finding a spanning
subgraph with $(1+\epsilon)n$ edges (where $n$ is the number of vertices and
$\epsilon$ is a given sparsity parameter). It is known that this relaxed
problem requires inspecting $\Omega(\sqrt{n})$ edges in general graphs, which
motivates the study of natural restricted families of graphs. One such family
is the family of graphs with an excluded minor. For this family there is an
algorithm that achieves constant success probability, and inspects
$(d/\epsilon)^{\mathrm{poly}(h)\log(1/\epsilon)}$ edges (for each edge it is queried
on), where $d$ is the maximum degree in the graph and $h$ is the size of the
excluded minor. The distances between pairs of vertices in the spanning
subgraph $G'$ are at most a factor of $\mathrm{poly}(d, 1/\epsilon, h)$ larger than in
$G$.
In this work, we show that for an input graph that is $H$-minor free for any
$H$ of size $h$, this task can be performed by inspecting only $\mathrm{poly}(d, 1/\epsilon, h)$ edges. The distances between pairs of vertices in the spanning
subgraph $G'$ are at most a factor of $\tilde{O}(h\log(d)/\epsilon)$ larger
than in $G$. Furthermore, the error probability of the new algorithm is
significantly improved to $\Theta(1/n)$. This algorithm can also be easily
adapted to yield an efficient algorithm for the distributed setting.
Towards a Universal Wordnet by Learning from Combined Evidence
Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification.
TopicViz: Semantic Navigation of Document Collections
When people explore and manage information, they think in terms of topics and
themes. However, the software that supports information exploration sees text
at only the surface level. In this paper we show how topic modeling -- a
technique for identifying latent themes across large collections of documents
-- can support semantic exploration. We present TopicViz, an interactive
environment for information exploration. TopicViz combines traditional search
and citation-graph functionality with a range of novel interactive
visualizations, centered around a force-directed layout that links documents to
the latent themes discovered by the topic model. We describe several use
scenarios in which TopicViz supports rapid sensemaking on large document
collections.
SODA: Generating SQL for Business Users
The purpose of data warehouses is to enable business analysts to make better
decisions. Over the years the technology has matured and data warehouses have
become extremely successful. As a consequence, more and more data has been
added to the data warehouses and their schemas have become increasingly
complex. These systems still work well for generating pre-canned
reports. However, with their current complexity, they tend to be a poor match
for non-tech-savvy business analysts who need answers to ad-hoc queries that
were not anticipated. This paper describes the design, implementation, and
experience of the SODA system (Search over DAta Warehouse). SODA bridges the
gap between the business needs of analysts and the technical complexity of
current data warehouses. SODA enables a Google-like search experience for data
warehouses by taking keyword queries of business users and automatically
generating executable SQL. The key idea is to use a graph pattern matching
algorithm that uses the metadata model of the data warehouse. Our results with
real data from a global player in the financial services industry show that
SODA produces queries with high precision and recall, and makes it much easier
for business users to interactively explore highly-complex data warehouses. Comment: VLDB201
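To make the idea of turning keywords into SQL concrete, here is a deliberately simplified sketch. It does not implement SODA's graph pattern matching: a plain dictionary lookup stands in for the metadata model, and the table names, column names, join condition, and the function `keywords_to_sql` are all hypothetical.

```python
# Hypothetical metadata: maps a business term to a (table, column) pair.
METADATA = {
    "customer": ("customers", "name"),
    "city": ("customers", "city"),
    "balance": ("accounts", "balance"),
}

# Hypothetical join conditions, keyed by the pair of tables they connect.
JOINS = {
    frozenset({"customers", "accounts"}): "customers.id = accounts.customer_id",
}


def keywords_to_sql(query):
    """Translate recognized keywords into a SELECT over the mapped columns."""
    columns, tables = [], []
    for word in query.lower().split():
        if word in METADATA:
            table, column = METADATA[word]
            columns.append(f"{table}.{column}")
            if table not in tables:
                tables.append(table)
    if not columns:
        return None  # no keyword matched the metadata
    sql = f"SELECT {', '.join(columns)} FROM {', '.join(tables)}"
    # Add every known join condition whose tables are all in use.
    conditions = [cond for pair, cond in JOINS.items() if pair <= set(tables)]
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql


print(keywords_to_sql("customer balance"))
```

A real system would have to resolve ambiguous keywords and discover join paths across many tables, which is where a graph over the warehouse metadata becomes useful.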