Cell-Probe Lower Bounds from Online Communication Complexity
In this work, we introduce an online model for communication complexity.
Analogous to how online algorithms receive their input piece-by-piece, our
model presents one of the players, Bob, his input piece-by-piece, and has the
players Alice and Bob cooperate to compute a result each time before the next
piece is revealed to Bob. This model has a closer and more natural
correspondence to dynamic data structures than classic communication models do,
and hence presents a new perspective on data structures.
We first present a tight lower bound for the online set intersection problem
in the online communication model, demonstrating a general approach for proving
online communication lower bounds. The online communication model prevents a
batching trick that classic communication complexity allows, and yields a
stronger lower bound. We then apply the online communication model to prove
data structure lower bounds for two dynamic data structure problems: the Group
Range problem and the Dynamic Connectivity problem for forests. Both of the
problems admit a worst case -time data structure. Using online
communication complexity, we prove a tight cell-probe lower bound for each:
spending (even amortized) time per operation results in at best an
probability of correctly answering a
-fraction of the queries
Balancing Speed and Quality in Online Learning to Rank for Information Retrieval
In Online Learning to Rank (OLTR) the aim is to find an optimal ranking model
by interacting with users. When learning from user behavior, systems must
interact with users while simultaneously learning from those interactions.
Unlike other Learning to Rank (LTR) settings, existing research in this field
has been limited to linear models. This is due to the speed-quality tradeoff
that arises when selecting models: complex models are more expressive and can
find the best rankings but need more user interactions to do so, a requirement
that risks frustrating users during training. Conversely, simpler models can be
optimized on fewer interactions and thus provide a better user experience, but
they will converge towards suboptimal rankings. This tradeoff creates a
deadlock, since novel models will not be able to improve either the user
experience or the final convergence point without sacrificing the other. Our
contribution is twofold. First, we introduce a fast OLTR model called Sim-MGD
that addresses the speed aspect of the speed-quality tradeoff. Sim-MGD ranks
documents based on similarities with reference documents. It converges rapidly
and, hence, gives a better user experience but it does not converge towards the
optimal rankings. Second, we contribute Cascading Multileave Gradient Descent
(C-MGD) for OLTR that directly addresses the speed-quality tradeoff by using a
cascade that enables combinations of the best of two worlds: fast learning and
high quality final convergence. C-MGD can provide the better user experience of
Sim-MGD while maintaining the same convergence as the state-of-the-art MGD
model. This opens the door for future work to design new models for OLTR
without having to deal with the speed-quality tradeoff.
Comment: CIKM 2017, Proceedings of the 2017 ACM on Conference on Information
and Knowledge Management
Thread Reconstruction in Conversational Data using Neural Coherence Models
Discussion forums are an important source of information. They are often used
to answer specific questions a user might have and to discover more about a
topic of interest. Discussions in these forums may evolve in intricate ways,
making it difficult for users to follow the flow of ideas. We propose a novel
approach for automatically identifying the underlying thread structure of a
forum discussion. Our approach is based on a neural model that computes
coherence scores of possible reconstructions and then selects the highest
scoring, i.e., the most coherent one. Preliminary experiments demonstrate
promising results outperforming a number of strong baseline methods.
Comment: Neu-IR: Workshop on Neural Information Retrieval 201
Semantic Entity Retrieval Toolkit
Unsupervised learning of low-dimensional, semantic representations of words
and entities has recently gained attention. In this paper we describe the
Semantic Entity Retrieval Toolkit (SERT) that provides implementations of our
previously published entity representation models. The toolkit provides a
unified interface to different representation learning algorithms and
fine-grained parsing configuration, and can be used transparently with GPUs. In addition,
users can easily modify existing models or implement their own models in the
framework. After model training, SERT can be used to rank entities according to
a textual query and extract the learned entity/word representation for use in
downstream algorithms, such as clustering or recommendation.
Comment: SIGIR 2017 Workshop on Neural Information Retrieval (Neu-IR'17). 201
