Finding co-solvers on Twitter, with a little help from Linked Data
In this paper we propose a method for suggesting potential collaborators for solving innovation challenges online, based on their competence, similarity of interests and social proximity with the user. We rely on Linked Data to derive a measure of semantic relatedness that we use to enrich both user profiles and innovation problems with additional relevant topics, thereby improving the performance of co-solver recommendation. We evaluate this approach against state-of-the-art methods for query enrichment based on the distribution of topics in user profiles, and demonstrate its usefulness in recommending collaborators that are both complementary in competence and compatible with the user. Our experiments are grounded in data from the social networking service Twitter.com.
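As a rough illustration of the kind of recommendation the abstract describes, a co-solver score could combine the three named signals; the weights, the Jaccard stand-in for semantic relatedness, and all function names below are hypothetical, not the paper's actual model:

```python
# Hypothetical sketch: rank candidate co-solvers by a weighted mix of
# competence on the problem's topics, interest similarity with the user,
# and social proximity. Weights and helpers are illustrative only.

def jaccard(a, b):
    """Topic-set overlap (a crude stand-in for semantic relatedness)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def co_solver_score(problem_topics, cand_topics, user_topics, social_prox,
                    w_comp=0.5, w_sim=0.3, w_soc=0.2):
    competence = jaccard(problem_topics, cand_topics)
    similarity = jaccard(user_topics, cand_topics)
    return w_comp * competence + w_sim * similarity + w_soc * social_prox

score = co_solver_score({"semantic web", "rdf"}, {"rdf", "sparql"},
                        {"semantic web"}, social_prox=0.4)
```

Candidates would then simply be ranked by this score, with the Linked Data enrichment step expanding the topic sets before scoring.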
Word-Entity Duet Representations for Document Ranking
This paper presents a word-entity duet framework for utilizing knowledge
bases in ad-hoc retrieval. In this work, the query and documents are modeled by
word-based representations and entity-based representations. Ranking features
are generated by the interactions between the two representations,
incorporating information from the word space, the entity space, and the
cross-space connections through the knowledge graph. To handle the
uncertainties from the automatically constructed entity representations, an
attention-based ranking model AttR-Duet is developed. With back-propagation
from ranking labels, the model learns simultaneously how to demote noisy
entities and how to rank documents with the word-entity duet. Evaluation
results on the TREC Web Track ad-hoc task demonstrate that all of the
four-way interactions in the duet are useful, that the attention mechanism
successfully steers the model away from noisy entities, and that together
they significantly outperform both word-based and entity-based
learning-to-rank systems.
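The four-way interactions can be pictured with a toy feature extractor: query and document each get a word-based and an entity-based representation, and one feature is produced per interaction direction. Plain overlap counts stand in for the paper's actual ranking features, so this is only a sketch of the structure, not of AttR-Duet itself:

```python
# Illustrative sketch of the word-entity duet's four interaction
# directions. Real features would use retrieval scores and knowledge-graph
# signals; overlap counts here just show where each feature comes from.

def overlap(q_items, d_items):
    return sum(1 for t in q_items if t in d_items)

def duet_features(q_words, q_entities, d_words, d_entities):
    return {
        "qw_dw": overlap(q_words, d_words),        # word space
        "qe_de": overlap(q_entities, d_entities),  # entity space
        "qw_de": overlap(q_words, d_entities),     # cross-space
        "qe_dw": overlap(q_entities, d_words),     # cross-space
    }

feats = duet_features(["obama", "family", "tree"], ["Barack_Obama"],
                      ["obama", "was", "born"], ["Barack_Obama", "Hawaii"])
```

A learned model then weights these features, with attention down-weighting features built on unreliable automatically linked entities.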
Combining Text and Formula Queries in Math Information Retrieval: Evaluation of Query Results Merging Strategies
Specific to math information retrieval is the combination of text with
mathematical formulae, both in documents and in queries. A rigorous
evaluation of query expansion and merging strategies combining math and
standard textual keyword terms in a query is given. It is shown that
techniques similar to those known from textual query processing may be
applied in math information retrieval as well, and lead to cutting-edge
performance. Striping and merging partial results from subqueries is one
technique that improves results as measured by information retrieval
evaluation metrics such as Bpref.
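One simple form of the striping-and-merging idea is to interleave the ranked lists returned by the text-only and formula-only subqueries round-robin, skipping duplicates. This sketch shows only that one strategy, under the assumption that each subquery returns an ordered list of document IDs; the paper evaluates several merging strategies:

```python
# Minimal round-robin "striping" merge of partial results from
# subqueries: alternate between the subquery result lists rank by rank,
# keeping the first occurrence of each document.

from itertools import zip_longest

def stripe_merge(*ranked_lists):
    merged, seen = [], set()
    for rank_slice in zip_longest(*ranked_lists):
        for doc in rank_slice:
            if doc is not None and doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

text_hits = ["d1", "d2", "d3"]       # text-keyword subquery results
formula_hits = ["d2", "d4"]          # formula subquery results
merged = stripe_merge(text_hits, formula_hits)
```

Score-based merging (summing or maxing per-document subquery scores) is the usual alternative to this rank-based interleaving.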
Mining Entity Synonyms with Efficient Neural Set Generation
Mining entity synonym sets (i.e., sets of terms referring to the same entity)
is an important task for many entity-leveraging applications. Previous work
either ranks terms based on their similarity to a given query term or treats
the problem as a two-phase task (i.e., detecting synonymy pairs, followed by
organizing these pairs into synonym sets). However, these approaches fail to
model the holistic semantics of a set and suffer from the error propagation
issue. Here we propose a new framework, named SynSetMine, that efficiently
generates entity synonym sets from a given vocabulary, using example sets from
external knowledge bases as distant supervision. SynSetMine consists of two
novel modules: (1) a set-instance classifier that jointly learns how to
represent a permutation invariant synonym set and whether to include a new
instance (i.e., a term) into the set, and (2) a set generation algorithm that
enumerates the vocabulary only once and applies the learned set-instance
classifier to detect all entity synonym sets in it. Experiments on three real
datasets from different domains demonstrate both the effectiveness and the
efficiency of SynSetMine for mining entity synonym sets.
Comment: AAAI 2019 camera-ready version.
Lazy Model Expansion: Interleaving Grounding with Search
Finding satisfying assignments for the variables involved in a set of
constraints can be cast as a (bounded) model generation problem: search for
(bounded) models of a theory in some logic. The state-of-the-art approach for
bounded model generation for rich knowledge representation languages, like ASP,
FO(.) and Zinc, is ground-and-solve: reduce the theory to a ground or
propositional one and apply a search algorithm to the resulting theory.
An important bottleneck is the blowup of the size of the theory caused by the
reduction phase. Lazily grounding the theory during search is a way to overcome
this bottleneck. We present a theoretical framework and an implementation in
the context of the FO(.) knowledge representation language. Instead of
grounding all parts of a theory, justifications are derived for some parts of
it. Given a partial assignment for the grounded part of the theory and valid
justifications for the formulas of the non-grounded part, the justifications
provide a recipe to construct a complete assignment that satisfies the
non-grounded part. When a justification for a particular formula becomes
invalid during search, a new one is derived; if that fails, the formula is
split into a part to be grounded and a part that can be justified.
The theoretical framework captures existing approaches for tackling the
grounding bottleneck such as lazy clause generation and grounding-on-the-fly,
and presents a generalization of the 2-watched literal scheme. We present an
algorithm for lazy model expansion and integrate it in a model generator for
FO(ID), a language extending first-order logic with inductive definitions. The
algorithm is implemented as part of the state-of-the-art FO(ID) Knowledge-Base
System IDP. Experimental results illustrate the power and generality of the
approach.
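The grounding blowup this work targets can be seen even in a toy propositional setting. The sketch below only contrasts eager and lazy instantiation of a single rule over a large domain; it glosses over justifications and the FO(.) machinery entirely:

```python
# Toy contrast between ground-and-solve and lazy grounding for a single
# rule p(x) -> q(x) over a large domain (illustrative only).

DOMAIN = range(10_000)

def eager_ground():
    # ground-and-solve instantiates every clause up front,
    # regardless of whether it can ever fire
    return [(("p", x), ("q", x)) for x in DOMAIN]

def lazy_ground(true_p):
    # lazily instantiate a clause only when its body atom p(x) becomes
    # true during search; while p(x) is false, the clause is trivially
    # satisfied (it keeps a valid justification) and stays ungrounded
    return [(("p", x), ("q", x)) for x in true_p]
```

In the real framework, "p(x) became true" generalizes to "the formula's justification became invalid", at which point a new justification is sought before any grounding happens.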
The Partial Evaluation Approach to Information Personalization
Information personalization refers to the automatic adjustment of information
content, structure, and presentation tailored to an individual user. By
reducing information overload and customizing information access,
personalization systems have emerged as an important segment of the Internet
economy. This paper presents a systematic modeling methodology - PIPE
(`Personalization is Partial Evaluation') - for personalization.
Personalization systems are designed and implemented in PIPE by modeling an
information-seeking interaction in a programmatic representation. The
representation supports the description of information-seeking activities as
partial information and their subsequent realization by partial evaluation, a
technique for specializing programs. We describe the modeling methodology at a
conceptual level and outline representational choices. We present two
application case studies that use PIPE for personalizing web sites and describe
how PIPE suggests a novel evaluation criterion for information system designs.
Finally, we mention several fundamental implications of adopting the PIPE model
for personalization and when it is (and is not) applicable.
Comment: Comprehensive overview of the PIPE model for personalization.
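The "personalization is partial evaluation" idea can be glimpsed with a minimal sketch: model an information-seeking interaction as a program over user choices, then specialize it by fixing the choices already known. `functools.partial` is a crude stand-in here; a real partial evaluator would also simplify the residual program, and the three-question site is entirely made up:

```python
# Minimal sketch of PIPE's core move: an interaction is a program over
# user inputs, and personalizing it means partially evaluating that
# program with the inputs that are already known.

from functools import partial

def site(region, category, price_band):
    # a programmatic model of a three-question browsing interaction
    return f"pages/{region}/{category}/{price_band}"

# This user's region is known in advance, so that question disappears
# from the residual (specialized) interaction.
us_site = partial(site, "us")
```

The residual program `us_site` is the personalized site: the same information space, reachable with one fewer interaction step.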
Generalized Second Price Auction with Probabilistic Broad Match
Generalized Second Price (GSP) auctions are widely used by search engines
today to sell their ad slots. Most search engines support broad match
between queries and bid keywords when executing GSP auctions; however, it
has been shown that the GSP auction with the standard broad-match mechanism
currently in use (denoted SBM-GSP) has several theoretical drawbacks: its
theoretical properties are known only for the single-slot case and the
full-information setting, and even in this simple setting the corresponding
worst-case social welfare can be rather bad. To address this issue, we propose
a novel broad-match mechanism, which we call the Probabilistic Broad-Match
(PBM) mechanism. Unlike SBM, which puts together the ads bidding on all
the keywords matched to a given query for the GSP auction, the GSP with PBM
(denoted as PBM-GSP) randomly samples a keyword according to a predefined
probability distribution and only runs the GSP auction for the ads bidding on
this sampled keyword. We perform a comprehensive study on the theoretical
properties of the PBM-GSP. Specifically, we study its social welfare in the
worst equilibrium, in both full-information and Bayesian settings. The results
show that PBM-GSP can generate larger welfare than SBM-GSP under mild
conditions. Furthermore, we also study the revenue guarantee for PBM-GSP in
the Bayesian setting. To the best of our knowledge, this is the first work
on broad-match mechanisms for GSP that goes beyond the single-slot case and
the full-information setting.
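The sampling step that distinguishes PBM-GSP from standard broad match can be sketched directly: rather than pooling the ads of every matching keyword into one auction, a single keyword is drawn from a predefined distribution and a per-keyword GSP auction is run on its bidders alone. Bids, quality scores, and the exact payment rule are simplified here:

```python
# Sketch of PBM-GSP's allocation step. matched maps each keyword that
# broad-matches the query to its bidders' bids; dist is the predefined
# sampling distribution over those keywords.

import random

def gsp_allocate(bids, n_slots):
    """Simplified GSP: sort by bid; each winner pays the next-highest bid."""
    ranked = sorted(bids.items(), key=lambda kv: -kv[1])
    return [(ad, ranked[i + 1][1] if i + 1 < len(ranked) else 0.0)
            for i, (ad, _) in enumerate(ranked[:n_slots])]

def pbm_gsp(matched, dist, n_slots, rng=random):
    kws = list(dist)
    kw = rng.choices(kws, weights=[dist[k] for k in kws])[0]  # sample one keyword
    return kw, gsp_allocate(matched[kw], n_slots)             # auction its bidders only

matched = {"shoes": {"a1": 2.0, "a2": 1.5}, "running shoes": {"a3": 3.0}}
kw, outcome = pbm_gsp(matched, {"shoes": 0.7, "running shoes": 0.3}, n_slots=2)
```

SBM-GSP would instead run `gsp_allocate` once over the union of all three bidders; the randomization over keywords is what the paper's welfare and revenue analysis is about.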