Matrix completion with queries
In many applications, e.g., recommender systems and traffic monitoring, the
data comes in the form of a matrix that is only partially observed and low
rank. A fundamental data-analysis task for these datasets is matrix completion,
where the goal is to accurately infer the entries missing from the matrix. Even
when the data satisfies the low-rank assumption, classical matrix-completion
methods may output completions with significant error -- in that the
reconstructed matrix differs significantly from the true underlying matrix.
Often, this is due to the fact that the information contained in the observed
entries is insufficient. In this work, we address this problem by proposing an
active version of matrix completion, where queries can be made to the true
underlying matrix. Subsequently, we design Order&Extend, which is the first
algorithm to unify a matrix-completion approach and a querying strategy into a
single algorithm. Order&Extend is able to identify and alleviate insufficient
information by judiciously querying a small number of additional entries. In an
extensive experimental evaluation on real-world datasets, we demonstrate that
our algorithm is efficient and is able to accurately reconstruct the true
matrix while asking only a small number of queries.
Comment: Proceedings of the 21st ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining
Near-optimal asymmetric binary matrix partitions
We study the asymmetric binary matrix partition problem that was recently
introduced by Alon et al. (WINE 2013) to model the impact of asymmetric
information on the revenue of the seller in take-it-or-leave-it sales.
Instances of the problem consist of a binary matrix and a probability
distribution over its columns. A partition scheme consists of a partition
for each row of the matrix. Each row's partition acts as a smoothing operator
on that row, distributing the expected value of each partition subset
proportionally to all its entries. Given a scheme that induces a smooth
matrix, the partition value is the expected maximum column entry of that
smooth matrix. The objective is to find a partition scheme such that the
resulting partition value is maximized. We present an approximation
algorithm for the case where the probability distribution is uniform and
another approximation algorithm for non-uniform distributions, significantly
improving results of Alon et al. Although our first algorithm is combinatorial
(and very simple), the analysis is based on linear programming and duality
arguments. In our second result we exploit a nice relation of the problem to
submodular welfare maximization.
Comment: 17 pages
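The partition value described in this abstract can be illustrated with a minimal sketch. It assumes one particular reading of the smoothing operator: every entry in a partition subset is replaced by the probability-weighted average of the row's entries over that subset. The function names (smooth_row, partition_value) and the toy instance are hypothetical and are not taken from the paper; the sketch only evaluates a given scheme and does not implement the approximation algorithms.

```python
from fractions import Fraction


def smooth_row(row, probs, partition):
    """Smooth one row under the assumed reading of the operator:
    each entry becomes the probability-weighted mean of its subset."""
    smoothed = [Fraction(0)] * len(row)
    for subset in partition:
        mass = sum(probs[j] for j in subset)               # probability of the subset
        expected = sum(probs[j] * row[j] for j in subset)  # expected value inside it
        for j in subset:
            smoothed[j] = expected / mass
    return smoothed


def partition_value(matrix, probs, scheme):
    """Expected maximum column entry of the smoothed matrix."""
    smooth = [smooth_row(row, probs, part) for row, part in zip(matrix, scheme)]
    columns = zip(*smooth)  # iterate over columns of the smoothed matrix
    return sum(p * max(col) for p, col in zip(probs, columns))


# Toy instance (hypothetical): two identical rows, uniform column distribution.
matrix = [[1, 0], [1, 0]]
probs = [Fraction(1, 2), Fraction(1, 2)]

# Scheme 1: every column is its own subset in both rows (no smoothing).
no_bundling = [[[0], [1]], [[0], [1]]]
# Scheme 2: the second row bundles both columns into one subset.
bundled = [[[0], [1]], [[0, 1]]]

value_plain = partition_value(matrix, probs, no_bundling)
value_bundled = partition_value(matrix, probs, bundled)
```

On this toy instance, bundling the columns of the second row raises the partition value from 1/2 to 3/4, which shows why the choice of scheme matters under the assumed definition.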
Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge
Traditional semantic parsers map language onto compositional, executable
queries in a fixed schema. This mapping allows them to effectively leverage the
information contained in large, formal knowledge bases (KBs, e.g., Freebase) to
answer questions, but it is also fundamentally limiting---these semantic
parsers can only assign meaning to language that falls within the KB's
manually-produced schema. Recently proposed methods for open vocabulary
semantic parsing overcome this limitation by learning execution models for
arbitrary language, essentially using a text corpus as a kind of knowledge
base. However, all prior approaches to open vocabulary semantic parsing replace
a formal KB with textual information, making no use of the KB in their models.
We show how to combine the disparate representations used by these two
approaches, presenting for the first time a semantic parser that (1) produces
compositional, executable representations of language, (2) can successfully
leverage the information contained in both a formal KB and a large corpus, and
(3) is not limited to the schema of the underlying KB. We demonstrate
significantly improved performance over state-of-the-art baselines on an
open-domain natural language question answering task.
Comment: Re-written abstract and intro, other minor changes throughout. This
version published at AAAI 201