On Learning vs. Refutation
Building on the work of Daniely et al. (STOC 2014, COLT 2016), we study the connection between computationally efficient PAC learning and refutation of constraint satisfaction problems. Specifically, we prove that for every concept class P, PAC-learning P is polynomially equivalent to "random-right-hand-side-refuting" ("RRHS-refuting") a dual class P∗, where RRHS-refutation of a class Q refers to refuting systems of equations where the constraints are (worst-case) functions from the class Q but the right-hand-sides of the equations are uniform and independent random bits. The reduction from refutation to PAC learning can be viewed as an abstraction of (part of) the work of Daniely, Linial, and Shalev-Shwartz (STOC 2014). The converse, however, is new, and is based on a combination of techniques from pseudorandomness (Yao '82) with boosting (Schapire '90). In addition, we show that PAC-learning the class of DNF formulas is polynomially equivalent to PAC-learning its dual class DNF∗, and thus PAC-learning DNF is equivalent to RRHS-refutation of DNF, suggesting an avenue to obtain stronger lower bounds for PAC-learning DNF than the quasipolynomial lower bound that was obtained by Daniely and Shalev-Shwartz (COLT 2016) assuming the hardness of refuting k-SAT.
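To make the notion of an RRHS-refutation instance concrete, here is a minimal sketch: pair constraints drawn from a class Q with uniform and independent random right-hand-side bits, then (for tiny instances) brute-force whether any assignment satisfies the whole system. All names here are illustrative, and for simplicity the sketch samples the left-hand sides randomly too, whereas in the definition above they are worst-case.

```python
import itertools
import random

def rrhs_instance(constraints, m, rng):
    """Pair m constraints from the class Q with uniform random
    right-hand-side bits, as in an RRHS-refutation instance.
    (For the sketch the constraints are sampled; the definition
    allows them to be worst-case.)"""
    return [(rng.choice(constraints), rng.randrange(2)) for _ in range(m)]

def satisfiable(instance, n):
    """Brute-force check (tiny n only): does some assignment x make
    every constraint q evaluate to its right-hand-side bit b?"""
    return any(all(q(x) == b for q, b in instance)
               for x in itertools.product((0, 1), repeat=n))

# Hypothetical concept class Q: conjunctions of two variables on n = 4 bits.
n = 4
Q = [lambda x, i=i, j=j: x[i] & x[j] for i in range(n) for j in range(i + 1, n)]

rng = random.Random(0)
inst = rrhs_instance(Q, m=12, rng=rng)
print(satisfiable(inst, n))
```

With random right-hand sides, such a system is unsatisfiable with high probability once there are enough constraints; the refuter's task is to certify this efficiently, without the brute-force search used here.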
Sum of squares lower bounds for refuting any CSP
Let $P:\{0,1\}^k \to \{0,1\}$ be a nontrivial $k$-ary predicate. Consider a
random instance of the constraint satisfaction problem $\mathrm{CSP}(P)$ on $n$
variables with $\Delta n$ constraints, each being $P$ applied to $k$ randomly
chosen literals. Provided the constraint density satisfies $\Delta \gg 1$, such
an instance is unsatisfiable with high probability. The \emph{refutation}
problem is to efficiently find a proof of unsatisfiability.
We show that whenever the predicate $P$ supports a $t$-\emph{wise uniform}
probability distribution on its satisfying assignments, the sum of squares
(SOS) algorithm of degree $d = \tilde{\Omega}(n/\Delta^{2/(t-1)})$
(which runs in time $n^{O(d)}$) \emph{cannot} refute a random instance of
$\mathrm{CSP}(P)$. In particular, the polynomial-time SOS algorithm requires
$\tilde{\Omega}(n^{(t+1)/2})$ constraints to refute random instances of
$\mathrm{CSP}(P)$ when $P$ supports a $t$-wise uniform distribution on its satisfying
assignments. Together with recent work of Lee et al. [LRS15], our result also
implies that \emph{any} polynomial-size semidefinite programming relaxation for
refutation requires at least $\tilde{\Omega}(n^{(t+1)/2})$ constraints.
Our results (which also extend with no change to CSPs over larger alphabets)
subsume all previously known lower bounds for semialgebraic refutation of
random CSPs. For every constraint predicate~$P$, they give a three-way hardness
tradeoff between the density of constraints, the SOS degree (hence running
time), and the strength of the refutation. By recent algorithmic results of
Allen et al. [AOW15] and Raghavendra et al. [RRS16], this full three-way
tradeoff is \emph{tight}, up to lower-order factors.
Comment: 39 pages, 1 figure
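The t-wise uniformity condition in the abstract above can be checked directly for small predicates: a distribution on satisfying assignments is t-wise uniform if its marginal on every subset of at most t coordinates is uniform. The sketch below tests the simplest special case, where the uniform distribution over the satisfying assignments is itself t-wise uniform (the general condition allows any supported distribution), using 3-XOR as the classic example.

```python
import itertools
from fractions import Fraction

def t_wise_uniform(satisfying, k, t):
    """Is the uniform distribution over `satisfying` (a subset of {0,1}^k)
    t-wise uniform, i.e. exactly uniform on every set of <= t coordinates?"""
    n = len(satisfying)
    for r in range(1, t + 1):
        for coords in itertools.combinations(range(k), r):
            counts = {}
            for a in satisfying:
                key = tuple(a[i] for i in coords)
                counts[key] = counts.get(key, 0) + 1
            # Every pattern on these r coordinates must appear with
            # probability exactly 1/2^r.
            if len(counts) != 2 ** r:
                return False
            if any(Fraction(c, n) != Fraction(1, 2 ** r) for c in counts.values()):
                return False
    return True

# 3-XOR: the assignments of odd parity satisfy x1 ^ x2 ^ x3 = 1.
xor3 = [a for a in itertools.product((0, 1), repeat=3) if sum(a) % 2 == 1]
print(t_wise_uniform(xor3, k=3, t=2))  # True: pairwise uniform
print(t_wise_uniform(xor3, k=3, t=3))  # False: only 4 of the 8 points
```

This matches the familiar picture: k-XOR supports a (k-1)-wise uniform distribution on its satisfying assignments, which is why random XOR instances are hard for low-degree SOS refutation.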
Argumentation Mining in User-Generated Web Discourse
The goal of argumentation mining, an evolving research field in computational
linguistics, is to design methods capable of analyzing people's argumentation.
In this article, we go beyond the state of the art in several ways. (i) We deal
with actual Web data and take up the challenges given by the variety of
registers, multiple domains, and unrestricted noisy user-generated Web
discourse. (ii) We bridge the gap between normative argumentation theories and
argumentation phenomena encountered in actual data by adapting an argumentation
model tested in an extensive annotation study. (iii) We create a new gold
standard corpus (90k tokens in 340 documents) and experiment with several
machine learning methods to identify argument components. We offer the data,
source codes, and annotation guidelines to the community under free licenses.
Our findings show that argumentation mining in user-generated Web discourse is
a feasible but challenging task.
Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
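Identifying argument components, as in the experiments described above, is commonly framed as token-level BIO sequence labeling. A minimal sketch of that framing, with illustrative labels (claim, premise) rather than the paper's exact annotation scheme:

```python
def spans_to_bio(tokens, spans):
    """Convert labeled argument-component spans (start, end, label) into
    token-level BIO tags. Span indices are token offsets, end exclusive.
    The label set here is illustrative, not the paper's exact scheme."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = "B-" + label               # first token of the span
        for i in range(start + 1, end):
            tags[i] = "I-" + label               # continuation tokens
    return tags

tokens = "Smoking should be banned because it harms others".split()
spans = [(0, 4, "claim"), (4, 8, "premise")]
print(spans_to_bio(tokens, spans))
# → ['B-claim', 'I-claim', 'I-claim', 'I-claim',
#    'B-premise', 'I-premise', 'I-premise', 'I-premise']
```

A sequence classifier trained on such tags can then recover component spans from raw user-generated text, with the "O" tag absorbing the non-argumentative material that dominates noisy Web discourse.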
Distilling Information Reliability and Source Trustworthiness from Digital Traces
Online knowledge repositories typically rely on their users or dedicated
editors to evaluate the reliability of their content. These evaluations can be
viewed as noisy measurements of both information reliability and information
source trustworthiness. Can we leverage these noisy evaluations, often biased,
to distill a robust, unbiased and interpretable measure of both notions?
In this paper, we argue that the temporal traces left by these noisy
evaluations give cues on the reliability of the information and the
trustworthiness of the sources. Then, we propose a temporal point process
modeling framework that links these temporal traces to robust, unbiased and
interpretable notions of information reliability and source trustworthiness.
Furthermore, we develop an efficient convex optimization procedure to learn the
parameters of the model from historical traces. Experiments on real-world data
gathered from Wikipedia and Stack Overflow show that our modeling framework
accurately predicts evaluation events, provides an interpretable measure of
information reliability and source trustworthiness, and yields interesting
insights about real-world events.
Comment: Accepted at 26th World Wide Web Conference (WWW-17)
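To illustrate what a temporal point process over evaluation traces looks like, here is a generic Hawkes-style conditional intensity: a baseline rate plus an exponentially decaying bump for each past evaluation event. This is a minimal sketch of the modeling ingredient, not the paper's actual model, and all parameter names and values are illustrative.

```python
import math

def intensity(t, events, mu=0.1, alpha=0.5, omega=1.0):
    """Hawkes-style conditional intensity at time t: baseline rate `mu`
    plus a decaying excitation alpha * exp(-omega * (t - t_i)) for each
    past evaluation event at time t_i < t."""
    return mu + sum(alpha * math.exp(-omega * (t - ti))
                    for ti in events if ti < t)

events = [1.0, 2.5, 2.7]   # hypothetical timestamps of evaluation events
print(intensity(3.0, events))   # elevated: several recent events
print(intensity(10.0, events))  # near baseline: excitation has decayed
```

In a framework of this kind, the learned parameters (here mu, alpha, omega) are what get tied to interpretable quantities such as source trustworthiness, and fitting them by maximizing the point-process likelihood is a convex problem for this parameterization.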