On Learning vs. Refutation

Author: Vadhan Salil P.
Publication date: 27/11/2017

Building on the work of Daniely et al. (STOC 2014, COLT 2016), we study the connection between computationally efficient PAC learning and refutation of constraint satisfaction problems. Specifically, we prove that for every concept class P, PAC-learning P is polynomially equivalent to “random-right-hand-side-refuting” (“RRHS-refuting”) a dual class P∗, where RRHS-refutation of a class Q refers to refuting systems of equations where the constraints are (worst-case) functions from the class Q but the right-hand-sides of the equations are uniform and independent random bits. The reduction from refutation to PAC learning can be viewed as an abstraction of (part of) the work of Daniely, Linial, and Shalev-Shwartz (STOC 2014). The converse, however, is new, and is based on a combination of techniques from pseudorandomness (Yao ’82) with boosting (Schapire ’90). In addition, we show that PAC-learning the class of DNF formulas is polynomially equivalent to PAC-learning its dual class DNF∗, and thus PAC-learning DNF is equivalent to RRHS-refutation of DNF, suggesting an avenue to obtain stronger lower bounds for PAC-learning DNF than the quasipolynomial lower bound that was obtained by Daniely and Shalev-Shwartz (COLT 2016) assuming the hardness of refuting k-SAT.

Engineering and Applied Science
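The notion of a random-right-hand-side system described in the abstract above can be illustrated with a small sketch. Everything named here is an assumption made for illustration and does not come from the paper: the helper names `rrhs_system`, `is_satisfiable`, and `parity_on`, and the choice of two-variable parities as the example class Q. The brute-force check only shows what "unsatisfiable" means for such a system; an actual refuter would have to certify unsatisfiability efficiently.

```python
import itertools
import random


def rrhs_system(constraints, rng):
    """Pair each constraint function with an independent uniform random bit."""
    return [(q, rng.randrange(2)) for q in constraints]


def is_satisfiable(system, n):
    """Brute-force search over all 2^n assignments (feasible for small n only)."""
    return any(
        all(q(x) == b for q, b in system)
        for x in itertools.product((0, 1), repeat=n)
    )


def parity_on(indices):
    """Hypothetical example class Q: parity of a fixed subset of coordinates."""
    return lambda x: sum(x[i] for i in indices) % 2


# Twelve constraints chosen by us on four variables, random right-hand sides.
rng = random.Random(0)
constraints = [parity_on(rng.sample(range(4), 2)) for _ in range(12)]
system = rrhs_system(constraints, rng)
print(is_satisfiable(system, n=4))
```

With many more equations than variables, a system of this kind is typically unsatisfiable, which is the regime in which refutation is a meaningful task.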

Harvard University - DASH

Sum of squares lower bounds for refuting any CSP

Author: Alekhnovich Michael
Barak Boaz
Ben-Sasson Eli
Daniely Amit
Diaz Josep
Friedman Joel
Gableske Oliver
Goldreich Oded
Laurent Monique
Mori Ryuhei
Mossel Elchanan
O’Donnell Ryan
Publication date: 16/01/2017

Let $P:\{0,1\}^k \to \{0,1\}$ be a nontrivial $k$-ary predicate. Consider a random instance of the constraint satisfaction problem $\mathrm{CSP}(P)$ on $n$ variables with $\Delta n$ constraints, each being $P$ applied to $k$ randomly chosen literals. Provided the constraint density satisfies $\Delta \gg 1$, such an instance is unsatisfiable with high probability. The \emph{refutation} problem is to efficiently find a proof of unsatisfiability. We show that whenever the predicate $P$ supports a $t$-\emph{wise uniform} probability distribution on its satisfying assignments, the sum of squares (SOS) algorithm of degree $d = \Theta(\frac{n}{\Delta^{2/(t-1)} \log \Delta})$ (which runs in time $n^{O(d)}$) \emph{cannot} refute a random instance of $\mathrm{CSP}(P)$. In particular, the polynomial-time SOS algorithm requires $\widetilde{\Omega}(n^{(t+1)/2})$ constraints to refute random instances of $\mathrm{CSP}(P)$ when $P$ supports a $t$-wise uniform distribution on its satisfying assignments. Together with recent work of Lee et al. [LRS15], our result also implies that \emph{any} polynomial-size semidefinite programming relaxation for refutation requires at least $\widetilde{\Omega}(n^{(t+1)/2})$ constraints. Our results (which also extend with no change to CSPs over larger alphabets) subsume all previously known lower bounds for semialgebraic refutation of random CSPs. For every constraint predicate $P$, they give a three-way hardness tradeoff between the density of constraints, the SOS degree (hence running time), and the strength of the refutation. By recent algorithmic results of Allen et al. [AOW15] and Raghavendra et al. [RRS16], this full three-way tradeoff is \emph{tight}, up to lower-order factors.

Comment: 39 pages, 1 figure
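The random model in the abstract above (n variables, Δn constraints, each being P applied to k random literals) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's construction: the function names are hypothetical, literals are drawn independently and with replacement (the abstract does not say whether variables within a constraint must be distinct), and 3-XOR is used only as an example predicate.

```python
import random


def random_csp_instance(n, k, delta, rng):
    """Sample round(delta * n) constraints, each a list of k literals.

    A literal is a pair (variable_index, negated). Assumption: literals are
    drawn independently, so a variable may repeat within one constraint.
    """
    m = round(delta * n)
    return [
        [(rng.randrange(n), rng.random() < 0.5) for _ in range(k)]
        for _ in range(m)
    ]


def satisfies(predicate, instance, assignment):
    """Return True if the 0/1 assignment satisfies every constraint."""
    for constraint in instance:
        # Evaluate each literal: flip the variable's bit if it is negated.
        bits = tuple(assignment[v] ^ int(neg) for v, neg in constraint)
        if not predicate(bits):
            return False
    return True


def three_xor(bits):
    """Example predicate (our choice): true when the bits have odd parity."""
    return sum(bits) % 2 == 1


rng = random.Random(0)
instance = random_csp_instance(n=20, k=3, delta=5, rng=rng)
assignment = [rng.randrange(2) for _ in range(20)]
print(len(instance), satisfies(three_xor, instance, assignment))
```

Here `delta` plays the role of the constraint density Δ, so the instance has Δn = 100 constraints. Checking one assignment is easy; the refutation problem asks for an efficient proof that no assignment works.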

arXiv.org e-Print Archive

Crossref

Argumentation Mining in User-Generated Web Discourse

Author: Gurevych Iryna
Habernal Ivan
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2015

The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source codes, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task.

Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17

arXiv.org e-Print Archive

TUbiblio

Crossref

Directory of Open Access Journals

TUdatalib Repository (TU Darmstadt)

Distilling Information Reliability and Source Trustworthiness from Digital Traces

Author: Aalen O.
Daneshmand H.
De A.
Diamond S.
Du N.
Farajtabar M.
Gomez-Rodriguez M.
Gyöngyi Z.
Hunter D.
Liu X.
Wu M.
Zhao B.
Zhou K.
Řehůřek R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

Online knowledge repositories typically rely on their users or dedicated editors to evaluate the reliability of their content. These evaluations can be viewed as noisy measurements of both information reliability and information source trustworthiness. Can we leverage these noisy evaluations, often biased, to distill a robust, unbiased and interpretable measure of both notions? In this paper, we argue that the temporal traces left by these noisy evaluations give cues on the reliability of the information and the trustworthiness of the sources. Then, we propose a temporal point process modeling framework that links these temporal traces to robust, unbiased and interpretable notions of information reliability and source trustworthiness. Furthermore, we develop an efficient convex optimization procedure to learn the parameters of the model from historical traces. Experiments on real-world data gathered from Wikipedia and Stack Overflow show that our modeling framework accurately predicts evaluation events, provides an interpretable measure of information reliability and source trustworthiness, and yields interesting insights about real-world events.

Comment: Accepted at 26th World Wide Web conference (WWW-17)

arXiv.org e-Print Archive

Crossref

CISPA – Helmholtz-Zentrum für Informationssicherheit