Teaching Logic for Computer Science: Are We Teaching the Wrong Narrative?
In this paper I discuss what, in my long experience, every computer scientist should know from logic. We concentrate on issues of modeling, interpretability and levels of abstraction. We discuss what the minimal toolbox of logic tools should look like for a computer scientist who is involved in designing and analyzing reliable systems. We conclude that many classical topics dear to logicians are less important than usually presented, and that lesser-known ideas from logic may be more useful for the working computer scientist.

Comment: Proceedings of the Fourth International Conference on Tools for Teaching Logic (TTL2015), Rennes, France, June 9-12, 2015. Editors: M. Antonia Huertas, João Marcos, María Manzano, Sophie Pinchinat, François Schwarzentruber
On the minimal teaching sets of two-dimensional threshold functions
It is known that a minimal teaching set of any threshold function on the two-dimensional rectangular grid consists of 3 or 4 points. We derive exact formulae for the numbers of functions corresponding to these values and further refine them in the case of a minimal teaching set of size 3. We also prove that the average cardinality of the minimal teaching sets of threshold functions is asymptotically 7/2.

We further present corollaries of these results concerning some special arrangements of lines in the plane.

Comment: 11 pages, 4 figures
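To make the notion of a teaching set concrete, here is a brute-force sketch with made-up parameters. For finiteness it assumes that small integer weights realize every threshold function on a 3x3 grid; that enumeration bound is an assumption of the sketch, not a claim from the paper.

```python
from itertools import combinations

GRID = [(x, y) for x in range(3) for y in range(3)]

def labeling(a, b, c):
    # f(x, y) = 1 iff a*x + b*y >= c
    return tuple(int(a * x + b * y >= c) for (x, y) in GRID)

# Finite stand-in for the class of threshold functions on the grid;
# small integer coefficients are assumed to suffice on a 3x3 grid.
CLASS = {labeling(a, b, c)
         for a in range(-4, 5)
         for b in range(-4, 5)
         for c in range(-12, 13)}

def minimal_teaching_set(target):
    # Smallest set of labeled grid points consistent with the target alone.
    for k in range(1, len(GRID) + 1):
        for idx in combinations(range(len(GRID)), k):
            consistent = [f for f in CLASS
                          if all(f[i] == target[i] for i in idx)]
            if consistent == [target]:
                return [(GRID[i], target[i]) for i in idx]

target = labeling(1, 1, 2)           # example target: x + y >= 2
print(minimal_teaching_set(target))  # expect 3 or 4 labeled points
```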
Specifying a positive threshold function via extremal points
An extremal point of a positive threshold Boolean function f is either a maximal zero or a minimal one. It is known that if f depends on all its variables, then the set of its extremal points completely specifies f within the universe of threshold functions. However, in some cases, f can be specified by a smaller set. The minimum number of points in such a set is the specification number of f. It was shown in [S.-T. Hu. Threshold Logic, 1965] that the specification number of a threshold function of n variables is at least n+1. In [M. Anthony, G. Brightwell, and J. Shawe-Taylor. On specifying Boolean functions by labelled examples. Discrete Applied Mathematics, 1995] it was proved that this bound is attained for nested functions and conjectured that for all other threshold functions the specification number is strictly greater than n+1. In the present paper, we resolve this conjecture negatively by exhibiting threshold Boolean functions of n variables which are non-nested and for which the specification number is n+1. On the other hand, we show that the set of extremal points satisfies the statement of the conjecture, i.e., a positive threshold Boolean function depending on all its variables has n+1 extremal points if and only if it is nested. To prove this, we reveal an underlying structure of the set of extremal points.
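A minimal sketch of the definition, assuming an illustrative weight vector (not taken from the paper): it enumerates the maximal zeros and minimal ones of a positive threshold function by brute force over the Boolean cube.

```python
from itertools import product

w, t = (2, 1, 1), 3            # nonnegative weights => positive function
cube = list(product((0, 1), repeat=len(w)))
f = {x: int(sum(wi * xi for wi, xi in zip(w, x)) >= t) for x in cube}

def below(x, y):               # x <= y componentwise, x != y
    return x != y and all(a <= b for a, b in zip(x, y))

maximal_zeros = [x for x in cube if f[x] == 0
                 and not any(f[y] == 0 and below(x, y) for y in cube)]
minimal_ones = [x for x in cube if f[x] == 1
                and not any(f[y] == 1 and below(y, x) for y in cube)]
print("maximal zeros:", maximal_zeros)
print("minimal ones:", minimal_ones)
```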
Teaching and compressing for low VC-dimension
In this work we study the quantitative relation between VC-dimension and two other basic parameters related to learning and teaching, namely the quality of sample compression schemes and of teaching sets for classes of low VC-dimension. Let C be a binary concept class of size m and VC-dimension d. Prior to this work, the best known upper bounds for both parameters were log(m), while the best lower bounds are linear in d. We present significantly better upper bounds on both as follows. Set k = O(d 2^d log log m). We show that there always exists a concept c in C with a teaching set (i.e. a list of c-labeled examples uniquely identifying c in C) of size k. This problem was studied by Kuhlmann (1999). Our construction implies that the recursive teaching (RT) dimension of C is at most k as well. The RT-dimension was suggested by Zilles et al. and Doliwa et al. (2010). The same notion (under the name partial-ID width) was independently studied by Wigderson and Yehudayoff (2013). An upper bound on this parameter that depends only on d is known just for the very simple case d = 1, and is open even for d = 2. We also make small progress towards this seemingly modest goal.

We further construct sample compression schemes of size k for C, with additional information of k log(k) bits. Roughly speaking, given any list of C-labelled examples of arbitrary length, we can retain only k labeled examples in a way that allows us to recover the labels of all other examples in the list, using additional k log(k) information bits. This problem was first suggested by Littlestone and Warmuth (1986).

Comment: The final version is due to be published in the collection of papers "A Journey through Discrete Mathematics. A Tribute to Jiri Matousek" edited by Martin Loebl, Jaroslav Nesetril and Robin Thomas, to be published by Springer
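For intuition, here is a toy sketch that brute-forces both quantities for a tiny finite class: the VC-dimension, and the smallest teaching set over all concepts (the existence result above concerns exactly this best-case quantity). The concept class is an arbitrary made-up example.

```python
from itertools import combinations

DOMAIN = range(5)
C = [frozenset(s) for s in [(0,), (0, 1), (1, 2), (2, 3, 4), (0, 3), ()]]

def shattered(S):
    # S is shattered iff the class realizes all 2^|S| labelings on S.
    patterns = {c & frozenset(S) for c in C}
    return len(patterns) == 2 ** len(S)

vc = max((len(S) for k in range(len(DOMAIN) + 1)
          for S in combinations(DOMAIN, k) if shattered(S)), default=0)

def teaching_set_size(c):
    # Smallest S whose labels under c rule out every other concept in C.
    for k in range(len(DOMAIN) + 1):
        for S in combinations(DOMAIN, k):
            if all(any((x in c) != (x in d) for x in S) for d in C if d != c):
                return k

print("VC-dimension:", vc)
print("best-case teaching set size:", min(teaching_set_size(c) for c in C))
```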
The Teaching Dimension of Linear Learners
Teaching dimension is a learning theoretic quantity that specifies the
minimum training set size to teach a target model to a learner. Previous
studies on teaching dimension focused on version-space learners which maintain
all hypotheses consistent with the training data, and cannot be applied to
modern machine learners which select a specific hypothesis via optimization.
This paper presents the first known teaching dimension for ridge regression,
support vector machines, and logistic regression. We also exhibit optimal
training sets that match these teaching dimensions. Our approach generalizes to
other linear learners.
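As a hedged one-dimensional illustration of the idea (not the paper's construction, and with made-up numbers): a ridge-regression learner has a closed-form solution, so one carefully scaled example suffices to teach it any target weight.

```python
# The learner returns w = (x . y) / (x . x + lam); conventions for the
# regularizer vary, so treat this form as an assumption of the sketch.
import numpy as np

lam, w_star = 0.5, 3.0
x = np.array([1.0])
y = np.array([w_star * (x @ x + lam) / x[0]])   # solve for the teaching label

w_hat = (x @ y) / (x @ x + lam)                 # ridge solution in 1-D
assert np.isclose(w_hat, w_star)
print("learner output:", w_hat)
```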
Target Curricula via Selection of Minimum Feature Sets: a Case Study in Boolean Networks
We consider the effect of introducing a curriculum of targets when training
Boolean models on supervised Multi Label Classification (MLC) problems. In
particular, we consider how to order targets in the absence of prior knowledge,
and how such a curriculum may be enforced when using meta-heuristics to train
discrete non-linear models.
We show that hierarchical dependencies between targets can be exploited by enforcing an appropriate curriculum using hierarchical loss functions. On several multi-output circuit-inference problems with known target difficulties, Feedforward Boolean Networks (FBNs) trained with such a loss function achieve significantly lower out-of-sample error, markedly so in some cases. This improvement increases as the loss places more emphasis on target order and is strongly correlated with an easy-to-hard curriculum. We also demonstrate the same improvements on three real-world models and two Gene Regulatory Network (GRN) inference problems.
We posit a simple a priori method for identifying an appropriate target order and estimating the strength of target relationships in Boolean MLCs. This method uses intrinsic dimension as a proxy for target difficulty, estimated using optimal solutions to a combinatorial optimisation problem known as the Minimum-Feature-Set (minFS) problem. We also demonstrate that the same generalisation gains can be achieved without providing any knowledge of target difficulty.

Comment: Accepted for publication in JMLR issue 1
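The paper uses exact minFS solutions; the sketch below swaps in a greedy set-cover heuristic, with made-up toy data, just to show what the problem asks for: the smallest feature subset that distinguishes every pair of differently labelled examples.

```python
# Assumes the labels are a function of the features, so some feature
# always separates each remaining conflicting pair.
X = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
y = [0, 1, 0, 1]

pairs = [(i, j) for i in range(len(X)) for j in range(i + 1, len(X))
         if y[i] != y[j]]
chosen = set()
while pairs:
    # Pick the feature separating the most still-unseparated pairs.
    best = max(range(len(X[0])),
               key=lambda f: sum(X[i][f] != X[j][f] for i, j in pairs))
    chosen.add(best)
    pairs = [(i, j) for i, j in pairs if X[i][best] == X[j][best]]
print("approximate minimum feature set:", sorted(chosen))
```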
When are epsilon-nets small?
In many interesting situations the size of epsilon-nets depends only on epsilon together with different complexity measures. The aim of this paper is to give a systematic treatment of such complexity measures arising in Discrete and Computational Geometry and Statistical Learning, and to bridge the gap between the results appearing in these two fields. As a byproduct, we obtain several new upper bounds on the sizes of epsilon-nets that generalize/improve the best known general guarantees. In particular, our results work with regimes when small epsilon-nets of size o(1/epsilon) exist, which are not usually covered by standard upper bounds. Inspired by results in Statistical Learning we also give a short proof of Haussler's upper bound on packing numbers.

Comment: 22 pages; minor changes, accepted version
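As a concrete toy instance of the definition: for the range space of intervals on [0, 1] under the uniform measure, the regular grid of spacing epsilon is an epsilon-net of size 1/epsilon, since every interval of measure at least epsilon contains a grid point. The check below is illustrative and not a construction from the paper.

```python
import random

eps = 0.1
net = [k * eps for k in range(1, int(1 / eps) + 1)]

def hits(net, a, b):                 # does the net stab interval [a, b]?
    return any(a <= p <= b for p in net)

random.seed(0)
for _ in range(10_000):
    a = random.uniform(0, 1 - eps)
    assert hits(net, a, a + eps)     # every heavy interval is stabbed
print("grid of size", len(net), "is an", eps, "net for intervals")
```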
Collaborative and Privacy-Preserving Machine Teaching via Consensus Optimization
In this work, we define a collaborative and privacy-preserving machine
teaching paradigm with multiple distributed teachers. We focus on consensus
super teaching: organizing distributed teachers to jointly select a compact yet informative training subset from the data they host, so that a learner learns better. The challenges arise from three perspectives.
First, the state-of-the-art pool-based super teaching method applies
mixed-integer non-linear programming (MINLP) which does not scale well to very
large data sets. Second, it is desirable to restrict data access of the
teachers to only their own data during the collaboration stage to mitigate
privacy leaks. Finally, the teaching collaboration should be
communication-efficient since large communication overheads can cause
synchronization delays between teachers.
To address these challenges, we formulate collaborative teaching as a
consensus and privacy-preserving optimization process to minimize teaching
risk. We theoretically demonstrate the necessity of collaboration between
teachers for improving the learner's learning. Furthermore, we show that the
proposed method enjoys a property similar to the Oracle property of the adaptive Lasso. The empirical study illustrates that our teaching method delivers significantly more accurate teaching results with high speed, while the non-collaborative MINLP-based super teaching becomes prohibitively expensive to compute.
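A minimal consensus-ADMM sketch, assuming quadratic local losses: it shows only the consensus mechanism by which distributed parties agree on one parameter vector without exchanging raw data. The paper's teaching-risk objective and subset selection are more involved and are not reproduced here; all data below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d, rho = 3, 2, 1.0
A = [rng.normal(size=(8, d)) for _ in range(K)]
b = [A[i] @ np.array([1.0, -2.0]) + 0.01 * rng.normal(size=8)
     for i in range(K)]

z = np.zeros(d)
u = [np.zeros(d) for _ in range(K)]
for _ in range(100):
    # Each "teacher" solves its local ridge-like subproblem privately.
    w = [np.linalg.solve(A[i].T @ A[i] + rho * np.eye(d),
                         A[i].T @ b[i] + rho * (z - u[i])) for i in range(K)]
    z = np.mean([w[i] + u[i] for i in range(K)], axis=0)   # consensus step
    u = [u[i] + w[i] - z for i in range(K)]                # dual update
print("consensus estimate:", z)   # approximately [1, -2]
```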
On the number of irreducible points in polyhedra
An integer point in a polyhedron is called irreducible iff it is not the midpoint of two other integer points in the polyhedron. We prove that the number of irreducible integer points in an n-dimensional polytope of radius r given by a system of linear inequalities is at most polylogarithmic in r if n is fixed. Using this result we prove the hypothesis asserting that the teaching dimension of the class of threshold functions of k-valued logic in n variables is O(log^(n-2) k) for any fixed n.

Comment: 24 pages, 4 figures
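A brute-force sketch of the definition on a toy 2-D polytope (the polytope is an arbitrary choice, not from the paper): enumerate the integer points, then discard those expressible as the midpoint of two other integer points.

```python
from itertools import product

R = 6                                              # enumeration radius
ineqs = [((1, 1), 6), ((-1, 0), 0), ((0, -1), 0)]  # x+y<=6, x>=0, y>=0

pts = [p for p in product(range(-R, R + 1), repeat=2)
       if all(a * p[0] + b * p[1] <= c for (a, b), c in ineqs)]
P = set(pts)

def is_midpoint(p):
    # p = (q + r) / 2 for some other integer point q in P, with r = 2p - q
    return any(q != p and (2 * p[0] - q[0], 2 * p[1] - q[1]) in P for q in P)

irreducible = [p for p in pts if not is_midpoint(p)]
print(irreducible)   # expect just the three vertices of the triangle
```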
Finite Biased Teaching with Infinite Concept Classes
We investigate the teaching of infinite concept classes through the effect of
the learning bias (which is used by the learner to prefer some concepts over
others and by the teacher to devise the teaching examples) and the sampling
bias (which determines how the concepts are sampled from the class). We analyse
two important classes: Turing machines and finite-state machines. We derive
bounds for the biased teaching dimension when the learning bias is derived from
a complexity measure (Kolmogorov complexity and minimal number of states
respectively) and analyse the sampling distributions that lead to finite
expected biased teaching dimensions. We highlight the existing trade-off
between the bound and the representativeness of the sample, and its
implications for the understanding of what teaching rich concepts to machines
entails.
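A sketch of teaching under a learning bias, with set size as a toy stand-in for the complexity measures above (Kolmogorov complexity, number of states): the learner outputs the first concept, in a fixed complexity order, consistent with the examples, so the teacher only has to rule out concepts the bias prefers over the target. The concept class is a made-up finite example.

```python
from itertools import combinations

DOMAIN = range(4)
CLASS = sorted((frozenset(c) for k in range(5)
                for c in combinations(DOMAIN, k)),
               key=lambda c: (len(c), sorted(c)))

def biased_td(target):
    earlier = CLASS[:CLASS.index(target)]   # concepts the bias prefers
    for k in range(len(DOMAIN) + 1):
        for S in combinations(DOMAIN, k):
            # Every preferred concept must disagree with the target on S.
            if all(any((x in target) != (x in d) for x in S)
                   for d in earlier):
                return k

for c in CLASS[:5]:
    print(sorted(c), "-> biased teaching dimension", biased_td(c))
```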