Teaching Logic for Computer Science: Are We Teaching the Wrong Narrative?
In this paper I discuss what, in my long experience, every computer scientist should know from logic. We concentrate on issues of modeling, interpretability and levels of abstraction. We discuss what the minimal toolbox of logic tools should look like for a computer scientist who is involved in designing and analyzing reliable systems. We conclude that many classical topics dear to logicians are less important than usually presented, and that lesser-known ideas from logic may be more useful for the working computer scientist.

Comment: Proceedings of the Fourth International Conference on Tools for Teaching Logic (TTL2015), Rennes, France, June 9-12, 2015. Editors: M. Antonia Huertas, João Marcos, María Manzano, Sophie Pinchinat, François Schwarzentruber
On the minimal teaching sets of two-dimensional threshold functions
It is known that a minimal teaching set of any threshold function on the two-dimensional rectangular grid consists of 3 or 4 points. We derive exact formulae for the numbers of functions corresponding to these values and further refine them in the case of a minimal teaching set of size 3. We also prove that the average cardinality of the minimal teaching sets of threshold functions is asymptotically 7/2.

We further present corollaries of these results concerning some special arrangements of lines in the plane.

Comment: 11 pages, 4 figures
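To make the notion of a teaching set concrete, here is a brute-force sketch with made-up parameters. For finiteness it assumes that small integer weights realize every threshold function on a 3x3 grid; that enumeration bound is an assumption of the sketch, not a claim from the paper.

```python
from itertools import combinations

GRID = [(x, y) for x in range(3) for y in range(3)]

def labeling(a, b, c):
    # f(x, y) = 1 iff a*x + b*y >= c
    return tuple(int(a * x + b * y >= c) for (x, y) in GRID)

# Finite stand-in for the class of threshold functions on the grid;
# small integer coefficients are assumed to suffice on a 3x3 grid.
CLASS = {labeling(a, b, c)
         for a in range(-4, 5)
         for b in range(-4, 5)
         for c in range(-12, 13)}

def minimal_teaching_set(target):
    # Smallest set of labeled grid points consistent with the target alone.
    for k in range(1, len(GRID) + 1):
        for idx in combinations(range(len(GRID)), k):
            consistent = [f for f in CLASS
                          if all(f[i] == target[i] for i in idx)]
            if consistent == [target]:
                return [(GRID[i], target[i]) for i in idx]

target = labeling(1, 1, 2)           # example target: x + y >= 2
print(minimal_teaching_set(target))  # expect 3 or 4 labeled points
```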
Specifying a positive threshold function via extremal points
An extremal point of a positive threshold Boolean function f is either a maximal zero or a minimal one. It is known that if f depends on all its variables, then the set of its extremal points completely specifies f within the universe of threshold functions. However, in some cases, f can be specified by a smaller set. The minimum number of points in such a set is the specification number of f. It was shown in [S.-T. Hu. Threshold Logic, 1965] that the specification number of a threshold function of n variables is at least n+1. In [M. Anthony, G. Brightwell, and J. Shawe-Taylor. On specifying Boolean functions by labelled examples. Discrete Applied Mathematics, 1995] it was proved that this bound is attained for nested functions and conjectured that for all other threshold functions the specification number is strictly greater than n+1. In the present paper, we resolve this conjecture negatively by exhibiting threshold Boolean functions of n variables which are non-nested and for which the specification number is n+1. On the other hand, we show that the set of extremal points satisfies the statement of the conjecture, i.e., a positive threshold Boolean function depending on all its variables has n+1 extremal points if and only if it is nested. To prove this, we reveal an underlying structure of the set of extremal points.
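A minimal sketch of the definition, assuming an illustrative weight vector (not taken from the paper): it enumerates the maximal zeros and minimal ones of a positive threshold function by brute force over the Boolean cube.

```python
from itertools import product

w, t = (2, 1, 1), 3            # nonnegative weights => positive function
cube = list(product((0, 1), repeat=len(w)))
f = {x: int(sum(wi * xi for wi, xi in zip(w, x)) >= t) for x in cube}

def below(x, y):               # x <= y componentwise, x != y
    return x != y and all(a <= b for a, b in zip(x, y))

maximal_zeros = [x for x in cube if f[x] == 0
                 and not any(f[y] == 0 and below(x, y) for y in cube)]
minimal_ones = [x for x in cube if f[x] == 1
                and not any(f[y] == 1 and below(y, x) for y in cube)]
print("maximal zeros:", maximal_zeros)
print("minimal ones:", minimal_ones)
```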
Teaching and compressing for low VC-dimension
In this work we study the quantitative relation between VC-dimension and two other basic parameters related to learning and teaching, namely the quality of sample compression schemes and of teaching sets for classes of low VC-dimension. Let C be a binary concept class of size m and VC-dimension d. Prior to this work, the best known upper bounds for both parameters were log(m), while the best lower bounds are linear in d. We present significantly better upper bounds on both as follows. Set k = O(d 2^d log log m). We show that there always exists a concept c in C with a teaching set (i.e. a list of c-labeled examples uniquely identifying c in C) of size k. This problem was studied by Kuhlmann (1999). Our construction implies that the recursive teaching (RT) dimension of C is at most k as well. The RT-dimension was suggested by Zilles et al. and Doliwa et al. (2010). The same notion (under the name partial-ID width) was independently studied by Wigderson and Yehudayoff (2013). An upper bound on this parameter that depends only on d is known just for the very simple case d = 1, and is open even for d = 2. We also make small progress towards this seemingly modest goal.

We further construct sample compression schemes of size k for C, with additional information of k log(k) bits. Roughly speaking, given any list of C-labelled examples of arbitrary length, we can retain only k labeled examples in a way that allows us to recover the labels of all other examples in the list, using additional k log(k) information bits. This problem was first suggested by Littlestone and Warmuth (1986).

Comment: The final version is due to be published in the collection of papers "A Journey through Discrete Mathematics. A Tribute to Jiri Matousek" edited by Martin Loebl, Jaroslav Nesetril and Robin Thomas, to be published by Springer
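For intuition, here is a toy sketch that brute-forces both quantities for a tiny finite class: the VC-dimension, and the smallest teaching set over all concepts (the existence result above concerns exactly this best-case quantity). The concept class is an arbitrary made-up example.

```python
from itertools import combinations

DOMAIN = range(5)
C = [frozenset(s) for s in [(0,), (0, 1), (1, 2), (2, 3, 4), (0, 3), ()]]

def shattered(S):
    # S is shattered iff the class realizes all 2^|S| labelings on S.
    patterns = {c & frozenset(S) for c in C}
    return len(patterns) == 2 ** len(S)

vc = max((len(S) for k in range(len(DOMAIN) + 1)
          for S in combinations(DOMAIN, k) if shattered(S)), default=0)

def teaching_set_size(c):
    # Smallest S whose labels under c rule out every other concept in C.
    for k in range(len(DOMAIN) + 1):
        for S in combinations(DOMAIN, k):
            if all(any((x in c) != (x in d) for x in S) for d in C if d != c):
                return k

print("VC-dimension:", vc)
print("best-case teaching set size:", min(teaching_set_size(c) for c in C))
```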
The Teaching Dimension of Linear Learners
Teaching dimension is a learning theoretic quantity that specifies the
minimum training set size to teach a target model to a learner. Previous
studies on teaching dimension focused on version-space learners which maintain
all hypotheses consistent with the training data, and cannot be applied to
modern machine learners which select a specific hypothesis via optimization.
This paper presents the first known teaching dimension for ridge regression,
support vector machines, and logistic regression. We also exhibit optimal
training sets that match these teaching dimensions. Our approach generalizes to
other linear learners.
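As a hedged one-dimensional illustration of the idea (not the paper's construction, and with made-up numbers): a ridge-regression learner has a closed-form solution, so one carefully scaled example suffices to teach it any target weight.

```python
# The learner returns w = (x . y) / (x . x + lam); conventions for the
# regularizer vary, so treat this form as an assumption of the sketch.
import numpy as np

lam, w_star = 0.5, 3.0
x = np.array([1.0])
y = np.array([w_star * (x @ x + lam) / x[0]])   # solve for the teaching label

w_hat = (x @ y) / (x @ x + lam)                 # ridge solution in 1-D
assert np.isclose(w_hat, w_star)
print("learner output:", w_hat)
```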
Target Curricula via Selection of Minimum Feature Sets: a Case Study in Boolean Networks
We consider the effect of introducing a curriculum of targets when training
Boolean models on supervised Multi Label Classification (MLC) problems. In
particular, we consider how to order targets in the absence of prior knowledge,
and how such a curriculum may be enforced when using meta-heuristics to train
discrete non-linear models.
We show that hierarchical dependencies between targets can be exploited by enforcing an appropriate curriculum using hierarchical loss functions. On several multi-output circuit-inference problems with known target difficulties, Feedforward Boolean Networks (FBNs) trained with such a loss function achieve significantly lower out-of-sample error, markedly so in some cases. This improvement increases as the loss places more emphasis on target order and is strongly correlated with an easy-to-hard curriculum. We also demonstrate the same improvements on three real-world models and two Gene Regulatory Network (GRN) inference problems.
We posit a simple a priori method for identifying an appropriate target order and estimating the strength of target relationships in Boolean MLCs. This method uses intrinsic dimension as a proxy for target difficulty, estimated using optimal solutions to a combinatorial optimisation problem known as the Minimum-Feature-Set (minFS) problem. We also demonstrate that the same generalisation gains can be achieved without providing any knowledge of target difficulty.

Comment: Accepted for publication in JMLR issue 1
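The paper uses exact minFS solutions; the sketch below swaps in a greedy set-cover heuristic, with made-up toy data, just to show what the problem asks for: the smallest feature subset that distinguishes every pair of differently labelled examples.

```python
# Assumes the labels are a function of the features, so some feature
# always separates each remaining conflicting pair.
X = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
y = [0, 1, 0, 1]

pairs = [(i, j) for i in range(len(X)) for j in range(i + 1, len(X))
         if y[i] != y[j]]
chosen = set()
while pairs:
    # Pick the feature separating the most still-unseparated pairs.
    best = max(range(len(X[0])),
               key=lambda f: sum(X[i][f] != X[j][f] for i, j in pairs))
    chosen.add(best)
    pairs = [(i, j) for i, j in pairs if X[i][best] == X[j][best]]
print("approximate minimum feature set:", sorted(chosen))
```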
When are epsilon-nets small?
In many interesting situations the size of epsilon-nets depends only on epsilon together with different complexity measures. The aim of this paper is to give a systematic treatment of such complexity measures arising in Discrete and Computational Geometry and Statistical Learning, and to bridge the gap between the results appearing in these two fields. As a byproduct, we obtain several new upper bounds on the sizes of epsilon-nets that generalize/improve the best known general guarantees. In particular, our results work with regimes when small epsilon-nets of size o(1/epsilon) exist, which are not usually covered by standard upper bounds. Inspired by results in Statistical Learning we also give a short proof of Haussler's upper bound on packing numbers.

Comment: 22 pages; minor changes, accepted version
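As a concrete toy instance of the definition: for the range space of intervals on [0, 1] under the uniform measure, the regular grid of spacing epsilon is an epsilon-net of size 1/epsilon, since every interval of measure at least epsilon contains a grid point. The check below is illustrative and not a construction from the paper.

```python
import random

eps = 0.1
net = [k * eps for k in range(1, int(1 / eps) + 1)]

def hits(net, a, b):                 # does the net stab interval [a, b]?
    return any(a <= p <= b for p in net)

random.seed(0)
for _ in range(10_000):
    a = random.uniform(0, 1 - eps)
    assert hits(net, a, a + eps)     # every heavy interval is stabbed
print("grid of size", len(net), "is an", eps, "net for intervals")
```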
Collaborative and Privacy-Preserving Machine Teaching via Consensus Optimization
In this work, we define a collaborative and privacy-preserving machine
teaching paradigm with multiple distributed teachers. We focus on consensus
super teaching: organizing distributed teachers to jointly select a compact yet informative training subset from the data they host, so that a learner learns better. The challenges arise from three perspectives.
First, the state-of-the-art pool-based super teaching method applies
mixed-integer non-linear programming (MINLP) which does not scale well to very
large data sets. Second, it is desirable to restrict data access of the
teachers to only their own data during the collaboration stage to mitigate
privacy leaks. Finally, the teaching collaboration should be
communication-efficient since large communication overheads can cause
synchronization delays between teachers.
To address these challenges, we formulate collaborative teaching as a
consensus and privacy-preserving optimization process to minimize teaching
risk. We theoretically demonstrate the necessity of collaboration between
teachers for improving the learner's learning. Furthermore, we show that the
proposed method enjoys a property similar to the Oracle property of the adaptive Lasso. The empirical study illustrates that our teaching method delivers significantly more accurate teaching results with high speed, while the non-collaborative MINLP-based super teaching becomes prohibitively expensive to compute.
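A minimal consensus-ADMM sketch, assuming quadratic local losses: it shows only the consensus mechanism by which distributed parties agree on one parameter vector without exchanging raw data. The paper's teaching-risk objective and subset selection are more involved and are not reproduced here; all data below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d, rho = 3, 2, 1.0
A = [rng.normal(size=(8, d)) for _ in range(K)]
b = [A[i] @ np.array([1.0, -2.0]) + 0.01 * rng.normal(size=8)
     for i in range(K)]

z = np.zeros(d)
u = [np.zeros(d) for _ in range(K)]
for _ in range(100):
    # Each "teacher" solves its local ridge-like subproblem privately.
    w = [np.linalg.solve(A[i].T @ A[i] + rho * np.eye(d),
                         A[i].T @ b[i] + rho * (z - u[i])) for i in range(K)]
    z = np.mean([w[i] + u[i] for i in range(K)], axis=0)   # consensus step
    u = [u[i] + w[i] - z for i in range(K)]                # dual update
print("consensus estimate:", z)   # approximately [1, -2]
```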
On the number of irreducible points in polyhedra
An integer point in a polyhedron is called irreducible iff it is not the midpoint of two other integer points in the polyhedron. We prove that the number of irreducible integer points in an n-dimensional polytope of radius r given by a system of linear inequalities is at most polylogarithmic in r if n is fixed. Using this result we prove the hypothesis asserting that the teaching dimension of the class of threshold functions of k-valued logic in n variables is O(log^(n-2) k) for any fixed n.

Comment: 24 pages, 4 figures
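A brute-force sketch of the definition on a toy 2-D polytope (the polytope is an arbitrary choice, not from the paper): enumerate the integer points, then discard those expressible as the midpoint of two other integer points.

```python
from itertools import product

R = 6                                              # enumeration radius
ineqs = [((1, 1), 6), ((-1, 0), 0), ((0, -1), 0)]  # x+y<=6, x>=0, y>=0

pts = [p for p in product(range(-R, R + 1), repeat=2)
       if all(a * p[0] + b * p[1] <= c for (a, b), c in ineqs)]
P = set(pts)

def is_midpoint(p):
    # p = (q + r) / 2 for some other integer point q in P, with r = 2p - q
    return any(q != p and (2 * p[0] - q[0], 2 * p[1] - q[1]) in P for q in P)

irreducible = [p for p in pts if not is_midpoint(p)]
print(irreducible)   # expect just the three vertices of the triangle
```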
Finite Biased Teaching with Infinite Concept Classes
We investigate the teaching of infinite concept classes through the effect of
the learning bias (which is used by the learner to prefer some concepts over
others and by the teacher to devise the teaching examples) and the sampling
bias (which determines how the concepts are sampled from the class). We analyse
two important classes: Turing machines and finite-state machines. We derive
bounds for the biased teaching dimension when the learning bias is derived from
a complexity measure (Kolmogorov complexity and minimal number of states
respectively) and analyse the sampling distributions that lead to finite
expected biased teaching dimensions. We highlight the existing trade-off
between the bound and the representativeness of the sample, and its
implications for the understanding of what teaching rich concepts to machines
entails.
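A sketch of teaching under a learning bias, with set size as a toy stand-in for the complexity measures above (Kolmogorov complexity, number of states): the learner outputs the first concept, in a fixed complexity order, consistent with the examples, so the teacher only has to rule out concepts the bias prefers over the target. The concept class is a made-up finite example.

```python
from itertools import combinations

DOMAIN = range(4)
CLASS = sorted((frozenset(c) for k in range(5)
                for c in combinations(DOMAIN, k)),
               key=lambda c: (len(c), sorted(c)))

def biased_td(target):
    earlier = CLASS[:CLASS.index(target)]   # concepts the bias prefers
    for k in range(len(DOMAIN) + 1):
        for S in combinations(DOMAIN, k):
            # Every preferred concept must disagree with the target on S.
            if all(any((x in target) != (x in d) for x in S)
                   for d in earlier):
                return k

for c in CLASS[:5]:
    print(sorted(c), "-> biased teaching dimension", biased_td(c))
```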