4,765 research outputs found
The Consistency dimension and distribution-dependent learning from queries
We prove a new combinatorial characterization of polynomial
learnability from equivalence queries, and state some of its
consequences relating the learnability of a class with the
learnability via equivalence and membership queries of its
subclasses obtained by restricting the instance space.
Then we propose and study two models of query learning in which there
is a probability distribution on the instance space, both as an
application of the tools developed from the combinatorial
characterization and as models of independent interest.Postprint (published version
A Survey of Quantum Learning Theory
This paper surveys quantum learning theory: the theoretical aspects of
machine learning using quantum computers. We describe the main results known
for three models of learning: exact learning from membership queries, and
Probably Approximately Correct (PAC) and agnostic learning from classical or
quantum examples.Comment: 26 pages LaTeX. v2: many small changes to improve the presentation.
This version will appear as Complexity Theory Column in SIGACT News in June
2017. v3: fixed a small ambiguity in the definition of gamma(C) and updated a
referenc
Indexability, concentration, and VC theory
Degrading performance of indexing schemes for exact similarity search in high
dimensions has long since been linked to histograms of distributions of
distances and other 1-Lipschitz functions getting concentrated. We discuss this
observation in the framework of the phenomenon of concentration of measure on
the structures of high dimension and the Vapnik-Chervonenkis theory of
statistical learning.Comment: 17 pages, final submission to J. Discrete Algorithms (an expanded,
improved and corrected version of the SISAP'2010 invited paper, this e-print,
v3
Recommended from our members
Local search: A guide for the information retrieval practitioner
There are a number of combinatorial optimisation problems in information retrieval in which the use of local search methods are worthwhile. The purpose of this paper is to show how local search can be used to solve some well known tasks in information retrieval (IR), how previous research in the field is piecemeal, bereft of a structure and methodologically flawed, and to suggest more rigorous ways of applying local search methods to solve IR problems. We provide a query based taxonomy for analysing the use of local search in IR tasks and an overview of issues such as fitness functions, statistical significance and test collections when conducting experiments on combinatorial optimisation problems. The paper gives a guide on the pitfalls and problems for IR practitioners who wish to use local search to solve their research issues, and gives practical advice on the use of such methods. The query based taxonomy is a novel structure which can be used by the IR practitioner in order to examine the use of local search in IR
Exploiting Metric Structure for Efficient Private Query Release
We consider the problem of privately answering queries defined on databases
which are collections of points belonging to some metric space. We give simple,
computationally efficient algorithms for answering distance queries defined
over an arbitrary metric. Distance queries are specified by points in the
metric space, and ask for the average distance from the query point to the
points contained in the database, according to the specified metric. Our
algorithms run efficiently in the database size and the dimension of the space,
and operate in both the online query release setting, and the offline setting
in which they must in polynomial time generate a fixed data structure which can
answer all queries of interest. This represents one of the first subclasses of
linear queries for which efficient algorithms are known for the private query
release problem, circumventing known hardness results for generic linear
queries
Lower Bounds on the Oracle Complexity of Nonsmooth Convex Optimization via Information Theory
We present an information-theoretic approach to lower bound the oracle
complexity of nonsmooth black box convex optimization, unifying previous lower
bounding techniques by identifying a combinatorial problem, namely string
guessing, as a single source of hardness. As a measure of complexity we use
distributional oracle complexity, which subsumes randomized oracle complexity
as well as worst-case oracle complexity. We obtain strong lower bounds on
distributional oracle complexity for the box , as well as for the
-ball for (for both low-scale and large-scale regimes),
matching worst-case upper bounds, and hence we close the gap between
distributional complexity, and in particular, randomized complexity, and
worst-case complexity. Furthermore, the bounds remain essentially the same for
high-probability and bounded-error oracle complexity, and even for combination
of the two, i.e., bounded-error high-probability oracle complexity. This
considerably extends the applicability of known bounds
- …