A Neural Networks Committee for the Contextual Bandit Problem
This paper presents a new contextual bandit algorithm, NeuralBandit, which does not require any stationarity assumption on contexts and rewards. Several neural networks are trained to model the value of rewards given the context. Two variants, based on a multi-expert approach, are proposed to choose the parameters of the multi-layer perceptrons online. The proposed algorithms are successfully tested on a large dataset with and without stationarity of rewards.
Comment: 21st International Conference on Neural Information Processing
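To make the setup concrete, here is a minimal sketch (illustrative names, not the authors' NeuralBandit implementation): one small MLP per arm predicts that arm's reward from the context and is updated online, and actions are chosen from the committee's predictions, with epsilon-greedy exploration standing in for the multi-expert parameter selection described above.

# Illustrative sketch: a committee of small MLPs, one per arm, each trained
# online to predict its arm's reward from the context; the next action is the
# arm with the highest predicted reward, with epsilon-greedy exploration.
import numpy as np

class ArmMLP:
    def __init__(self, dim, hidden=16, lr=0.05, rng=None):
        rng = rng or np.random.default_rng(0)
        self.W1 = rng.normal(scale=0.1, size=(dim, hidden))
        self.W2 = rng.normal(scale=0.1, size=(hidden, 1))
        self.lr = lr

    def predict(self, x):
        self.h = np.tanh(x @ self.W1)           # hidden activations
        return float(self.h @ self.W2)          # predicted reward

    def update(self, x, reward):
        err = self.predict(x) - reward          # squared-loss gradient step
        grad_W2 = self.h[:, None] * err
        grad_W1 = np.outer(x, (1 - self.h**2) * self.W2[:, 0] * err)
        self.W2 -= self.lr * grad_W2
        self.W1 -= self.lr * grad_W1

def neural_bandit_step(arms, context, eps=0.1, rng=np.random):
    if rng.random() < eps:                      # explore a random arm
        return rng.randint(len(arms))
    scores = [arm.predict(context) for arm in arms]
    return int(np.argmax(scores))               # exploit the committee's best guess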
Combined optimization algorithms applied to pattern classification
Accurate classification by minimizing the error on test samples is the main goal in pattern classification. Combinatorial optimization is a well-known method for solving minimization problems; however, only a few classifiers described in the literature use combinatorial optimization for pattern classification. Recently, there has been growing interest in combining classifiers and improving the consensus of results for greater accuracy. In light of the "No Free Lunch Theorems", we analyse the combination of simulated annealing, a powerful combinatorial optimization method that produces high-quality results, with the classical perceptron algorithm. This combination is called the LSA machine. Our analysis aims at finding paradigms for problem-dependent parameter settings that ensure high classification results. Our computational experiments on a large number of benchmark problems lead to results that either outperform or are at least competitive with results published in the literature. Apart from parameter settings, our analysis
focuses on a difficult problem in computation theory, namely the network
complexity problem. The depth vs size problem of neural networks is one of
the hardest problems in theoretical computing, with very little progress over
the past decades. In order to investigate this problem, we introduce a new
recursive learning method for training hidden layers in constant depth circuits.
Our findings make contributions to a) the field of Machine Learning, as the
proposed method is applicable in training feedforward neural networks, and to
b) the field of circuit complexity by proposing an upper bound for the number
of hidden units sufficient to achieve a high classification rate. One of the major
findings of our research is that the size of the network can be bounded by
the input size of the problem, with an approximate upper bound of 8 + √(2^n / n) threshold gates being sufficient for a small error rate, where n := log|S_L| and S_L is the training set.
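A minimal sketch of the core combination, under the assumption that simulated annealing proposes random perturbations of the weights of a linear threshold unit and accepts them by the Metropolis criterion on training error (illustrative code, not the authors' LSA machine):

# Illustrative sketch: simulated annealing over perceptron-style weights,
# accepting candidate moves by the Metropolis criterion on training error.
import numpy as np

def train_error(w, X, y):
    # fraction of misclassified samples for a linear threshold unit, labels in {-1, +1}
    return np.mean(np.sign(X @ w) != y)

def anneal_perceptron(X, y, T0=1.0, cooling=0.95, steps=2000, rng=None):
    rng = rng or np.random.default_rng(0)
    w = rng.normal(size=X.shape[1])
    err, T = train_error(w, X, y), T0
    for _ in range(steps):
        candidate = w + rng.normal(scale=0.1, size=w.shape)    # random weight move
        cand_err = train_error(candidate, X, y)
        # always accept improvements; accept worse moves with Boltzmann probability
        if cand_err <= err or rng.random() < np.exp((err - cand_err) / T):
            w, err = candidate, cand_err
        T *= cooling                                           # cool the temperature
    return w, err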
On the Relationship Between Generalization Error, Hypothesis Complexity, and Sample Complexity for Radial Basis Functions
In this paper, we bound the generalization error of a class of Radial Basis Function networks, for certain well-defined function learning tasks, in terms of the number of parameters and the number of examples. We show that the total generalization error is partly due to the insufficient representational capacity of the network (because of its finite size) and partly due to insufficient information about the target function (because of the finite number of samples). We make several observations about generalization error which are valid irrespective of the approximation scheme. Our result also sheds light on ways to choose an appropriate network architecture for a particular problem.
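As a rough sketch (not the paper's exact statement; constants and confidence terms are suppressed), bounds of this kind typically decompose into the two sources of error named above, with n the number of basis functions, d the input dimension, and m the number of examples:

\mathbb{E}\big[(f_0 - \hat{f}_{n,m})^2\big]
  \;\lesssim\;
  \underbrace{O\!\left(\frac{1}{n}\right)}_{\text{approximation (finite size)}}
  \;+\;
  \underbrace{O\!\left(\sqrt{\frac{n\,d\,\ln(nm)}{m}}\right)}_{\text{estimation (finite samples)}}

The first term shrinks as the network grows, the second as more examples arrive, so choosing an architecture amounts to balancing the two terms.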
Approximation-Generalization Trade-offs under (Approximate) Group Equivariance
The explicit incorporation of task-specific inductive biases through symmetry
has emerged as a general design precept in the development of high-performance
machine learning models. For example, group equivariant neural networks have
demonstrated impressive performance across various domains and applications
such as protein and drug design. A prevalent intuition about such models is
that the integration of relevant symmetry results in enhanced generalization.
Moreover, it is posited that when the data and/or the model may only exhibit approximate or partial symmetry, the optimal or best-performing model is one where the model symmetry aligns with the data symmetry. In this paper, we conduct a formal, unified investigation of these
intuitions. To begin, we present general quantitative bounds that demonstrate
how models capturing task-specific symmetries lead to improved generalization.
In fact, our results do not require the transformations to be finite or even
form a group and can work with partial or approximate equivariance. Utilizing
this quantification, we examine the more general question of model
mis-specification, i.e., when the model symmetries do not align with the data
symmetries. We establish, for a given symmetry group, a quantitative comparison
between the approximate/partial equivariance of the model and that of the data
distribution, precisely connecting model equivariance error and data
equivariance error. Our result delineates conditions under which the model
equivariance error is optimal, thereby yielding the best-performing model for
the given task and data.
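As an illustration of how such quantities can be compared empirically, the following sketch measures a model-equivariance error and a data-equivariance error over a finite set of planar rotations; the function names, the averaging scheme, and the invariant (scalar-output) setting are assumptions made for the example, not the paper's definitions.

# Illustrative sketch: empirical model- and data-equivariance errors for a
# finite set of transformations (planar rotations acting on the input, with a
# scalar, nominally invariant output).
import numpy as np

def rotations(k=8):
    # k planar rotations, a finite stand-in for SO(2)
    thetas = np.linspace(0, 2 * np.pi, k, endpoint=False)
    return [np.array([[np.cos(t), -np.sin(t)],
                      [np.sin(t),  np.cos(t)]]) for t in thetas]

def model_equivariance_error(f, X, group):
    # average |f(g x) - f(x)| over group elements: how far the model is from invariance
    return np.mean([np.abs(f(X @ g.T) - f(X)).mean() for g in group])

def data_equivariance_error(f_target, X, y, group):
    # how far the labelling function itself is from respecting the symmetry
    return np.mean([np.abs(f_target(X @ g.T) - y).mean() for g in group])

# toy usage: a nearly rotation-invariant model on random 2D inputs
f = lambda X: np.linalg.norm(X, axis=1) + 0.05 * X[:, 0]
X = np.random.default_rng(1).normal(size=(256, 2))
print(model_equivariance_error(f, X, rotations()))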
- …