701 research outputs found

    A review of domain adaptation without target labels

    Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based, and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting, and representing features such that a source classifier performs well on the target domain, and inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research.
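    To make the sample-based idea concrete, the sketch below estimates importance weights with a domain discriminator, one common way to approximate the density ratio between target and source. It assumes scikit-learn and illustrates the general technique, not necessarily any specific method from the review.

```python
# Sketch: sample-based domain adaptation via importance weighting.
# A discriminator trained to tell source from target gives p(target | x);
# the weight w(x) = p_t(x) / p_s(x) is proportional to
# p(target | x) / p(source | x) when the two samples are equally sized.
import numpy as np
from sklearn.linear_model import LogisticRegression

def importance_weights(X_source, X_target):
    X = np.vstack([X_source, X_target])
    d = np.concatenate([np.zeros(len(X_source)), np.ones(len(X_target))])
    disc = LogisticRegression(max_iter=1000).fit(X, d)
    p = disc.predict_proba(X_source)[:, 1]        # p(target | x) on source points
    return p / np.clip(1.0 - p, 1e-6, None)       # density-ratio estimate

# A source classifier can then be trained with these weights, e.g.:
# clf = LogisticRegression().fit(X_source, y_source,
#                                sample_weight=importance_weights(X_source, X_target))
```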

    On the Generalization of the C-Bound to Structured Output Ensemble Methods

    This paper generalizes an important result from the PAC-Bayesian literature for binary classification to the case of ensemble methods for structured outputs. We prove a generic version of the C-bound, an upper bound on the risk of models expressed as a weighted majority vote that is based on the first and second statistical moments of the vote's margin. This bound may advantageously (i) be applied to more complex outputs such as multiclass and multilabel predictions, and (ii) allow for margin relaxations. These results open the way to developing new ensemble methods for structured output prediction with PAC-Bayesian guarantees.
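    For reference, the binary C-bound that the paper generalizes has the following standard form in the PAC-Bayesian literature (the paper's structured-output version extends this; the statement below is the classical one).

```latex
% Binary C-bound: let M(x, y) denote the margin of the rho-weighted majority
% vote B_rho on example (x, y), with expectations over the data distribution D.
% If E[M] > 0, the Chebyshev-Cantelli inequality yields
\[
  R_D(B_\rho) \;\le\; 1 - \frac{\left(\mathbb{E}_D[M]\right)^2}{\mathbb{E}_D\!\left[M^2\right]},
\]
% an upper bound built only from the first and second moments of the margin.
```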

    Wide stochastic networks: Gaussian limit and PAC-Bayesian training

    The limit of infinite width allows for substantial simplifications in the analytical study of overparameterized neural networks. With a suitable random initialization, an extremely large network is well approximated by a Gaussian process, both before and during training. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimizes the generalization bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.
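    To recall what "directly optimizing the generalization bound" typically means, here is a generic McAllester-style PAC-Bayesian training objective, given as context; the paper's exact bound and parametrization may differ.

```latex
% McAllester-style PAC-Bayes objective: with prior P, posterior Q over
% network parameters, n training samples, and confidence level 1 - delta,
% the bound holds simultaneously for all Q with probability at least
% 1 - delta, so it can be minimized directly over Q:
\[
  \min_{Q}\; \widehat{L}(Q) \;+\;
  \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},
\]
% where \widehat{L}(Q) is the empirical risk of the stochastic predictor Q.
```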

    Confusion-Based Online Learning and a Passive-Aggressive Scheme

    This paper provides the first analysis, to the best of our knowledge, of online learning algorithms for multiclass problems when the confusion matrix is taken as a performance measure. The work builds upon recent and elegant results on noncommutative concentration inequalities, i.e. concentration inequalities that apply to matrices and, more precisely, to matrix martingales. We establish generalization bounds for online learning algorithms and show how the theoretical study motivates a new confusion-friendly learning procedure. This algorithm, called COPA (for COnfusion Passive-Aggressive), is a passive-aggressive learning algorithm; we show that COPA's update equations can be computed analytically, so there is no need to resort to an optimization package to implement it.
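    For readers unfamiliar with the passive-aggressive template that COPA instantiates, the sketch below shows the classic binary PA-I update of Crammer et al. (2006), which likewise admits a closed-form solution. COPA's confusion-aware update differs; this is only the general pattern.

```python
# Classic binary passive-aggressive (PA-I) update: stay passive when the
# margin constraint is met, otherwise take the smallest step that fixes it.
import numpy as np

def pa_update(w, x, y, C=1.0):
    """One PA-I step for label y in {-1, +1}; returns the updated weights."""
    loss = max(0.0, 1.0 - y * np.dot(w, x))            # hinge loss
    if loss == 0.0:
        return w                                        # passive: no change
    tau = min(C, loss / max(np.dot(x, x), 1e-12))       # closed-form step size
    return w + tau * y * x                              # no solver required
```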

    Controlling Confusion via Generalisation Bounds

    We establish new generalisation bounds for multiclass classification by abstracting to a more general setting of discretised error types. Extending the PAC-Bayes theory, we are able to provide fine-grained bounds on performance for multiclass classification, as well as applications to other learning problems, including the discretisation of regression losses. Tractable training objectives are derived from the bounds. The bounds are uniform over all weightings of the discretised error types and can therefore be used to bound weightings not foreseen at training time, including the full confusion matrix in the multiclass classification case.
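    As a concrete instance of "a weighting of error types", the sketch below computes the scalar risk induced by an arbitrary cost matrix over a confusion matrix; the ordinary multiclass error is the special case where every mistake costs 1. This is illustrative code, not the paper's construction.

```python
# Sketch: a weighted confusion risk. W[i, j] is the cost of predicting
# class j when the true class is i; conf[i, j] counts such events.
import numpy as np

def weighted_confusion_risk(conf, W):
    probs = conf / conf.sum()          # joint (truth, prediction) frequencies
    return float((probs * W).sum())    # expected cost under weighting W

conf = np.array([[50, 3, 2],
                 [4, 40, 6],
                 [1, 5, 39]])
W = 1.0 - np.eye(3)                    # uniform costs recover the error rate
print(weighted_confusion_risk(conf, W))
```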

    Automated Design of Neural Network Architecture for Classification

    Get PDF
    This Ph.D. thesis deals with finding a good architecture for a neural network classifier. The focus is on methods that improve the performance of existing architectures (i.e. architectures initialised by a good academic guess) and on automatically building neural networks. An introduction to the multi-layer feed-forward neural network is given, and the most essential property of neural networks, their ability to learn from examples, is discussed. Topics like training and generalisation are treated in more detail. On the basis of this discussion, methods for finding a good network architecture are described, including early stopping, cross-validation, regularisation, pruning, and various construction algorithms (methods that successively build a network). New ideas for combining units with different types of transfer functions, such as radial basis functions and sigmoid or threshold functions, led to the development of a new construction algorithm for classification, called "GLOCAL", which is fully described. Results from experiments on real-life data from a Synthetic Aperture Radar (SAR) are provided. The thesis was written so that people from industry and graduate students interested in neural networks will hopefully find it useful. Keywords: neural networks, architectures, training, generalisation, deductive and construction algorithms.
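    Of the generalisation methods the thesis surveys, early stopping is the simplest to state in code; the sketch below is a generic illustration (the callables train_step and val_loss are placeholders, not names from the thesis).

```python
# Sketch of early stopping: halt training once validation loss has not
# improved for `patience` consecutive epochs.
def train_with_early_stopping(train_step, val_loss, patience=10, max_epochs=1000):
    best, since_best = float("inf"), 0
    for epoch in range(max_epochs):
        train_step()                     # one pass over the training data
        loss = val_loss()                # monitor held-out performance
        if loss < best:
            best, since_best = loss, 0   # improvement: reset the counter
        else:
            since_best += 1
            if since_best >= patience:   # stagnation: stop training
                break
    return best
```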