445 research outputs found

IST Austria Thesis

Author: Bui Thi Mai Phuong
Publication venue: IST Austria
Publication date: 01/01/2021

Deep learning is best known for its empirical success across a wide range of applications spanning computer vision, natural language processing and speech. Of equal significance, though perhaps less known, are its ramifications for learning theory: deep networks have been observed to perform surprisingly well in the high-capacity regime, aka the overfitting or underspecified regime. Classically, this regime on the far right of the bias-variance curve is associated with poor generalisation; however, recent experiments with deep networks challenge this view. This thesis is devoted to investigating various aspects of underspecification in deep learning. First, we argue that deep learning models are underspecified on two levels: a) any given training dataset can be fit by many different functions, and b) any given function can be expressed by many different parameter configurations. We refer to the second kind of underspecification as parameterisation redundancy and we precisely characterise its extent. Second, we characterise the implicit criteria (the inductive bias) that guide learning in the underspecified regime. Specifically, we consider a nonlinear but tractable classification setting, and show that given the choice, neural networks learn classifiers with a large margin. Third, we consider learning scenarios where the inductive bias is not by itself sufficient to deal with underspecification. We then study different ways of ‘tightening the specification’: i) In the setting of representation learning with variational autoencoders, we propose a hand-crafted regulariser based on mutual information. ii) In the setting of binary classification, we consider soft-label (real-valued) supervision. We derive a generalisation bound for linear networks supervised in this way and verify that soft labels facilitate fast learning. Finally, we explore an application of soft-label supervision to the training of multi-exit models.

IST Austria: PubRep (Institute of Science and Technology)

The Shallow and the Deep: A biased introduction to neural networks and old school machine learning

Author: Biehl Michael
Publication venue: Rijksuniversiteit Groningen
Publication date: 25/03/2022

The Shallow and the Deep is a collection of lecture notes that offers an accessible introduction to neural networks and machine learning in general. However, it was clear from the beginning that these notes would not be able to cover this rapidly changing and growing field in its entirety. The focus lies on classical machine learning techniques, with a bias towards classification and regression. Other learning paradigms and many recent developments in, for instance, Deep Learning are not addressed or only briefly touched upon. Biehl argues that having a solid knowledge of the foundations of the field is essential, especially for anyone who wants to explore the world of machine learning with an ambition that goes beyond the application of some software package to some data set. Therefore, The Shallow and the Deep places emphasis on fundamental concepts and theoretical background. This also involves delving into the history and pre-history of neural networks, where the foundations for most of the recent developments were laid. These notes aim to demystify machine learning and neural networks without losing the appreciation for their impressive power and versatility.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Supervised Learning - An Introduction: Lectures given at the 30th Canary Islands Winter School of Astrophysics

Author: Biehl Michael
Publication venue: Machine Learning Reports
Publication date: 01/04/2019

Proceedings - University of Groningen

Support Vector Machines for Business Applications

Author: Lovell Brian C., Walder Christian J.
Publication venue: IGI Global
Publication date: 01/01/2005

This chapter discusses the usage of Support Vector Machines (SVM) for business applications. It provides a brief historical background on inductive learning and pattern recognition, and then an intuitive motivation for SVM methods. The method is compared to other approaches, and the tools and background theory required to successfully apply SVMs to business applications are introduced. The authors hope that the chapter will help practitioners to understand when the SVM should be the method of choice, as well as how to achieve good results in minimal time.

CiteSeerX

Crossref

University of Queensland eSpace