
    Pareto-Path Multi-Task Multiple Kernel Learning

    A traditional and intuitively appealing Multi-Task Multiple Kernel Learning (MT-MKL) method is to optimize the sum (thus, the average) of objective functions with a (partially) shared kernel function, which allows information sharing among tasks. We point out that the resulting solution corresponds to a single point on the Pareto Front (PF) of a Multi-Objective Optimization (MOO) problem, which considers the concurrent optimization of all task objectives involved in the Multi-Task Learning (MTL) problem. Motivated by this observation, and arguing that the former approach is heuristic, we propose a novel Support Vector Machine (SVM) MT-MKL framework that considers an implicitly defined set of conic combinations of task objectives. We show that solving our framework produces solutions along a path on the aforementioned PF and that it subsumes the optimization of the average of objective functions as a special case. Using the algorithms we derived, we demonstrate through a series of experimental results that the framework is capable of achieving better classification performance when compared to other similar MTL approaches.
    Comment: Accepted by IEEE Transactions on Neural Networks and Learning Systems
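    As a rough illustration of the scalarization idea, here is a minimal sketch using toy quadratic objectives in place of the paper's SVM task objectives (the objectives, step size, and gradient-descent solver are our illustrative assumptions, not the paper's algorithm). Sweeping the combination weight lam traces solutions along the Pareto Front; lam = 0.5 corresponds to the average-of-objectives special case.

        import numpy as np

        # Toy stand-ins for two task objectives that share a parameter vector w.
        T1, T2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

        def task_losses(w):
            return np.sum((w - T1) ** 2), np.sum((w - T2) ** 2)

        def grad(w, lam):
            # Gradient of the combination lam*J1 + (1 - lam)*J2 (a convex
            # combination; rescaling it leaves the minimizer unchanged).
            return lam * 2 * (w - T1) + (1 - lam) * 2 * (w - T2)

        # Each weight setting yields one Pareto-optimal trade-off between tasks.
        for lam in np.linspace(0.0, 1.0, 5):
            w = np.zeros(2)
            for _ in range(200):
                w -= 0.05 * grad(w, lam)
            j1, j2 = task_losses(w)
            print(f"lam={lam:.2f}  J1={j1:.3f}  J2={j2:.3f}")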

    IST Austria Thesis

    Traditionally, machine learning has focused on the problem of solving a single task in isolation. While this approach is quite well understood, it disregards an important aspect of human learning: when facing a new problem, humans are able to exploit knowledge acquired from previously learned tasks. Intuitively, access to several problems simultaneously or sequentially could also be advantageous for a machine learning system, especially if these tasks are closely related. Indeed, the results of many empirical studies have provided justification for this intuition; theoretical justifications of the idea, however, remain rather limited. The focus of this thesis is to expand the understanding of the potential benefits of information transfer between several related learning problems. We provide theoretical analysis for three scenarios of multi-task learning: multiple kernel learning, sequential learning, and active task selection. We also provide a PAC-Bayesian perspective on lifelong learning and investigate how the task generation process influences the generalization guarantees in this scenario. In addition, we show how some of the obtained theoretical results can be used to derive principled multi-task and lifelong learning algorithms, and we illustrate their performance on various synthetic and real-world datasets.

    On Kernel-based Multi-Task Learning

    Multi-Task Learning (MTL) has been an active research area in machine learning for two decades. By training multiple related tasks simultaneously, with information shared across tasks, it is possible to improve the generalization performance of each task compared to training each task independently. During the past decade, most MTL research has been based on the Regularization-Loss framework, due to its flexibility in specifying various types of information-sharing strategies, the opportunity it offers to yield kernel-based methods, and its capability of promoting sparse feature representations. However, certain limitations exist in both the theoretical and practical aspects of Regularization-Loss-based MTL. Theoretically, previous research on generalization bounds for MTL Hypothesis Spaces (HSs), where the data of all tasks are pre-processed by a (partially) common operator, has been limited in two aspects. First, all previous works assumed linearity of the operator, thereby completely excluding kernel-based MTL HSs, for which the operator is potentially non-linear. Second, all previous works, rather unnecessarily, assumed that all task weights are constrained within norm balls of equal radii. The requirement of equal radii leads to significant inflexibility of the relevant HSs, which may cause the generalization performance of the corresponding MTL models to deteriorate. Practically, a variety of algorithms has been developed for kernel-based MTL models, owing to the differing characteristics of their formulations. Most of these algorithms are burdensome to develop and end up being quite sophisticated, so practitioners may face a hard task in interpreting and implementing them, especially when multiple models are involved; this is even more so when Multi-Task Multiple Kernel Learning (MT-MKL) models are considered. This research largely resolves the above limitations. Theoretically, a pair of new kernel-based HSs is proposed: one for single-kernel MTL and another for MT-MKL. Unlike previous works, we allow each task weight to be constrained within a norm ball whose radius is learned during training. By deriving and analyzing the generalization bounds of these two HSs, we show that this flexibility indeed leads to much tighter generalization bounds, which often results in significantly better generalization performance. Based on this observation, a pair of new models is developed, one for each case: single-kernel MTL and MT-MKL. From a practical perspective, we propose a general MT-MKL framework that covers most of the prominent MT-MKL approaches, including our new MT-MKL formulation. A general-purpose algorithm is then developed to solve this framework; it can also be employed to train all other models subsumed by the framework. A series of experiments is conducted to assess the merits of the proposed models when trained by the new algorithm. Certain properties of our HSs and formulations are demonstrated, and the advantage of our models in terms of classification accuracy is shown via these experiments.
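    As a rough illustration of the shared-kernel coupling in MT-MKL, here is a minimal sketch (the toy regression tasks, base-kernel dictionary, kernel ridge solver, and grid search are our assumptions; the thesis works with SVM-based formulations and a dedicated training algorithm). All tasks share the kernel-combination weights beta, so a single combination must serve every task, which is what couples them.

        import numpy as np

        rng = np.random.default_rng(0)

        def rbf(A, B, gamma):
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d2)

        # Two related regression tasks: same underlying function, small shift.
        def make_task(shift, n=60):
            X = rng.uniform(-1, 1, (n, 1))
            y = np.sin(3 * X[:, 0]) + shift + 0.1 * rng.standard_normal(n)
            return X[:40], y[:40], X[40:], y[40:]  # train/validation split

        tasks = [make_task(0.0), make_task(0.2)]
        gammas = (0.5, 5.0)  # base-kernel dictionary shared by all tasks

        def task_val_error(Xtr, ytr, Xva, yva, beta, lam=0.1):
            # Kernel ridge regression under the shared kernel combination.
            Ktr = sum(b * rbf(Xtr, Xtr, g) for b, g in zip(beta, gammas))
            Kva = sum(b * rbf(Xva, Xtr, g) for b, g in zip(beta, gammas))
            alpha = np.linalg.solve(Ktr + lam * np.eye(len(ytr)), ytr)
            return np.mean((Kva @ alpha - yva) ** 2)

        # A crude simplex grid search over the shared weights beta stands in
        # for the general-purpose optimization algorithm of the thesis.
        best = min(
            ((b, 1.0 - b) for b in np.linspace(0.0, 1.0, 11)),
            key=lambda beta: sum(task_val_error(*t, beta) for t in tasks),
        )
        print("selected shared kernel weights:", best)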