thesis

Multiple Tasks are Better than One: Multi-task Learning and Feature Selection for Head Pose Estimation, Action Recognition and Event Detection

Abstract

Computer vision is a field that includes methods for acquiring, processing, analyzing, and understanding images, videos and, more generally, high-dimensional data from the real world in order to produce numerical or symbolic information. The classical problem in computer vision is determining whether image or video data contains a specific object, feature, or activity. A human can normally solve this task robustly and without effort, but it is still not satisfactorily solved in computer vision for the general case of arbitrary objects in arbitrary situations. Existing methods can at best solve it for specific objects, such as simple geometric objects (e.g., polyhedra), human faces, printed or hand-written characters, or vehicles, and in specific situations, typically described in terms of well-defined illumination, background, and pose of the object relative to the camera. Over the past decade, machine learning (ML) and computer vision (CV) have become closely intertwined, and machine learning is now considered a powerful tool for solving many computer vision problems. Multi-task learning, an important branch of machine learning, has developed rapidly over the same period. Multi-task learning methods aim to improve the performance of learning algorithms by simultaneously learning classification or regression models for a set of related tasks. This typically leads to better models than a learner that does not account for task relationships. Joint learning works particularly well when the tasks have some commonality and are each slightly under-sampled.
