On Building Generalizable Learning Agents
It has been a long-standing goal in Artificial Intelligence (AI) to build machines that can solve the tasks that humans can. Thanks to recent rapid progress in data-driven methods, which train agents to solve tasks by learning from massive training data, such learning approaches have been successfully applied to a number of extremely challenging tasks, including image classification, language generation, robotics control, and several multi-player games. The key factor behind all these data-driven successes is that the trained agents can generalize to test scenarios unseen during training. This generalization capability is the foundation for building any practical AI system. This thesis studies generalization, a fundamental challenge in AI, and proposes solutions to improve the generalization performance of learning agents across a variety of problems. We start by providing a formal formulation of the generalization problem in the context of reinforcement learning and proposing four principles within this formulation to guide the design of training techniques for improved generalization. We validate the effectiveness of the proposed principles by considering four different domains, from simple to complex, and developing domain-specific techniques following these principles. In particular, we begin with the simplest domain, path-finding on graphs (Part I), then consider visual navigation in a 3D world (Part II) and competition in complex multi-agent games (Part III), and lastly tackle several natural language processing tasks (Part IV). Empirical evidence demonstrates that the proposed principles lead to much improved generalization performance across a wide range of problems.
Part-of-speech tagging with antagonistic adversaries
Supervised NLP tools and online services are often used on data that is very different from the manually annotated data used during development. The performance loss observed in such cross-domain applications is often attributed to covariate shifts, with out-of-vocabulary effects as an important subclass. Many discriminative learning algorithms are sensitive to such shifts because a few highly indicative features may swamp other, weaker indicative features. Regularized and adversarial learning algorithms have been proposed to be more robust against covariate shifts. We present a new perceptron learning algorithm using antagonistic adversaries and compare it to previous proposals on 12 multilingual cross-domain part-of-speech tagging datasets. While previous approaches do not improve on our supervised baseline, our approach is better across the board, with an average 4% error reduction.
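The core idea of an antagonistic adversary can be illustrated with a minimal sketch: at each training step, before the perceptron update, the adversary deletes the active features the current model relies on most heavily, forcing the learner to spread weight over redundant features rather than letting a few highly indicative features dominate. This is an illustrative sketch under assumed details (binary labels, a fixed deletion budget k, deletion by largest absolute weight), not the exact algorithm from the paper.

```python
import numpy as np

def adversarial_perceptron(X, y, epochs=10, k=1):
    """Binary perceptron trained against an antagonistic adversary.

    Illustrative sketch (assumptions: labels in {-1, +1}, the adversary
    deletes the k active features with the largest |weight|); not the
    paper's exact formulation.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, label in zip(X, y):
            x_adv = x.copy()
            active = np.nonzero(x_adv)[0]
            if len(active) > k:
                # adversary removes the k most indicative active features,
                # so the update cannot rely on them alone
                worst = active[np.argsort(-np.abs(w[active]))[:k]]
                x_adv[worst] = 0.0
            if label * (w @ x_adv) <= 0:  # mistake on the corrupted input
                w += label * x_adv
    return w
```

On data with redundant indicative features, the learned weights end up distributed over several of them, which is what makes the model more robust when individual features go out of vocabulary in a new domain.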