
Domain adaptation with minimal training

Abstract

The performance of a machine learning model trained on labeled data from a (source) domain degrades severely when it is tested on a different (target) domain. Traditional approaches deal with this problem by training a new model for every target domain. In natural language processing, top-performing systems often use multiple interconnected models; training all of them for every target domain is therefore computationally expensive. Moreover, retraining a model for the target domain requires access to the labeled data from the source domain, which may not be available to end users due to copyright issues. This thesis studies how to adapt to a target domain using the system trained on the source domain, avoiding both the cost of retraining and the need for access to the source labeled data. The thesis identifies two key ingredients for adaptation without training: broad-coverage resources and constraints. We show how resources like Wikipedia, VerbNet, and WordNet, which provide comprehensive coverage of entities, semantic roles, and words in English, can help a model adapt to the target domain. For the task of semantic role labeling, we show that in the decision phase we can replace a linguistic unit (e.g., a verb or word) with another equivalent linguistic unit residing in the same cluster defined in these resources (e.g., VerbNet, WordNet), such that after replacement the text becomes more like the text on which the model was trained. We show that the model's output is more accurate on the transformed text than on the original text. In another instance, we show how to use a system that links mentions to Wikipedia concepts to adapt a named entity recognition system. Since Wikipedia has broad domain coverage, the linking system is robust across domain variations; therefore, jointly performing entity recognition and linking improves the accuracy of entity recognition on the target domain without requiring the training of a new system for the new domain. In all cases, we show how to use intuitive constraints to guide the model toward coherent predictions. We show how incorporating prior knowledge about a new domain as declarative constraints in the decision phase can improve a model's performance on the new domain. When such prior knowledge is unavailable, we show how to acquire it automatically from unlabeled text from the new domain and from domains similar to both the source and target domains.
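
To make the substitution idea concrete, the sketch below (Python with NLTK; a rough illustration, not the thesis implementation) replaces a target-domain word with a WordNet synonym that the source-trained model has already seen. The source_vocab counts and the adapt_token helper are hypothetical, and the WordNet corpus must first be fetched with nltk.download('wordnet').

    # Minimal sketch of the WordNet-based substitution idea (assumptions noted above):
    # swap an unseen target-domain word for a same-synset synonym known to the source model.
    from collections import Counter
    from nltk.corpus import wordnet as wn

    def adapt_token(token: str, source_vocab: Counter) -> str:
        """Return a source-domain synonym for token, or token unchanged."""
        if token in source_vocab:  # already familiar to the source-trained model
            return token
        candidates = {
            lemma.name().replace("_", " ")
            for synset in wn.synsets(token)   # WordNet synsets act as the clusters
            for lemma in synset.lemmas()
        }
        seen = [w for w in candidates if w in source_vocab]
        # prefer the synonym the source model saw most often, if any
        return max(seen, key=source_vocab.get) if seen else token

    # Example: if the source corpus contains "car" but never "automobile",
    # the token is rewritten before the model makes its prediction.
    source_vocab = Counter({"car": 120, "drive": 80})
    print(adapt_token("automobile", source_vocab))  # -> "car"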
